Software Development

Project 3

Due: Part 1 February 28, Part 2 March 8

PROJECT OVERVIEW

The goal of this project is the development of an index for the Bible class to improve lookup time and efficiency. Currenty the Bible class has a lookup function that finds references by scanning through the entire Bible file text, a rather CPU intensive method. This project will first modify the Bible classes Lookup, Next and Prev functions to work much more quickly by building and using a lookup index based on the C++ STL Map associate lookup template abstract type. The Map function allows the user to create a lookup function that uses ANY data type as a key to lookup another associated value in a manner similar to indexing in an array. How the Map abstract type is implemented is hidden, the point is that it is very fast and efficient way to map from one domain toanother. In this case we will create a Map association which maps from Bible Ref to Bible positions (byte offset of the verse from the beginning of the Bible file). The Bible class interface should NOT change, the modifications will be hidden in the Bible class.

The problem with this solution is that the creation of the Map index of the Bible is time consuming, it requires iterating through the Bible completely to build the MAP index. Moreover, since the solution uses a web CGI methind, the Bible lookup program has to rebuild the index each time a verse is searched for. There at least are two possible general solutions. One is to build and store the index ahead of time in a file, so it doesn't need to be rebuilt for every search. This is a reasonable idea in general, but not practical for a Map, since the STL library does not support storage of the Map data structures in secondary storage.

The other solution is to load the index once into a running server program, and then having the CGI program ask this program to look up the reference for it. This can be done using a built in interprocess communication method called a "pipe". A Pipe allows separate running programs to pass messages between them on demand. This is the solution we will use.

Two demo systems have been created that demonstrate these types of solutions to do word searches on the complete works of Shakespeare.

One system that runs standalone on the console, and uses a Map type to map words to an inverted list of file positions for those words, can be found here: http://cs.mvnu.edu/home/class/csc3004/indexdemo.

Another program, that uses the same indexing code as the previous, add an web interface, and uses pipes to communicate to a server to which will look up the position in the Bible given the reference. This project can be found, and run here:

LEARNING GOALS

This project has the following learning goals:

  1. Creation and use of the MAP STL template class for an associate index within the Bible class by the lookup. next, and prev methods
  2. Use of STL list and map classes
  3. Interprocess communications interprocess communication using "pipes" (I have provided a class called "Fifo" which encapsulates the use of pipes.)
  4. Client Server architecture

Project Milestones

As explained above, the goal of this code is to create a user-friendly application that permits the fast lookup of verses from references. The goal is a high speed, active search the displays matching verses. This project is has several distinct aspects, each of which will be complete in a stepwise manner.

  1. Build the indexing and lookup system into the Bible class, so that the lookup, next, and prev methods take advantage of it. The index will be created as part of the Bible constructor operation.
  2. Build a console test program to text the new indexed Bible Class
  3. Designing a Client/Server solution using the Fifo class for communication
  4. Implementing a Client Server solution using a console test program
  5. Adding the code to the web based Bible lookup system, thus creating a "fast" web lookup experience for the user.
  6. Testing, refining and demostraiting the solution

User Interface

As before, a major goal of this assignment is to create a friendly and robust user interface. You should be able to reuse the same user interface from before, only changing the Bible class.

Methods

  • Adding a operator< to the Ref class. The Map template class requires the the data type used for the index elements (in this case the Ref class) support the operator< function. This is required so the algorithms can build a seachable data structure internally. Thus you will need something like this:
bool Ref::operator<(const Ref::Ref a) const {	// Compare if one Ref is less than another Ref
      bool result;			
      result = ....;   // You figure it out!
      return result;
   }
  • Indexing: Review how the indexing is done on the Shakespeare text. This is a simpler index using the following mapping data structure. Below is some sample code:
    map<Ref, int> > index;
    Here a Bible Ref class is the index type, which maps to an int type which will be the byte offset into the Bible that where that reference can be found. This this will allow "direct access" on the file, without need for scanning through it sequencially.
    (see: http://www.cplusplus.com/reference/map/map/?kw=map)

map <Ref,int> refmap;
...Verse v = Verse("1:1:1 In the beginning...");

// Insert ref in Map, position is position in file (int)
Ref r = v.getRef();            // Get the reference from the verse
int position = bfile.tellg();  // Get the file position
refmap[r] = position;         // Associate reference r with position "position"

// Later, read and print the position;
Ref r2 = Ref(2:4:5);         // Create a Ref
map <Ref,int>::iterator it;  // An iterator for the find
it = refmap.find(r2);
if (it == refmap.end()) {
   cout << "Ref not found" << endl
} else {
   int position = ref[r2];      // get the associated position from the Map
   cout position;               // display the position
}
  • Direct File Access: The C++ fstream functions
  • Fifo Pipes: The use of named pipes in C++. The demo code given to you includes a wrapper class "Fifo" which encapsulates the use of pipes for easy interprocess communication is C++. Please carefull review the use in the demo code above it's use. Also note - creation of a pipe creates a special files in the /tmp folder on the Linex file system. In order to prevent file naming colisions you MUST modify the code in Fifo.h to use your username for the pipe "SIG" string constant. This is appended to the pipe names, assuring that all pipes are private for your program. Fifo methods

{
  Fifo();             // Default Constructor                                     
  Fifo(string);       // Construct from pipe name string                         
                      // If a pipe already exists, does nothing                  
                      // Otherwise it creates the names pipe                     

  void openread();    // Open a pipe for reading (recv)                          
       	       	      // openread allows other readers but                       
       	       	      // blocks writers from the pipe                            

  void openwrite();   // Start a pipe for writing (send)                          
       	       	      // openwrite block all other opens to the pipe             
       	       	      // blocked opens proceed with previous user                

  void fifoclose();   // Close the pipe                                          
       	       	      // Frees the pipe for other users (processes)              

                      // You MUST call openread() before recv                    
  string recv();      // Receives - get the next record from the pipe            
                      // Process will vlock until pipe has data.                 

                      // You MUST call openwrite() before a send                 
  void send(string);  // Sends - a record into the pipe                          
                      // Does not block - always writes and returned </sticky> <sticky>
}
  • Background Processes: Once your server program is complete, you can run it in the background, even when not logged in, by appending "&" to end of the run command. Please make sure you only run one instance of you server at a time, or they will compete with each other, and cause problems. In class I will demonstrate how to kill these background processes once started.

GRADING

  • 50% - How well the program works with respect to the given specifications. (NO credit will be given for a program that doesn't run.)
  • 20% - User friendliness. Should meet the criteria covered in class. Invalid user entries should not crash program. All input should be range checked. Error messages should be meaningful and helpful.
  • 20% - Internal program structure. The program should be well organized into procedures. Remember "maximize cohesion, minimize coupling".
  • 10% - Comments - The code must be clearly commented. Each procedure should be preceded by a detailed description. This should include a description of the function and any assumptions the calling routine must make. It should describe the function, purpose, and range of values for each parameter passed, as well as whether the value if passed, returned, or both.
  • Up to 10% Extra Credit – You may receive extra credit f or extra features. This includes multiple Bible versions integration, expanding search range to a given verse range to be considered a match (or a whole chapter as a range), improving the stemmer for the language used in the KJV of the Bible. Etc as proposed and approved.

Milestone Due dates

Milestone Completion date Points
Good progress on getting the indexing operation working within the Bible class Feb 25 25
Operation of the indexed lookup with a console test program Feb 28 50
Partial operation of the client server system March 4 25
Complete operation of the web based, client/server system March 8 100
Total   200
Topic revision: r4 - 2013-02-26 - JimSkon
 
This site is powered by the TWiki collaboration platformCopyright &© by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
Ideas, requests, problems regarding TWiki? Send feedback