Lab Assignment 4

Bible and Strong's Lookup Using XML

Project Objectives

  1. Learn how XML is used to represent information.
  2. Learn why XML is useful.
  3. Learn how to process XML structures using both C++ and JavaScript.

Project Overview

Thus far we have been using low-level files to store and retrieve our Bible information. As mentioned in class, an alternative format for data and record storage involves internally labeling the data. This way the organization and meaning of the data can be determined and processed without externally defined and documented structures, since the data is what is called "self-describing".

XML is such a self-describing format for organizing arbitrary data collections. XML stands for Extensible Markup Language, and has the following major characteristics:

  • It emphasizes descriptive rather than procedural markup.
  • It is independent of specific hardware or software systems.
  • It is hierarchical - data can be described in terms of relationships, composition, and aggregation.

A basic introduction to XML is here and here.

This lab assignment will combine the following three collections of data to create a useful system for looking up verses in the Bible, getting the original language source for each word in the Bible along with its definition, and then looking at other verses with the same original language words.

  1. The complete King James Version (KJV) in XML with embedded original language word association.
  2. A complete lexicon, or dictionary, of Greek and Hebrew words used in the New and Old Testaments.
  3. An inverted list of Greek and Hebrew words, associating each original language word with the list of verses it is in.
Our goal is to augment our existing Bible lookup system with the following functions:
  1. When looking up Bible references from the KJV, the user will have an option to include Strong's lexicon references in the text. If they choose this, each word with an original language reference will have a link (or a similar control) that can be clicked to look up the associated original language word and its definition.
  2. The definitions are themselves interlinked. You should show the words that are linked in the definitions, and follow them when clicked.
  3. When viewing an original language definition (from #1 above), the user will be able to view the list of all verses that contain this same original language word, with the related English word clearly highlighted in some way.

The Bible in XML

The complete King James Version of the Bible is found on the CS server in /home/class/csc3004/XMLBible/kjv_by_book. There is one file per book, named n.xml, where n is the book number.
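The file naming rule above can be sketched as a small C++ helper. This is only an illustration: the function name bookFilePath and the constant kBibleDir are hypothetical, and the 1 to 66 range check assumes the directory holds one file for each of the 66 books of the KJV.

```cpp
#include <stdexcept>
#include <string>

// Directory quoted from the assignment text; change it if your copy lives elsewhere.
const std::string kBibleDir = "/home/class/csc3004/XMLBible/kjv_by_book";

// Returns the path of the XML file for a book number (1 = Genesis).
// Assumption: books are numbered 1..66, one file per book.
std::string bookFilePath(int bookNumber, const std::string& dir = kBibleDir) {
    if (bookNumber < 1 || bookNumber > 66) {
        throw std::out_of_range("book number must be between 1 and 66");
    }
    return dir + "/" + std::to_string(bookNumber) + ".xml";
}
```

For example, bookFilePath(1) yields the path of 1.xml, the Genesis file sampled in the next section.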

A sample of the XML format follows, from Genesis (1.xml):

<?xml version="1.0"?>
<bible version='kjv' description='The King James Bible with Strongs Numbers' language='ENG' strongs='true'>
    <book number='1' name='Genesis' abbreviation='gen' testament='old'>
        <chapter number='1'>
            <verse number='1'>In the <strongs hebrew='7225'>beginning</strongs><strongs hebrew='430'>God</strongs><strongs hebrew='1254'>created</strongs><strongs hebrew='*853'></strongs>the <strongs hebrew='8064'>heaven</strongs>and the <strongs hebrew='776'>earth</strongs>. </verse>
            <verse number='2'>And the <strongs hebrew='776'>earth</strongs><strongs hebrew='1961'>was</strongs>without <strongs hebrew='8414'>form</strongs>, and <strongs hebrew='922'>void</strongs>; and <strongs hebrew='2822'>darkness</strongs><em>was</em> <strongs hebrew='5921'>upon</strongs>the <strongs hebrew='6440'>face</strongs>of the <strongs hebrew='8415'>deep</strongs>. And the <strongs hebrew='7307'>Spirit</strongs>of <strongs hebrew='430'>God</strongs><strongs hebrew='7363'>moved</strongs><strongs hebrew='5921'>upon</strongs>the <strongs hebrew='6440'>face</strongs>of the <strongs hebrew='4325'>waters</strongs>. </verse>
            <verse number='3'>And <strongs hebrew='430'>God</strongs><strongs hebrew='559'>said</strongs>, Let there <strongs hebrew='1961'>be</strongs><strongs hebrew='216'>light</strongs>: and there <strongs hebrew='1961'>was</strongs><strongs hebrew='216'>light</strongs>. </verse>
            <verse number='4'>And <strongs hebrew='430'>God</strongs><strongs hebrew='7200'>saw</strongs><strongs hebrew='*853'></strongs>the <strongs hebrew='216'>light</strongs>, <strongs hebrew='3588'>that</strongs><em>it</em> <em>was</em> <strongs hebrew='2896'>good</strongs>: and <strongs hebrew='430'>God</strongs><strongs hebrew='914 996'>divided</strongs>the <strongs hebrew='216'>light</strongs><strongs hebrew='996'>from</strongs>the <strongs hebrew='2822'>darkness</strongs>. </verse>
            <verse number='5'>And <strongs hebrew='430'>God</strongs><strongs hebrew='7121'>called</strongs>the <strongs hebrew='216'>light</strongs><strongs hebrew='3117'>Day</strongs>, and the <strongs hebrew='2822'>darkness</strongs>he <strongs hebrew='7121'>called</strongs><strongs hebrew='3915'>Night</strongs>. And the <strongs hebrew='6153'>evening</strongs>and the <strongs hebrew='1242'>morning</strongs><strongs hebrew='1961'>were</strongs>the <strongs hebrew='259'>first</strongs><strongs hebrew='3117'>day</strongs>. </verse>
            <verse number='6'>And <strongs hebrew='430'>God</strongs><strongs hebrew='559'>said</strongs>, Let there <strongs hebrew='1961'>be</strongs>a <strongs hebrew='7549'>firmament</strongs>in the <strongs hebrew='8432'>midst</strongs>of the <strongs hebrew='4325'>waters</strongs>, and <strongs hebrew='1961'>let</strongs>it <strongs hebrew='914 996'>divide</strongs>the <strongs hebrew='4325'>waters</strongs>from the <strongs hebrew='4325'>waters</strongs>. </verse>

Note that each book file has a bible element which contains a book element. The bible and book elements carry attributes giving the Bible information, book name, and book number. The book contains a list of chapter elements, each with a chapter number attribute. Each chapter element in turn contains a list of verse elements, each with a number attribute.
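To make the nesting concrete, here is a rough C++ sketch that locates one verse inside the text of a book file by plain string search. It is not how the demo code works and it is no substitute for a real XML parser: it assumes the exact attribute layout shown in the sample (single quotes, number as the only attribute of chapter and verse), and the name findVerseXml is hypothetical.

```cpp
#include <string>

// Returns the raw inner XML of the requested verse, or "" if it is not found.
// Assumes tags are written exactly as in the sample: <chapter number='1'>.
std::string findVerseXml(const std::string& xml, int chapter, int verse) {
    const std::string chapterTag = "<chapter number='" + std::to_string(chapter) + "'>";
    const std::size_t chapStart = xml.find(chapterTag);
    if (chapStart == std::string::npos) return "";

    // The chapter ends at its closing tag, or at end of input for a truncated excerpt.
    std::size_t chapEnd = xml.find("</chapter>", chapStart);
    if (chapEnd == std::string::npos) chapEnd = xml.size();

    // Look for the verse only inside this chapter.
    const std::string verseTag = "<verse number='" + std::to_string(verse) + "'>";
    std::size_t verseStart = xml.find(verseTag, chapStart);
    if (verseStart == std::string::npos || verseStart >= chapEnd) return "";
    verseStart += verseTag.size();

    const std::size_t verseEnd = xml.find("</verse>", verseStart);
    if (verseEnd == std::string::npos || verseEnd > chapEnd) return "";
    return xml.substr(verseStart, verseEnd - verseStart);
}
```

The search mirrors the hierarchy: first the chapter element, then a verse element inside it. A parser-based solution would walk the same path through child nodes instead of searching text.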

Finally, each verse is a somewhat more complex list of both text and elements that together describe the verse. The items in the list are of four types:

1. A word or list of words. This is merely unmarked text in the verse, in the order it occurs. In Genesis 1:1, this includes "In the", "the", and "and the". These are unmarked phrases, and are to be output exactly as they appear in the list.

2. A word or list of words embedded in a Strong's element. These are in the form:

<strongs hebrew='430'>God</strongs>

These elements tell us that the word(s) in the element are associated with a particular original language word, in this case the Hebrew word indexed by number 430 in the Hebrew lexicon (more on the lexicon later). Sometimes more than one number occurs, because more than one original language word is associated with the English word or words:

<strongs hebrew='914 996'>divide</strongs>

Note that in the New Testament, the Strong's elements use greek rather than hebrew for the attribute name. Also, for some reason, some of the Bible files use "number" instead of "greek" or "hebrew". The current running demo code takes this into account.

3. A strongs element without textual content. These are in the form:

<strongs hebrew='*853'></strongs>

These elements, I believe, mark word(s) in the original text that have no counterpart in the English version, in this case the Hebrew word indexed by number 853 in the Hebrew lexicon. The * apparently indicates this case.

4. An emphasis element. For whatever reason, this is used to emphasise certain words (in the demo code these are italicised). An example:

<em>was</em>
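The four item types above can be sketched in C++ as a small verse splitter. This is a minimal illustration under stated assumptions, not the demo code: it works on the raw inner text of one verse, assumes elements are not nested, and reads the first quoted attribute value of a strongs element so that hebrew, greek, and number are handled alike. The names PieceKind, VersePiece, StrongsRef, parseStrongsValue, and splitVerse are all hypothetical.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

enum class PieceKind { Text, Strongs, Emphasis };

// One item of a verse, in document order.
struct VersePiece {
    PieceKind kind;
    std::string text;     // English text; empty for a strongs element with no content
    std::string strongs;  // raw attribute value such as "914 996" or "*853"
};

// One lexicon reference taken from a strongs attribute value.
struct StrongsRef {
    int number;         // index into the Hebrew or Greek lexicon
    bool untranslated;  // true when the value carried a leading '*'
};

// Splits an attribute value such as "914 996" or "*853" into references.
std::vector<StrongsRef> parseStrongsValue(const std::string& value) {
    std::vector<StrongsRef> refs;
    std::istringstream in(value);
    std::string token;
    while (in >> token) {
        const bool starred = token[0] == '*';
        if (starred) token.erase(0, 1);
        try {
            refs.push_back({std::stoi(token), starred});
        } catch (const std::exception&) {
            // Not a number: skip it rather than guess.
        }
    }
    return refs;
}

// Splits the inner text of one verse into text, strongs, and emphasis pieces.
std::vector<VersePiece> splitVerse(const std::string& verse) {
    std::vector<VersePiece> pieces;
    std::size_t pos = 0;
    while (pos < verse.size()) {
        std::size_t open = verse.find('<', pos);
        if (open == std::string::npos) open = verse.size();
        if (open > pos) {  // type 1: unmarked text before the next tag
            pieces.push_back({PieceKind::Text, verse.substr(pos, open - pos), ""});
        }
        if (open == verse.size()) break;

        const std::size_t tagEnd = verse.find('>', open);
        if (tagEnd == std::string::npos) break;  // malformed input: stop
        const std::string tag = verse.substr(open + 1, tagEnd - open - 1);
        const std::string name = tag.substr(0, tag.find(' '));
        const std::string closeTag = "</" + name + ">";
        const std::size_t close = verse.find(closeTag, tagEnd + 1);
        if (close == std::string::npos) break;  // malformed input: stop
        const std::string inner = verse.substr(tagEnd + 1, close - tagEnd - 1);

        if (name == "strongs") {  // types 2 and 3
            const std::size_t q1 = tag.find('\'');
            const std::size_t q2 =
                (q1 == std::string::npos) ? q1 : tag.find('\'', q1 + 1);
            const std::string value =
                (q2 == std::string::npos) ? "" : tag.substr(q1 + 1, q2 - q1 - 1);
            pieces.push_back({PieceKind::Strongs, inner, value});
        } else if (name == "em") {  // type 4
            pieces.push_back({PieceKind::Emphasis, inner, ""});
        } else {  // unknown element: keep its text as plain text
            pieces.push_back({PieceKind::Text, inner, ""});
        }
        pos = close + closeTag.size();
    }
    return pieces;
}
```

Calling parseStrongsValue on the strongs field of a piece gives the individual lexicon numbers, with untranslated set for entries such as *853. Note that in the sample, adjacent strongs elements have no whitespace between them, so display code may need to insert spaces between pieces.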


Sample C++ Code to process the Bible code

I have written simple sample code in both C++ and JavaScript to process the Bible XML files. To do this, an XML parser has to be used to read the XML and dissect it into the pieces we need. In C++ I have used an XML library from this site. The author has given me permission to use it in this course. The code is attached here:

In order to use this code, you simply need to include the xmlParser.h file, and compile and link in the xmlParser.cpp code.

Sample C++ code to process the Bible XML is XMLDemo.cpp:

Here are the actual files for download to build the demo program:

Sample JavaScript Code to process the Bible code

I have written the following application in HTML and JavaScript to allow single verse lookup in the Bible XML code. This is just a sample, and will need work for your project. It does, however, show how to process the XML using the DOM.

Try it out here.

The actual html code is here: XMLBible.html

Below are useful resources on XML processing

Strongs Lexicon and Inverted List files

The Hoth directory /home/class/csc3004/XMLBible/ contains both the Lexicon files and the Inverted list files:

  • /home/class/csc3004/XMLBible/greek_strongs - Greek Lexicon
  • /home/class/csc3004/XMLBible/heb_strongs - Hebrew Lexicon
  • /home/class/csc3004/XMLBible/bible_refs_of_strongs_numbers - Inverted list of verses with matching Greek or Hebrew words.

I will NOT give further explanation about these at this point. It is considered part of this assignment to figure out how they represent the data in question (i.e., to learn how XML really works).
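Independent of how the files lay it out, the idea of an inverted list can be sketched in C++ as an in-memory map from a Strong's number to the verses that use it. This says nothing about the actual XML format of the files above, which is left for you to work out. The names InvertedList and addVerse and the reference strings are hypothetical, and because Hebrew and Greek numbers are separate sequences, the sketch assumes one such map per language.

```cpp
#include <map>
#include <string>
#include <vector>

// Maps a Strong's number to the references of every verse that uses it.
// Assumption: one InvertedList per language (Hebrew and Greek kept apart).
using InvertedList = std::map<int, std::vector<std::string>>;

// Records that the verse named by `reference` contains each given number.
// Verses are expected to be added in order, so a repeat of the same number
// within one verse is detected by looking at the last stored reference.
void addVerse(InvertedList& index, const std::string& reference,
              const std::vector<int>& strongsNumbers) {
    for (int n : strongsNumbers) {
        std::vector<std::string>& refs = index[n];
        if (refs.empty() || refs.back() != reference) {
            refs.push_back(reference);
        }
    }
}
```

Looking up a number in the map then gives the list of verses to display for the "all verses with this word" feature.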

XML Related Information

Project Milestones

  1. Design Presentation. Give a design presentation in lab of your proposed design solution. Include an architectural diagram and descriptions of your classes. This will be presented in lab on April 14. Due date: March 31
  2. Integrate XML Bible Lookup into Existing System. Use either the JavaScript or the C++ solution to give the user the option to look up the KJV references from the XML version of the Bible. For this first version you can simply display the lexicon reference numbers. Due date: April 7
  3. Allow Lexicon Lookups from Bible Verses. In this update the lexicon reference numbers should be clickable (or the words themselves, with some provision for multiple word matches). When clicked, you should pop up a window and display the lexicon entry nicely formatted. This will require you to write your own XML processing code! You will need to use the Hebrew and Greek Lexicon files for this. Also, note that definitions in the Lexicon refer to other words in the Lexicon. Allow the viewer to follow these links by clicking on them. Due date: April 14
  4. Allow the Display of ALL Verses with Matching Words from Lexicon Lookup. From the Lexicon definition page of the above milestone, you should be able to request the display of ALL verses that contain the Lexicon entry being shown. In each verse, the matching word (or the location of a missing word) should be clearly highlighted. Due date: April 21
  5. Final Presentation. Give a final presentation and demo of your system. Show all features. Show your final system architecture, and explain how it changed. Due date: April 25

-- JimSkon - 2011-04-25

Topic revision: r5 - 2016-03-24 - JimSkon