An Introduction to JDBC
At Mitre, we collect information from the Internet using commercial search engines such as Altavista and Lycos for a variety of topics. This information is stored as a collection of text documents for any topic and can be searched via keywords. We use the public-domain search engine Glimpse from the University of Arizona to index and search the document collection.
Some of the document collections can be fairly large (over 1500 documents). If a common keyword is entered, the list of matching documents will be large. We display the results from the search engine using Java and JDBC to avoid scanning long lists of matching documents. Java was used to build a 3-D space and plot circles at locations representing the frequency of the occurrences of keywords in a document. JDBC was used to retrieve the titles of the documents stored in a table. Passing all the titles of all documents in the collection as parameters to the Java applet would significantly increase the time to load the applet.
Glimpse returns the frequency of occurrence of a keyword in a document. We use that number to locate a circle representing the document in 3-D space (see Figure 3). Each axis represents a keyword. If fewer than three keywords are entered, documents will be displayed in a plane or on a line. If more than three keywords are entered, three or fewer keywords must be chosen in order to display matching documents.
The frequency of occurrence of keywords is normalized for each axis, and the frequencies of keywords in documents are passed as parameters to the applet. The color of the circle was computed based on the position of the circle in the three axes. Red is used for documents on the z-axis, green for documents on the y-axis and blue for documents on the x-axis. Brighter shades of the three primary colors are used for documents with higher keyword frequencies. A mix of the primary colors is used for circles which contain more than one keyword.
JDBC is used to retrieve the titles for documents containing non-zero occurrences of the keywords. This number is usually fewer than the total number of documents when a fairly unique keyword is used. When the mouse is located over the document, a window is displayed with the document's title. Sometimes, more than one document can have the same frequency of occurrence of a keyword. In such cases, the window displays multiple titles of documents. The color of the circle changes to white to indicate the document where the mouse is located. An option to click on a box in the window is provided and will retrieve the text corresponding to the document in a separate window.
This article has described the basics of working with JDBC under Linux: the design of JDBC, the installation of JDBC for MySQL and example code to retrieve/store data. Metadata statements can be used to interrogate the structure of a database and its tables. Finally, we looked at an example using a search engine with JDBC and Java. Viewing the results from a Java applet made the user's task more interesting than it would have been through a CGI program.
The listing referred to in this article is available by anonymous download in the file ftp.linuxjournal.com/pub/lj/listings/issue55/2846.tgz.
Manu Konchady (firstname.lastname@example.org) works at Mitre Corporation developing software for information retrieval. As the lone user of Linux in a group of 50, he is striving to promote its many benefits.
- Bruce Nikkel's Practical Forensic Imaging (No Starch Press)
- Transitioning to Python 3
- Progress on Privacy
- Stepping into Science
- Linux Journal December 2016
- Radio Free Linux
- CORSAIR's Carbide Air 740
- The Tiny Internet Project, Part II
- FutureVault Inc.'s FutureVault
- A Better Raspberry Pi Streaming Solution