Object Databases: Not Just for CAD/CAM Anymore
Let's take a look at an example of how easy it is to make things persistent in Texas. Sticking with tradition, we write the familiar “hello world” program, but with a persistent twist: we record how many times the program has been executed. Listing 1 shows the code for this task.
First we initialize the Texas library, passing it the argc and argv arguments from main. The program then opens up a persistent store named “hello.pstore” in the current working directory.
The persistent store is queried with the is_root() function to see whether a named root "COUNT" exists. If it does not, a new integer is allocated, initialized to zero, and registered under the name "COUNT" with the add_root() function; otherwise, the integer is retrieved from the database. The counter is incremented and the result printed to standard output. All dirty objects are then committed, the database is closed, the library is uninitialized, and the program exits.
With each successive run of the program, the integer named "COUNT" will be retrieved, incremented and rewritten to the database. You will notice this is all quite seamless: there are no explicit calls to queries, inserts, loads, or saves.
Next, we briefly explore how Texas swizzles pointers at page fault time and handles memory management. This is by no means a complete discussion of these topics. Readers interested in learning more about the Texas system should download the white papers and source code.
Texas uses conventional virtual memory access-protection mechanisms to ensure that the first access to any persistent page is intercepted by the Texas library. The page is loaded from the database and scanned for persistent pointers, each of which is swizzled to an in-memory address. The pages those pointers refer to are, in turn, reserved and access protected. This faulting and reserving process repeats as the program traverses the object hierarchy into unloaded pages, with virtual memory pages always reserved one step ahead of the page actually being referenced. This implies the program can never see an unswizzled pointer: every pointer it can reach is already swizzled, though it may point into an access-protected page of not-yet-loaded objects. The Linux implementation uses the mprotect() system call to set up the access protection on the pages. An in-depth discussion of this topic can be found in the Texas white paper presented at the Fifth International Workshop on Persistent Object Systems [SIN92].
Texas allows you to access multiple databases, each with its own persistent heap. The standard transient heap and stack are also available for non-persistent memory allocation. Texas does not partition its address space into regions, so pages from different heaps may be interleaved within memory. Because each page must belong to a single heap, Texas maintains a separate free list for each heap. A new page is created when the free list is empty or when no available free chunk is large enough. Each new page is partitioned into uniformly sized memory chunks just large enough to hold the object being allocated, and all of the remaining chunks are linked onto the free list. This uniform chunking makes identifying the object headers on a page trivial: only the first header needs to be examined to determine the size of every chunk on that page, and the locations of the other headers follow directly. Each object's header stores its schema information so the object can be identified and correctly swizzled.
While the Hello Persistent World program is not very exciting, it shows the minimal effort needed to make an object (an integer in this case) persistent. The next example demonstrates the power of object databases to capture the relationships between objects. This contrived example shows several many-to-many relationships. It also exposes some of the current deficiencies in the Texas library. The example is a system to track the many different research papers and books that clutter my office. See Figure 1 for a class diagram using non-unified Booch notation. The design file and the source code for both examples are available on my home page at www.qds.com/people/gmeinke.
The class diagram shows class PublishedWork, an abstract base class for all published material. It provides trivial methods for querying the object for its title, price, number of pages, and list of authors. The relationship between an Author and their PublishedWorks is an example of a many-to-many relationship, while the relationship between a Publisher and the Books it has published is one-to-many. Expressing these relationships in a relational database is awkward because of the foreign keys and the intermediate join table that many-to-many relationships require. By contrast, Texas handles them with ordinary C++ containers and stores those containers directly in the object database; the object design needs no compromises such as foreign-key data members.
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to always seem to have the right tool for the job.
Cron traditionally has been considered another such tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn't consider the total cost of ownership, and it doesn't consider the advantages of real processing power, high availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here, just cold, hard, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.