One of the core ideas of software engineering is to divide a large project into separate modules. Modularization makes it easier to customize a system for your own specific needs, allowing you to write new modules and remove unnecessary ones. Using modules also makes it easier to distribute the work among many different programmers. A quick review of the Linux, Apache, Perl and Python modules freely available on the Internet makes this point very clear.
OpenACS 4 (Open Architecture Community System), the toolkit for creating on-line communities that was initially examined here last month, dramatically improves on earlier versions in a number of ways. But perhaps the most important change is the division of functionality into modules, which are called “packages” in the OpenACS world. Because each package is self-contained, and because it is possible to connect any package with any URL, OpenACS 4 has made it easier than ever to create flexible community web sites.
This month, we take an initial look at OpenACS packages, including how we can install and use them. (This article assumes that you already have installed PostgreSQL, AOLserver and the core OpenACS functionality, as described in the last two installments of At the Forge.) Since most OpenACS sites use some of the functionality that comes with the built-in applications, rather than write everything from scratch, installing packages is something every OpenACS administrator needs to know how to do soon after installing the core system.
Consider the following simple CGI program written in Perl:
#!/usr/bin/perl
use strict;
use warnings;
use CGI;

my $query = CGI->new;

print $query->header();
print $query->start_html(-title => "Testing");
print "<p>This is some text</p>\n";
print $query->end_html();
If I install this program as test.pl in my web server's CGI directory, others can see the results of its execution by retrieving www.lerner.co.il/cgi-bin/test.pl. If I want this program to be available under a number of different names, I can copy it; the name that I choose will be reflected in the URL.
Things get a bit trickier if my server-side application consists of several CGI programs rather than a single program. If I want to have several copies of such an application suite running on my system, I must copy all of the program files. In many cases, it'll be easier to place all of the files in a directory, then copy the directory and all of its contents each time I want the application to run somewhere else.
Making such copies carries potential synchronization problems: if I fix a problem in one copy of a program, I will have to make the same change to every copy of the program. I can resolve some of these problems with CVS, but I also could eliminate this issue by keeping only one copy of my program on the filesystem. Then I could configure the web server (either Apache or AOLserver) to treat one or more URLs as requests for my program.
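The idea of one program answering at several URLs can be sketched in a few lines of Python. This is only an illustration of the principle, not how Apache or AOLserver is configured: the mount table, the handler and the URLs are all hypothetical names chosen for the example.

```python
# Hypothetical sketch: one copy of an application, reachable at several URLs.

def bboard_handler(mount_point, rest):
    """The single copy of the application; it is told which URL it was reached through."""
    shown = rest or "/"
    return f"bboard mounted at {mount_point}, serving '{shown}'"

# Assumed mount table: two URL prefixes share the same handler function.
MOUNTS = {
    "/foo/bboard": bboard_handler,
    "/bar/bboard": bboard_handler,
}

def dispatch(path):
    """Find the longest mounted prefix that matches the path and call its handler."""
    for mount_point in sorted(MOUNTS, key=len, reverse=True):
        if path == mount_point or path.startswith(mount_point + "/"):
            rest = path[len(mount_point):]
            return MOUNTS[mount_point](mount_point, rest)
    return None  # nothing is mounted at this URL
```

Because the handler exists only once, a bug fixed in it is fixed for every URL at which it is mounted.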
Now consider what happens if this application suite takes advantage of a relational database. Installing the application is no longer as simple as copying files or configuring the HTTP server. Now, we also need to have some way of resolving potential conflicts and confusion between the copies of a single application, such that the forums at /foo/bboard don't get confused with /bar/bboard in the database. If and when we remove our application from the system, we also will need a way to remove the database tables it used.
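One common way to keep two copies of an application apart in a shared database is to tag every row with the copy that owns it. The following Python sketch shows that idea under loud assumptions: SQLite stands in for PostgreSQL or Oracle, and the table, its columns and the helper functions are hypothetical, not part of the OpenACS data model.

```python
import sqlite3

# In-memory SQLite database standing in for the real database server.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE bboard_messages (
        instance_url TEXT NOT NULL,   -- which mounted copy owns this row
        subject      TEXT NOT NULL
    )
""")

def post(instance_url, subject):
    """Store a message on behalf of one copy of the application."""
    conn.execute("INSERT INTO bboard_messages VALUES (?, ?)", (instance_url, subject))

def subjects(instance_url):
    """Return only the messages that belong to the given copy."""
    rows = conn.execute(
        "SELECT subject FROM bboard_messages WHERE instance_url = ? ORDER BY rowid",
        (instance_url,),
    )
    return [row[0] for row in rows]

def remove_application():
    """Removing the application means removing its tables as well."""
    conn.execute("DROP TABLE bboard_messages")

post("/foo/bboard", "Hello from foo")
post("/bar/bboard", "Hello from bar")
```

Each copy sees only its own rows, and a single drop step cleans up after the application when it is removed.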
In OpenACS, the solution to this problem is APM, the ArsDigita Package Manager. APM was originally written by ArsDigita, a now-defunct consulting company that wrote ACS, the predecessor to OpenACS. ACS worked only with an Oracle database server, whereas OpenACS works with both Oracle and PostgreSQL.
APM handles a number of different issues inherent in server-side applications that use a database, including version control, scripts for table creation and removal and database independence. APM also has been designed to allow each copy of an application to have independent configuration variables and to be associated with one or more separate URLs.
An APM package really is nothing more than a .tar.gz file with an .apm extension. The file is typically named like this: packagename-0.5d.apm, where packagename is the unique name associated with the package. The version string in this example, 0.5d, identifies development version 0.5. Opening a package with tar -zxvf reveals a standard file and directory structure:
packagename.info, an XML file describing the contents of the package. This file, normally created automatically by the OpenACS APM application, tells OpenACS which files are associated with the package and which configuration parameters are available for the user. It also indicates whether the package is a singleton (i.e., provides services for the rest of the system) or an application (i.e., can be run from a particular URL).
The sql directory is where the table-creation (and table-destruction) scripts are located. Originally, when ACS supported only Oracle, this directory normally would contain two files: packagename-create.sql and packagename-drop.sql. The APM installer would run the create script when the package was installed and the drop script when it was removed. (The create script often runs INSERTs as well, seeding database tables with standard data for later use.)
Now that OpenACS supports PostgreSQL as well as Oracle, this directory structure has changed somewhat. Within the sql directory are oracle and postgresql directories that have parallel scripts for creating and dropping the tables. Each installed copy of OpenACS knows which database it is running against (based on the value of a variable in AOLserver's nsd.tcl configuration file) and thus chooses the appropriate script.
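The selection of a script by database type can be sketched as follows. This is an illustration in Python rather than the APM installer's actual code: the function name and its arguments are hypothetical, and the paths assume that the per-database directories contain files named in the same packagename-create.sql and packagename-drop.sql style described above.

```python
from pathlib import PurePosixPath

# The two database servers that the article says OpenACS supports.
SUPPORTED_DATABASES = ("oracle", "postgresql")

def sql_script(package, database, action):
    """Return the path, relative to the package root, of a create or drop script."""
    if database not in SUPPORTED_DATABASES:
        raise ValueError(f"unsupported database: {database}")
    if action not in ("create", "drop"):
        raise ValueError(f"unknown action: {action}")
    # Assumed layout: sql/<database>/<package>-<action>.sql
    return PurePosixPath("sql", database, f"{package}-{action}.sql")
```

For example, sql_script("packagename", "postgresql", "create") yields sql/postgresql/packagename-create.sql, while an unsupported database name raises an error instead of silently choosing a script.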
The tcl directory holds Tcl files containing procedure definitions. These procedures are loaded into AOLserver at startup time, giving them a speed advantage over those defined inside .tcl (or .adp) pages elsewhere in the OpenACS system.
The www directory contains what we normally expect to be associated with a web application. This is where we put our .tcl and .adp pages, as well as any graphics and auxiliary files associated with the application. OpenACS's query dispatcher, which makes it possible for server-side programs to support multiple database servers, works with XML files with an .xql extension; these also go in the www directory.
Because of how the OpenACS templating system works, it's not unusual for a single web page to use three files: a .tcl file for setting variables, an .xql file that defines the SQL query used to retrieve rows from the database and an .adp file that is responsible for turning the information into HTML.
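The division of labor among the three files can be imitated in Python. This is an analogy only and not OpenACS code: the query name, the SQL text, the function names and the template are all hypothetical, and Python's string.Template stands in for the .adp templating language.

```python
from string import Template

# Role of the .xql file: named SQL queries kept apart from the page logic.
QUERIES = {
    "recent_messages": "SELECT subject FROM messages ORDER BY posted DESC",
}

# Role of the .tcl file: run the query and set the variables the template needs.
def page_logic(run_query):
    subjects = run_query(QUERIES["recent_messages"])
    return {"title": "Recent messages", "count": len(subjects)}

# Role of the .adp file: turn those variables into HTML.
PAGE_TEMPLATE = Template("<h1>$title</h1><p>$count messages</p>")

def render(run_query):
    """Combine the three roles; run_query is whatever function talks to the database."""
    return PAGE_TEMPLATE.substitute(page_logic(run_query))
```

Because the SQL lives in its own place, a different database server could be supported by swapping the query text without touching the logic or the template.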
APMs also may contain a number of other files, such as database upgrade and migration scripts (for those users who are upgrading from a previous version of the package), regression tests (to ensure that the package works correctly), administration facilities (under www/admin) and HTML-formatted package documentation (under www/doc).
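Since a package is a gzip-compressed tar archive, ordinary tools can inspect it. The Python sketch below builds a small stand-in archive in memory and lists its members, and it also splits a file name such as packagename-0.5d.apm into name and version. The member list is an assumption made for illustration (the individual file names under tcl and www are hypothetical), and the naming pattern is inferred from the single example above rather than from any APM specification.

```python
import io
import re
import tarfile

def parse_apm_name(filename):
    """Split a name like 'packagename-0.5d.apm' into (package name, version string)."""
    match = re.fullmatch(r"(?P<name>.+)-(?P<version>[^-]+)\.apm", filename)
    if match is None:
        raise ValueError(f"not an .apm file name: {filename}")
    return match.group("name"), match.group("version")

def build_demo_archive(members):
    """Create a gzip-compressed tar archive in memory holding empty placeholder files."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        for name in members:
            info = tarfile.TarInfo(name)
            info.size = 0
            archive.addfile(info, io.BytesIO(b""))
    return buffer.getvalue()

def list_archive(data):
    """Return the member names of a .tar.gz archive, in archive order."""
    with tarfile.open(fileobj=io.BytesIO(data), mode="r:gz") as archive:
        return archive.getnames()

# Hypothetical package contents following the directory structure described above.
DEMO_MEMBERS = [
    "packagename.info",
    "sql/postgresql/packagename-create.sql",
    "sql/postgresql/packagename-drop.sql",
    "tcl/packagename-procs.tcl",
    "www/index.tcl",
    "www/index.xql",
    "www/index.adp",
]
```

Calling list_archive(build_demo_archive(DEMO_MEMBERS)) returns the same names that were put in, which is roughly what listing a real package with tar would show for its own files.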