PHP as a General-Purpose Language
August 18th, 2004 by Marco Tabini
If PHP is your scripting language of choice when it comes to developing dynamic Web sites, you probably have grown to love its immediacy and power. An estimated ten million Web sites use at least some PHP scripting to generate their pages.
Although most people use PHP primarily as a Web development scripting system, it possesses all the characteristics of a proper general-purpose language that can be useful in a variety of other environments. In this article, I illustrate how it's possible to use the command-line version of PHP to perform complex shell operations, such as manipulating data files, reading and parsing remote XML documents and scheduling important tasks through cron.
The contents of this article are based on the latest version of PHP at the time of this writing, 4.3.0, which was released at the end of 2002. However, you should be able to use older versions of PHP 4 without many problems. I explain the differences you may encounter as necessary.
With the release of PHP 4.3, a new version of the interpreter called command-line interface (or PHP-CLI) is available. PHP-CLI is not a shell as the name implies but, rather, a version of PHP designed to run from the shell. As far as software development is concerned, only a few differences exist between PHP-CLI and its CGI or server API (SAPI) counterparts. For one thing, traditional Apache server variables are not available, as Apache isn't even in the picture, and the HTTP headers are not output when a script is executed. Also, the engine does not use output buffering, because it would be of no benefit in a non-Web environment.
PHP-CLI is created by default when you compile your version of PHP, unless you use the --disable-cli switch when you execute the configuration script. It is not, however, installed by default; you can force make to compile and install it by using a special command:
make install-cli
To verify whether the CLI version of PHP is installed on your server, all you need to do is execute this command:
php -v
The resulting version information should specify whether the CLI or CGI version of PHP is being executed. If you have only the CGI version and don't want to install the CLI, you still can use PHP as a shell-scripting language. Their differences are mostly aesthetic, and their effect can be toned down somewhat by using the right command-line switches when invoking the interpreter.
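A script can also find out for itself which flavor of the interpreter is running it. The following is a minimal sketch built on the standard php_sapi_name() function; the helper names running_from_cli() and describe_sapi() are made up for illustration and do not come from Feeder.

```php
<?php
// Returns true when the script is run by the command-line interpreter.
function running_from_cli()
{
    return php_sapi_name() === 'cli';
}

// Hypothetical helper: describe the interpreter in plain words.
function describe_sapi()
{
    if (running_from_cli()) {
        return 'CLI interpreter: no HTTP headers are printed';
    }
    return 'Other interpreter (' . php_sapi_name() . '): the -q switch may help';
}

echo describe_sapi(), "\n";
```

A check like this lets one script behave sensibly whether it is started from the shell or through a Web server.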
Being a lover of weblogging, I routinely visit a certain number of blogs on the Net. This is a somewhat tedious process, because I don't like the idea of a news aggregator running on my machine on a continuous basis, and I do not see the need to pay for one. It seemed, though, that an RSS aggregator might be a great way to show how some of PHP's powerful features, such as the fopen() wrappers and the built-in XML parsing engine, could be used to create a script that runs from the command line.
An RSS feed is, essentially, a simple XML document that contains information about items published by a news source, such as Linux Journal. Its format consists of a channel container that includes several optional elements, such as a title and description, in addition to a set of item subcontainers. Each of these, in turn, contains a title, a description and a link to the news story it represents.
Typically, a news aggregator loads the information from an arbitrary number of news feeds and presents everything together in a given format, such as HTML. For users, a news aggregator represents a convenient way to create a single point of information for all the news sources of interest.
My PHP-based news aggregator, called Feeder and shown in Listing 1, presents its results in a plain-text e-mail that is sent to the user who executes the script. Feeder loads a list of RSS feeds from a file located in ~/.feeder.rc (Listing 2). The first line of this file contains the e-mail address to which the news feed data should be sent. The contents of the configuration file are loaded using a simple trick: the back-tick operator, which performs exactly the same function as it does in the shell, is used to call the cat command. The output is then split into an array of individual lines using the explode function.
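The configuration step can be sketched roughly as follows. This is not the code from Listing 1: the function name parse_feeder_config() and the sample addresses are assumptions, and the layout (e-mail address on the first line, one feed URL on each following line) is inferred from the description above.

```php
<?php
// Split the raw configuration text into the recipient's e-mail address
// (first non-empty line) and the list of feed URLs (remaining lines).
function parse_feeder_config($raw)
{
    $lines = array();
    foreach (explode("\n", $raw) as $line) {
        $line = trim($line);
        if ($line !== '') {
            $lines[] = $line;       // ignore blank lines
        }
    }
    $email = array_shift($lines);   // null when the file is empty
    return array('email' => $email, 'feeds' => $lines);
}

// In a real script the text would come from the shell, for example:
//   $raw = `cat ~/.feeder.rc`;
// Sample data is used here so that the sketch runs anywhere.
$raw = "user@example.com\nhttp://example.com/a.rss\nhttp://example.org/b.rss\n";

$config = parse_feeder_config($raw);
echo $config['email'], ': ', count($config['feeds']), " feeds\n";
```

Keeping the parsing separate from the back-tick call makes it easy to try the function on sample text before pointing it at a real file.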
The parsing of the XML feed happens in two phases. First, the get_feed function uses the fopen() wrappers to download the feed in 4KB chunks. These are then passed on to an instance of the built-in PHP XML parser, which proceeds to interpret their contents and call ElementStarter(), ElementEnder() and DataHandler(), as needed. These three functions, in turn, parse the contents of the XML file and create a structure of CFeed and CItem instances that represents the feed itself. The script then calls the format_feed function, which scans feed objects and produces a textual version of their contents. Once all the feeds have been parsed and formatted, the resulting message is e-mailed to the intended recipient.
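The chunked read and the three parser callbacks can be sketched as shown here. It is a simplified stand-in for Listing 1, under several assumptions: the function name parse_feed_stream() is invented, plain arrays replace the CFeed and CItem classes, and the handlers are closures (which need a much newer PHP than 4.3) instead of the named functions ElementStarter(), ElementEnder() and DataHandler(). An in-memory stream stands in for the remote feed so the example runs without a network connection.

```php
<?php
// Parse an RSS stream in 4KB chunks. $handle is any readable stream,
// such as one returned by fopen('http://example.com/feed.rss', 'r').
// Returns an array with the channel title and its items, or null on error.
function parse_feed_stream($handle)
{
    $feed = array('title' => '', 'items' => array());
    $path = array();    // stack of the element names currently open
    $item = null;       // the item being built, or null outside <item>

    $parser = xml_parser_create();

    xml_set_element_handler(
        $parser,
        // Start handler: element names arrive upper-cased by default.
        function ($p, $name, $attrs) use (&$path, &$item) {
            $path[] = $name;
            if ($name === 'ITEM') {
                $item = array('title' => '', 'description' => '', 'link' => '');
            }
        },
        // End handler: a closing </item> completes the current item.
        function ($p, $name) use (&$path, &$item, &$feed) {
            array_pop($path);
            if ($name === 'ITEM') {
                $feed['items'][] = $item;
                $item = null;
            }
        }
    );

    xml_set_character_data_handler(
        $parser,
        // Text can arrive in several pieces, so always append.
        function ($p, $data) use (&$path, &$item, &$feed) {
            $depth = count($path);
            $current = strtolower($path[$depth - 1]);
            if ($item !== null && isset($item[$current])) {
                $item[$current] .= $data;
            } elseif ($item === null && $current === 'title'
                && $depth >= 2 && $path[$depth - 2] === 'CHANNEL') {
                $feed['title'] .= $data;
            }
        }
    );

    while (!feof($handle)) {
        $chunk = fread($handle, 4096);
        if ($chunk === false) {
            return null;                        // read error
        }
        if ($chunk !== '' && !xml_parse($parser, $chunk, false)) {
            return null;                        // malformed XML
        }
    }
    if (!xml_parse($parser, '', true)) {        // tell the parser we are done
        return null;
    }
    return $feed;
}

// Demonstration data (hypothetical feed).
$xml = '<rss version="2.0"><channel><title>Example Feed</title>'
     . '<item><title>First</title><description>One</description>'
     . '<link>http://example.com/1</link></item>'
     . '<item><title>Second</title><description>Two</description>'
     . '<link>http://example.com/2</link></item>'
     . '</channel></rss>';

$handle = fopen('php://memory', 'r+');
fwrite($handle, $xml);
rewind($handle);
$feed = parse_feed_stream($handle);
fclose($handle);

echo $feed['title'], ': ', count($feed['items']), " items\n";
```

Feeding the parser one chunk at a time keeps memory use flat regardless of the size of the feed, which is the main reason for reading in 4KB pieces rather than loading the whole document first.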
As a security note, format_feed() uses the wordwrap function to format the description of a news item so it doesn't span more than 70 columns. This helps enhance the readability of the news feed by presenting the user with a more compact look. Prior to PHP 4.3.0, the source code for wordwrap() included an unchecked data buffer that could, in theory, be exploited to execute arbitrary code, thus presenting a security issue. If you're not using the latest version of PHP, you probably should either avoid using wordwrap() or replace it with your home-grown version.
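For reference, the wrapping step might look something like this. The helper name wrap_description() is an assumption, as is the whitespace clean-up before wrapping; only the call to wordwrap() with a width of 70 is taken from the description above.

```php
<?php
// Wrap a news item's description so that lines stay within $width columns.
// Note: a single word longer than $width is left intact by wordwrap().
function wrap_description($text, $width = 70)
{
    // Collapse runs of whitespace so wrapping starts from clean text.
    $text = trim(preg_replace('/\s+/', ' ', $text));
    return wordwrap($text, $width, "\n");
}

$description = str_repeat('lorem ipsum ', 20);
echo wrap_description($description), "\n";
```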
The easiest way to execute a script from the shell is to invoke the PHP interpreter explicitly:
marcot ~# php feeder.php
If you have the CGI version of PHP, you may want to use the -q switch, which causes the interpreter to omit any HTTP headers that are normally required during a Web transaction.
This explicit method, however, is not very practical if you want your users to access the scripts you write conveniently. A better solution consists of making the scripts executable, so they can be invoked directly, as if they were autonomous programs. To do this, first determine the exact location of your PHP executable:
marcot ~# which php
/usr/local/bin/php
The next step consists of creating a shebang: an initial line that instructs the system to pass the remainder of an executable file through a specific application (the PHP engine in our case). The shebang must be the first line of your script, and there can't be any whitespace before it. It starts with the character # and the character !, followed by the name of the executable through which the remainder of the file must be passed. For example, if you're using the CLI version of PHP, your shebang may look like this:
#!/usr/local/bin/php
If you're using the CGI version of the PHP interpreter, you also can pass additional options to it in order to keep it quiet and prevent the usual HTTP headers from being printed out:
#!/usr/local/bin/php -q
The final step consists of making your script executable:
marcot ~# chmod a+x feeder.php
At this stage, you can run the script without explicitly invoking the PHP interpreter; the shell will take care of that for you:
marcot ~# ./feeder.php
As you may have noticed, I have not renamed the script to remove the .php extension. Even though the extension itself is not necessary when running scripts from the shell, its presence makes it easy for text editors such as vim to recognize the file and highlight the source's syntax.
A news aggregator that must be invoked explicitly every time you want to read your news page is not very useful. Therefore, you may want to have your system run it automatically on a specific schedule. The cron dæmon generally is used for this purpose. cron is a simple dæmon that runs in the background and, at fixed intervals, reads through a special file, called crontab, that contains schedule specifications for each of the users on the server. Based on the information contained in the crontab file, cron executes an arbitrary number of shell commands and, optionally, sends an e-mail notification of their results to the user. The crontab file contains entries in the following format:
minute hour day month weekday command
The first five fields indicate the time or times at which a command must be executed. For example:
5 9 13 9 1 /usr/bin/feeder.php
means that the command /usr/bin/feeder.php will be executed at 9:05 AM on September 13. Be aware that when both the day-of-month and weekday fields are restricted, as they are here, standard cron implementations run the command when either field matches, so this entry also fires at 9:05 AM on every Monday (weekday 1) in September. This may sound complicated, but it's an extreme example. Most likely, you want to execute commands on a simpler schedule, like the beginning of every hour. This is accomplished by using the * wild card, which means any. So, for once an hour, on the hour, you would enter:
0 * * * * /usr/bin/feeder.php
And for once a day, at midnight, enter:
0 0 * * * /usr/bin/feeder.php
The time fields allow for even more complex specifications. For example, you can create a list of specific times by separating them with a comma:
0,30 * * * * /usr/bin/feeder.php
This crontab specification causes the command /usr/bin/feeder.php to be run every 30 minutes, on the hour and on the half hour. Similarly, you can specify inclusive ranges of times by separating the two ends with a dash. For example, the following crontab entry:
0 0 * * 1-3 /usr/bin/feeder.php
causes the script to be executed at midnight, Monday through Wednesday.
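To make the field syntax concrete, here is a small sketch that tests a single crontab time field against a value. It is only an illustration of the rules described above, not how cron itself is implemented; the function name cron_field_matches() is made up, and only the * wild card, comma lists and dash ranges are handled.

```php
<?php
// Check whether one crontab time field (for example "0,30" or "1-3")
// matches a given integer value such as the current minute or weekday.
function cron_field_matches($field, $value)
{
    if ($field === '*') {
        return true;                            // wild card matches anything
    }
    foreach (explode(',', $field) as $part) {   // comma-separated list
        if (strpos($part, '-') !== false) {     // inclusive range "low-high"
            list($low, $high) = explode('-', $part, 2);
            if ($value >= (int) $low && $value <= (int) $high) {
                return true;
            }
        } elseif ((int) $part === $value) {     // single number
            return true;
        }
    }
    return false;
}

var_dump(cron_field_matches('0,30', 30));   // minute 30 is in the list
var_dump(cron_field_matches('1-3', 4));     // Thursday is outside Mon-Wed
```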
In order to change the contents of your crontab file, you need to use the crontab utility (crontab -e opens your schedule in an editor), which edits the correct file for you and notifies the dæmon that your schedule has changed. There aren't any special requirements to run a PHP script as a cron job, as long as it does not expect any input from a user.
Even though your PHP-CLI scripts are not outputting HTML through a Web server, you still can use them to manipulate and produce HTML code. Because the script is written rather modularly, converting its output to HTML format involves changing only the format_feed function and modifying the call to mail(). This is done so the e-mail message can be recognized as a valid HTML document by the user's e-mail application.
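The change to the mail() call usually amounts to passing extra headers that declare the message body as HTML. A minimal sketch follows; the helper name html_mail_headers(), the sender address and the UTF-8 character set are all assumptions, so adjust them to match your feeds and your mail setup.

```php
<?php
// Build the additional headers that mark an e-mail body as HTML.
// Headers are separated by CRLF, as mail() expects.
function html_mail_headers($from)
{
    return "From: $from\r\n"
         . "MIME-Version: 1.0\r\n"
         . "Content-Type: text/html; charset=UTF-8";
}

$headers = html_mail_headers('feeder@example.com');
echo $headers, "\n";

// The headers go in the fourth argument of mail(), for example:
//   mail($to, 'Your news feeds', $htmlBody, $headers);
```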
One of the greatest advantages of scripting Web pages with PHP is the ability to mix dynamic statements directly with the static HTML code. As you can see from Listing 3, which shows an updated version of format_feed, this concept still works perfectly even when the script is not outputting to a Web page.
The trick that makes it possible to capture PHP's output in a variable essentially consists of engaging the interpreter's output buffer (disabled by default) by calling ob_start(). Once the appropriate information has been output, the script retrieves the contents of the buffer, then erases it and turns output buffering off with a call to ob_end_clean().
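The buffering technique can be sketched in a few lines. This is not Listing 3 itself: the function name format_item_html(), the markup and the sample item are assumptions, and htmlspecialchars() is added as a precaution against markup in feed text.

```php
<?php
// Render one news item as HTML by mixing static markup with PHP,
// capturing the result in a string instead of printing it.
function format_item_html($title, $link, $description)
{
    ob_start();     // everything output from here on goes into the buffer
    ?>
<h3><a href="<?php echo htmlspecialchars($link); ?>"><?php echo htmlspecialchars($title); ?></a></h3>
<p><?php echo htmlspecialchars($description); ?></p>
<?php
    $html = ob_get_contents();  // copy the buffered output
    ob_end_clean();             // discard the buffer and stop buffering
    return $html;
}

$html = format_item_html('Hello & welcome', 'http://example.com/1', 'A short item.');
echo $html;
```

Because the function returns a string, the same markup can be appended to an e-mail body, written to a file or printed, without changing the template.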
Although the news aggregator script I present in this article performs a rather complex set of functions—from grabbing content off the Web to parsing XML and formatting it in HTML—it requires only about 200 lines of code, including all the comments and blank lines. It is possible to write the same script in Perl or even as a shell script, with the help of some external applications such as wget, expat and sendmail. The latter approach, in my opinion, results in a complicated code base with plenty of opportunities for mistakes.
PHP-CLI rarely is installed by default on a machine running Linux, although you can count on Perl being readily available. Thus, if you have control over the make-up of the server on which you're running scripts and you're comfortable with PHP, there's no reason why you need to learn another language to write most of your shell applications. If, on the other hand, you're writing code to run on a separate machine over which you have no control, you may find PHP a slightly more problematic choice.
PHP
On May 13th, 2005 Frank (not verified) says:
PHP is very easy to use. If you have some experience of C you won't have any problems getting started with PHP. Even HTML coders can start integrating PHP into their pages straight away. But maybe PHP is too simple. What do I mean by that? The simplicity of PHP means that almost anyone can write some scripts and as a result there is a lot of badly designed code out there. This gives PHP a bad name it does not deserve because it is a very powerful tool. PHP is designed for building Web applications that are scalable up to a very large number of users. With PHP 5 many developers finally got the robust support for object oriented programming they were waiting for, and its XML and MySQL support was much improved as well. There is much discussion about whether PHP is "enterprise ready" - I truly believe it is since it reached version five.
Not able to connect to mysql in crontab through PHP
On December 1st, 2004 sandy (not verified) says:
hi there
I am not able to connect to mysql in a PHP script that is running through crontab. When it is time for crontab to run the PHP script, I get an error message in a mail: "Fatal Error - Call to undefined function mysql_connect()" in the script. Otherwise the script runs OK in explorer or through the linux command line, i.e. php filename.php.
I am using like - mysql_connect("localhost",$uname,$pass);
to make the connection.
plz help
Undefined mysql_connect...
On August 1st, 2005 Sami Salloum (not verified) says:
For the last half hour I was trying to deal with the same problem,
And I just remembered that the "dl" function might do the job.. and looky here.. it did!!!! :)
NOW TO THE SUBJECT:
When executed from Cron, or command line, a PHP script might not recognize the MySql functions, this is because the required libraries are not loaded. It can be simply fixed by the "dl" function in the following manner:
dl("mysql.so"); //loads the mysql library/module or perhaps you might be using a different library than mysql.so
After this line.. one can use the mysql functions as one pleases.. hopefully :)
Thank you so much.
On February 2nd, 2007 Thomas Leak (not verified) says:
This solution was a great help to me. Thanks so much. You rock!
Re: PHP as a General-Purpose Language
On August 23rd, 2004 powlow (not verified) says:
this is really a very good solution to general shell scripting. I was doing this to export data to xml files, do straightforward oracle database extraction and mysql backups.
the problem i ran into is that, like mentioned in the article, php is not readily available on all boxes and cannot always be installed...this can really become a problem.
otherwise, an excellent solution, to which php is really well suited.
Re: PHP as a General-Purpose Language
On August 22nd, 2004 Anonymous says:
PHP makes me sad. :(
Re: PHP as a General-Purpose Language
On August 21st, 2004 Anonymous says:
Listing 2 doesn't seem to be a configuration file.
SimpleXML
On August 19th, 2004 Anonymous says:
Have a look at SimpleXML. It is available since PHP5 and it makes life much easier for simple xml-files like RSS.
Re: SimpleXML
On August 20th, 2004 Anonymous says:
Check out this article that shows how to use SimpleXML to parse a RSS feed from php-planet.net