Creating Web Plots on Demand
The first section of Listing 1 requires little explanation. I just assign variables for file names and directories that will be used later. This is the section you might change if your web server's directories are different from mine.
The Perl script is meant to be started when a web browser requests it. In other words, it is a CGI program. A Common Gateway Interface (CGI) program provides a standard mechanism for providing “interactive content” over the Internet. Instead of requesting a static HTML document, the web browser asks the web server to run a program that will dynamically create an HTML document. CGI programs can be written in any language, but Perl is by far the most common language for CGI programming today.
My CGI script will actually run as a stand-alone program from the command line, although it isn't particularly useful when run that way. Running as a CGI program, it gains one important aspect: access to the QUERY_STRING environment variable. The QUERY_STRING environment variable is one way to pass information to a CGI program. If QUERY_STRING has been defined, then our program expects it to contain a date. This date should be in dd/mmm/yyyy format (e.g., 12/Sep/1997). That's the same format Apache uses to log its dates.
If QUERY_STRING isn't defined, we'll assume we should look for access data for today's date. In either case, the date we're seeking is stored in the variable $date.
In section 3 of Listing 1, we open the Apache access log and begin looking for lines that contain our target date. When we find a date that matches, we extract the hour and minute and place them in the variables $inhour and $inmin.
It's at this point that we run into a slight conceptual problem. We want to plot time of day on the x axis of our plot, but what we actually have is hours and minutes. What value will be plotted on the x axis? What we want is a time line from 12:01 AM to midnight. If we just turned times into integers, we would get values like 1024 for 10:24 AM. gnuplot could plot this, but since there are only 60 minutes in an hour, there would be a gap at the end of each plotted hour. There are many ways to solve this problem. The one I used was simply to scale the minutes 0 to 59 to a value between 0 and 99. The hours are then multiplied by 100 and the converted minutes are added to them. Thus, the time 1:30 PM becomes the integer value 1350. Our plot will only have tic marks for hours, so no one will notice our little conversion.
So in this section of code, we convert the hours and minutes to an integer value and use it as an array index for an array of counters called $accesses.
gnuplot can plot the data it reads from a text file. We need to write a file for gnuplot, where each line of the file contains two integer values. The first value will be our time value (the x axis) and the second value will be the number of web hits at that time (the y axis). The file will contain 2401 values. Here are a few lines from an example file:
0808 1 0809 0 0810 1 0811 2 0812 0 0813 21 0814 0 0815 12
This file shows a few minutes during the 8 o'clock hour. We create the file by printing out the data in our $accesses array. Here we encounter another little “gotcha”. Perl creates something called “sparse arrays”. This just means that array elements are created as they are defined. In other words, if there were no accesses at 8:12 AM, as in our sample data above, then no corresponding array element is created. It's undefined.
Normally in Perl, you can traverse an array using a foreach loop. In our case though, we need to output data for all time intervals, even those times when no accesses took place. We need this so we have a contiguous set of data for gnuplot to graph.
This is actually very easy to do. In section 4 of Listing 1, we set up a simple for loop. Using Perl's defined function, we determine if an array element exists. If it does, we simply print the index value as our x value and the access count as our y value. If the array element isn't defined, we know there were no accesses at that time, so we print out a zero as our y value.
Practical Task Scheduling Deployment
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.View Now!
|The Firebird Project's Firebird Relational Database||Jul 29, 2016|
|Stunnel Security for Oracle||Jul 28, 2016|
|SUSE LLC's SUSE Manager||Jul 21, 2016|
|My +1 Sword of Productivity||Jul 20, 2016|
|Non-Linux FOSS: Caffeine!||Jul 19, 2016|
|Murat Yener and Onur Dundar's Expert Android Studio (Wrox)||Jul 18, 2016|
- Stunnel Security for Oracle
- The Firebird Project's Firebird Relational Database
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- SUSE LLC's SUSE Manager
- Managing Linux Using Puppet
- Non-Linux FOSS: Caffeine!
- My +1 Sword of Productivity
- Google's SwiftShader Released
- SuperTuxKart 0.9.2 Released
- Doing for User Space What We Did for Kernel Space
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide