Web Analysis Using Analog
Analog can be configured to use customized log formats, which helps if you have log files in various formats created by different servers. Even though I've used a number of different servers, I've been able to continue using Analog to analyze new and old log files (of different formats) by listing the type of log format before giving the name and path of the log file. I now use the Apache web server's combined log format, which produces a single log file that records the referrer and browser information with the entry for each access. Otherwise, I'd have separate log files for the referrer and browser information and would need to list each of them in Analog's configuration files.
If you're a hostmaster, you can configure Apache to use a different log file for each virtual host. This keeps the information for each host separate and makes using Analog to analyze your virtual host log files much more straightforward. This is done using Apache's virtual host directive:
<VirtualHost vhost1.com>
    ServerAdmin email@example.com
    DocumentRoot /www/docs/vhost1.com
    ServerName vhost1.com
    ErrorLog logs/vhost1.com-error_log
    CustomLog logs/vhost1.com-access_log combined
</VirtualHost>
While you can use Analog with just the analog.cfg file to tell it what to do and where to save its report, if you want to create different reports for virtual hosts and individual pages, it's best to use multiple configuration files. Each configuration file serves a different purpose and can be combined with script files containing command-line switches for Analog.
In this scenario, Analog is run not once, but several times; each run creates a separate report. The analog.cfg file includes only a few base commands that relate to our main site, not the virtual host sites. When creating reports for virtual hosts, I use the -G command-line switch to keep Analog from reading analog.cfg.
The basic arrangement is similar to a pyramid. All major items are in a master.cfg file that covers the broad category of all virtual hosts on our system. Items relating only to a specific virtual host and its general preferences are in the next tier, and individual page.cfg files are in the last tier. This allows me to create specialized setups as needed and still track individual hosts, sites and pages without making major changes.
When Analog is run for a virtual host, the master.cfg file is called first, followed by the master-vhost.cfg (I replace “vhost” with the name of the host when naming the file), and finally, single-page configuration files for separate pages. An example master.cfg file is included here (see Listing 1).
An example vhost.cfg file is shown in Listing 2. As you can see, it's fairly general, since most of the report formatting is handled by the master.cfg file. The vhost.cfg file can be used to create a “total activity” report for the virtual host. The command line (or script file entry), shown without paths for clarity, is:
analog -G +gmaster.cfg +gvhost1.cfg +Ovhost1-total.html
The -G tells Analog not to use analog.cfg (which is used for the main host only). +g is used whenever we use additional configuration files: there's no space between it and the file name. +O designates the output file name: it's the letter O, not the number zero.
Single configuration files give the basic information on the file(s) to include in the report (using the FILEINCLUDE command). The HOSTNAME and HOSTURL directives set the name and link that appear at the top of each report after the words “Web Server Statistics for”. For individual pages, we use the name and URL of the page rather than the host name or URL. A single-page configuration file can be three or more lines, as shown in Listing 3.
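As a rough illustration of how small such a file can be, the following shell sketch prints a three-line single-page configuration. The function name page_cfg, the page title, the URL and the file path are all hypothetical values chosen for this example; only the HOSTNAME, HOSTURL and FILEINCLUDE commands come from the text above.

```shell
#!/bin/sh
# Sketch only: print a three-line single-page configuration for Analog.
# The function name and all example values are hypothetical.
page_cfg() {
    # $1 = title shown after "Web Server Statistics for"
    # $2 = URL the title links to
    # $3 = requested file to include in the report
    printf 'HOSTNAME "%s"\n' "$1"
    printf 'HOSTURL %s\n' "$2"
    printf 'FILEINCLUDE %s\n' "$3"
}

# Print the configuration for a hypothetical widgets page.
page_cfg "Widgets Page" http://vhost1.com/widgets.html /widgets.html
```

Redirecting the output to a file (for example, to widgets.cfg) would give a configuration file that can be passed to Analog with +g.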
Notice that the log file to use, output file and report-formatting commands aren't included; these items are set either in the master.cfg files or within the script file when Analog is run. This lets me use the same information when creating the daily and monthly reports, even though the two reports are very different.
The FILEINCLUDE command causes Analog to search through the logs and retrieve data relating only to the file you've specified. It's a very powerful command, and is normally used in the configuration files for individual pages or sites. It can also be used with a wild card; if I wanted to include all files in the widgets directory, I would use:
FILEINCLUDE /widgets/*
The command line used to create a daily report for this page (all on one line), shown without path information for clarity, is:
analog -G +gmaster.cfg +gmaster-vhost1.cfg +gwidgets.cfg +Owidget.html
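When several pages need their own reports, the same command line can be generated in a loop. The sketch below only prints the commands rather than running Analog; the function name analog_cmd and the page names widgets and gadgets are hypothetical, and the master-vhost1.cfg naming is assumed from the command above.

```shell
#!/bin/sh
# Sketch only: print the Analog command lines for one virtual host.
# Function, host and page names are hypothetical; paths are omitted.

# Build one command: master.cfg first, then the host's file, then an
# optional single-page file, and finally the output file.
analog_cmd() {
    host=$1       # virtual host name, e.g. vhost1
    output=$2     # report file name
    page_cfg=$3   # optional single-page configuration file

    cmd="analog -G +gmaster.cfg +gmaster-$host.cfg"
    if [ -n "$page_cfg" ]; then
        cmd="$cmd +g$page_cfg"
    fi
    printf '%s +O%s\n' "$cmd" "$output"
}

# One command per page; each run of Analog creates a separate report.
for page in widgets gadgets; do
    analog_cmd vhost1 "$page.html" "$page.cfg"
done
```

Printing the commands first makes it easy to check the order of the configuration files before anything is run; piping the output to sh would then execute them.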