Web Analysis Using Analog
Analog can be configured to use customized log formats, which is a very good thing if you happen to have log files in various formats created by different servers. Even though I've used a number of different servers, I've been able to continue using Analog to analyze new and old log files (of different formats) by listing the type of log format before giving the name and path of the log file. I now use the Apache web server's combined log format, which produces a common log file that lists the referrer and browser information with the log entry for each access. Otherwise, I'd have separate log files, one for the referrer and another for the browser, and would need to include these log files when working with Analog's configuration files.
If you're a hostmaster, you can configure Apache to use a different log file for each virtual host. This keeps the information for each host separate and makes using Analog to analyze your virtual host log files much more straightforward. This is done using Apache's virtual host directive:
<VirtualHost vhost1.com> ServerAdmin firstname.lastname@example.org DocumentRoot /www/docs/vhhost1.com ServerName vhost1.com ErrorLog logs/vhost1.com-error_log CustomLog logs/vhost1.com-access_log combined </VirtualHost>
While you can use Analog with just the analog.cfg file to tell it what to do and where to save its report, if you want to create different reports for virtual hosts and individual pages, it's best to use multiple configuration files. Each configuration file serves a different purpose and can be combined with script files containing command-line switches for Analog.
In this scenario, Analog is run not once, but several times; each run creates a separate report. The analog.cfg file includes only a very few base commands that relate to our main site, not the virtual host sites. When creating reports for virtual hosts, I exclude analog.cfg from being called with the -G command-line switch.
The basic arrangement is similar to a pyramid format. All major items are in a master.cfg file to cover the broad category of all virtual hosts on our system. Items relating only to a specific virtual host and their general preferences are in the next tier, and finally, individual page.cfg files are in the last category. This allows me to create specialized setups as needed and still track individual hosts, sites and pages without making major changes.
When Analog is run for a virtual host, the master.cfg file is called first, followed by the master-vhost.cfg (I replace “vhost” with the name of the host when naming the file), and finally, single-page configuration files for separate pages. An example master.cfg file is included here (see Listing 1).
An example vhost.cfg file is shown in Listing 2, and as you can see, it's fairly general, since most of the report formatting and such is handled by the master.cfg file. The vhost.cfg file can be used to create a “total activity” report for the virtual host. The command-line prompt (or script file), shown without paths for clarity, is:
analog -G +gmaster.cfg +gvhost1.cfg +Ovhost1-total.html
The -G tells Analog not to use analog.cfg (which is used for the main host only). +g is used whenever we use additional configuration files: there's no space between it and the file name. +O designates the output file name: it's the letter O, not the number zero.
Single configuration files are used to give the basic information on the files(s) to include in the report (using the FILEINCLUDE command). The HOSTNAME and HOSTURL directives are the items that will appear at the top of each report after the words “Web Server Statistics for”. For individual pages, we use the name and URL of the page rather than the host name or URL. A single-page configuration file can be three or more lines, as shown in Listing 3.
Notice that the log file to use, output file and report-formatting commands aren't included; these items are set either in the master.cfg files or within the script file when Analog is run. This lets me use the same information when creating the daily and monthly reports, even though the two reports are very different.
The FILEINCLUDE command causes Analog to search through the logs and retrieve data relating to only the file you've specified. It's a very powerful command, and is normally used in the configuration files for individual pages or sites. It can also be used with a wild card; if I wanted to include all files in the widgets directory, I would use:
The command line used to create a daily report for this page (all on one line), shown without path information for clarity, is:
analog -G +gmaster.cfg +gmaster-vhost1.cfg +gwidgets.cfg +Owidget.html
Fast/Flexible Linux OS Recovery
On Demand Now
In this live one-hour webinar, learn how to enhance your existing backup strategies for complete disaster recovery preparedness using Storix System Backup Administrator (SBAdmin), a highly flexible full-system recovery solution for UNIX and Linux systems.
Join Linux Journal's Shawn Powers and David Huffman, President/CEO, Storix, Inc.
Free to Linux Journal readers.Register Now!
- The Italian Army Switches to LibreOffice
- Download "Linux Management with Red Hat Satellite: Measuring Business Impact and ROI"
- Linux Mint 18
- Oracle vs. Google: Round 2
- The FBI and the Mozilla Foundation Lock Horns over Known Security Hole
- Varnish Software's Varnish Massive Storage Engine
- Devuan Beta Release
- Privacy and the New Math
- Ben Rady's Serverless Single Page Apps (The Pragmatic Programmers)
Until recently, IBM’s Power Platform was looked upon as being the system that hosted IBM’s flavor of UNIX and proprietary operating system called IBM i. These servers often are found in medium-size businesses running ERP, CRM and financials for on-premise customers. By enabling the Power platform to run the Linux OS, IBM now has positioned Power to be the platform of choice for those already running Linux that are facing scalability issues, especially customers looking at analytics, big data or cloud computing.
￼Running Linux on IBM’s Power hardware offers some obvious benefits, including improved processing speed and memory bandwidth, inherent security, and simpler deployment and management. But if you look beyond the impressive architecture, you’ll also find an open ecosystem that has given rise to a strong, innovative community, as well as an inventory of system and network management applications that really help leverage the benefits offered by running Linux on Power.Get the Guide