Squid-Based Traffic Control and Management System

When Web traffic became a major use of the organization's network, this university put in a control system to track and limit access, using the open-source Squid caching system.
Squid-Based Internet User Access Control and Management System

Before we start, we should mention that the file paths here are always relative to the Squid source base directory, which, in our case, is /usr/local/src/squid-2.5STABLE7/. Detailed information on obtaining, compiling and using Squid is available from the Squid site.

Let us now consider some characteristics of Squid, taken from the Squid Programming Guide.

Squid is a single-process proxy server. Every client HTTP request is handled by the main process. Its execution progresses as a sequence of callback functions. The callback function is executed when I/O is ready to occur or some other event has happened. As a callback function completes, it registers the next callback function for the subsequent I/O.

At the core of Squid are the select(2) or the poll(2) system calls, which work by waiting for I/O events on a set of file descriptors. Squid uses them to process I/O on all open file descriptors. comm_select() is the function that issues the select() system call. It scans the entire fd_table[] array looking for handler functions. For each ready descriptor, the handler is called. Handler functions are registered with the commSetSelect() function. The close handlers normally are called from comm_close(). The job of the close handlers is to deallocate data structures associated with the file descriptor. For this reason, comm_close() normally must be the last function in a sequence.
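The dispatch pattern described above can be illustrated with a minimal sketch. This is not Squid's code: every name prefixed with sketch_ is hypothetical, the table holds only read handlers, and the one-shot behavior (a handler is cleared before it is called and must register itself again if it wants more I/O) is an assumption modeled on the description of callbacks registering the next callback.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

#define SKETCH_FD_MAX 64

/* Hypothetical handler type; Squid's own handler typedefs differ in detail. */
typedef void sketch_handler(int fd, void *data);

struct sketch_fde {
    sketch_handler *read_handler;
    void *read_data;
};

static struct sketch_fde sketch_fd_table[SKETCH_FD_MAX];

/* Register a one-shot read handler, loosely in the spirit of commSetSelect(). */
void sketch_set_select(int fd, sketch_handler *handler, void *data)
{
    assert(fd >= 0 && fd < SKETCH_FD_MAX);
    sketch_fd_table[fd].read_handler = handler;
    sketch_fd_table[fd].read_data = data;
}

/* One pass of the loop: wait up to msec milliseconds, then call the handler
 * of every ready descriptor. Returns the number of handlers called, or -1
 * if select() fails. */
int sketch_select_once(int msec)
{
    fd_set readfds;
    struct timeval tv;
    int fd, maxfd = -1, called = 0;

    FD_ZERO(&readfds);
    for (fd = 0; fd < SKETCH_FD_MAX; fd++) {
        if (sketch_fd_table[fd].read_handler != NULL) {
            FD_SET(fd, &readfds);
            maxfd = fd;
        }
    }
    if (maxfd < 0)
        return 0;               /* nothing registered, nothing to wait for */
    tv.tv_sec = msec / 1000;
    tv.tv_usec = (msec % 1000) * 1000;
    if (select(maxfd + 1, &readfds, NULL, NULL, &tv) < 0)
        return -1;
    for (fd = 0; fd <= maxfd; fd++) {
        sketch_handler *h = sketch_fd_table[fd].read_handler;
        void *data = sketch_fd_table[fd].read_data;
        if (!FD_ISSET(fd, &readfds) || h == NULL)
            continue;
        /* Clear the slot first so the handler may register a successor. */
        sketch_fd_table[fd].read_handler = NULL;
        sketch_fd_table[fd].read_data = NULL;
        h(fd, data);
        called++;
    }
    return called;
}

/* Example handler: read up to 64 bytes and add the count to *(int *)data. */
void sketch_count_bytes(int fd, void *data)
{
    char buf[64];
    ssize_t n = read(fd, buf, sizeof(buf));
    if (n > 0)
        *(int *)data += (int)n;
}
```

A caller would register sketch_count_bytes on the read end of a pipe or socket and call sketch_select_once() repeatedly. Because the slot is cleared before the handler runs, a descriptor with no newly registered handler simply drops out of the next pass.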

An interesting Squid feature is its per-IP-address client database. The corresponding code is in the file src/client_db.c. The main idea is a hash-indexed table, client_table, consisting of pointers to ClientInfo structures. These structures hold information on the HTTP client and ICP proxy server connections, for example, the request, traffic and time counters. The following is the respective code from the file src/structs.h:

struct _ClientInfo {
    /* must be first */
    hash_link hash;
    struct in_addr addr;
    struct {
	int result_hist[LOG_TYPE_MAX];
	int n_requests;
	kb_t kbytes_in;
	kb_t kbytes_out;
	kb_t hit_kbytes_out;
    } Http, Icp;
    struct {
	time_t time;
	int n_req;
	int n_denied;
    } cutoff;
    /* number of current established connections */
    int n_established;
    time_t last_seen;
};

Here are some important global and local functions for managing the client table:

  • clientdbInit()—global function that initializes the client table.

  • clientdbUpdate()—global function that updates the record in the table or adds a new record when needed.

  • clientdbFreeMemory()—global function that deletes the table and releases the allocated memory.

  • clientdbAdd()—local function that is called by the function clientdbUpdate() and adds the record into the table and schedules the garbage records collecting procedure.

  • clientdbFreeItem()—local function that is called by the function clientdbFreeMemory() and removes the single record from the table.

  • clientdbScheduledGC(), clientdbGC() and clientdbStartGC()—local functions that implement the garbage records collection procedure.
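The division of labor among these functions (an update routine that looks up a record and adds it when missing, plus a routine that frees the whole table) can be sketched roughly as follows. This is an illustration only, not the code from src/client_db.c: the sketch_ names, the chained hash table, the hash function and the trimmed-down record are all assumptions, and garbage collection is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_BUCKETS 64

/* Hypothetical, trimmed-down stand-in for ClientInfo. */
struct sketch_client {
    struct sketch_client *next;  /* chain within one bucket */
    uint32_t addr;               /* IPv4 address used as the key */
    int n_requests;
    long kbytes_out;
};

static struct sketch_client *sketch_table[SKETCH_BUCKETS];

static unsigned sketch_hash(uint32_t addr)
{
    return (unsigned)((addr ^ (addr >> 16)) % SKETCH_BUCKETS);
}

static struct sketch_client *sketch_find(uint32_t addr)
{
    struct sketch_client *c = sketch_table[sketch_hash(addr)];
    while (c != NULL && c->addr != addr)
        c = c->next;
    return c;
}

/* Local helper in the role of clientdbAdd(): insert a zeroed record. */
static struct sketch_client *sketch_add(uint32_t addr)
{
    unsigned b = sketch_hash(addr);
    struct sketch_client *c = calloc(1, sizeof(*c));
    if (c == NULL)
        return NULL;
    c->addr = addr;
    c->next = sketch_table[b];
    sketch_table[b] = c;
    return c;
}

/* In the role of clientdbUpdate(): update the record, adding it if needed.
 * Returns the record, or NULL if memory allocation fails. */
struct sketch_client *sketch_update(uint32_t addr, long kbytes)
{
    struct sketch_client *c = sketch_find(addr);
    assert(kbytes >= 0);
    if (c == NULL)
        c = sketch_add(addr);
    if (c == NULL)
        return NULL;
    c->n_requests++;
    c->kbytes_out += kbytes;
    return c;
}

/* In the role of clientdbFreeMemory(): release every record. */
void sketch_free_memory(void)
{
    unsigned b;
    for (b = 0; b < SKETCH_BUCKETS; b++) {
        while (sketch_table[b] != NULL) {
            struct sketch_client *c = sketch_table[b];
            sketch_table[b] = c->next;
            free(c);
        }
    }
}
```

Two updates for the same address land in the same record, so the counters accumulate, while a different address gets a record of its own.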

Comparing the requirements for the system being developed with the capabilities of the existing client database, we can say that the key basic features are already implemented, except for indexing clients by user name. The other significant shortcoming of the existing client statistics database is that the information is refreshed only after the client has already received the entire requested content.

In our development, we implemented a second, independent per-user client database, based on the code from the src/client_db.c file with some modifications. User statistics are kept in the structure ClientInfo_sb. The following is the corresponding code from the file src/structs.h:

#ifdef SB_INCLUDE
#define SB_CLIENT_NAME_MAX_LENGTH 16
struct _ClientInfo_sb {
    /* must be the first */
    hash_link hash;
    char *name;
    unsigned int GID;
    struct {
	long value;
	char type;
	long cur;
	time_t lu;
    } lmt;
    /* HTTP Request Counter */
    int Counter;
};
#endif

The client database is managed by the following global and local functions, quite similar to those listed previously:

  • clientdbInit_sb()—global function that initializes the client table.

  • clientdbUpdate_sb()—global function that updates the record in the table, disconnects the client when the limit is exceeded and adds a new record when needed by calling the function clientdbAdd_sb().

  • clientdbEstablished_sb()—global function that counts the client's requests, periodically flushes the corresponding record into the file, disconnects the client when the limit is exceeded and adds a new record when needed by calling the function clientdbAdd_sb().

  • clientdbFreeMemory_sb()—global function that deletes the table and releases the allocated memory.

  • clientdbAdd_sb()—local function that is called by the function clientdbUpdate_sb() and adds the record into the table and schedules the garbage records collecting procedure.

  • clientdbFlushItem_sb()—local function that is called by the functions clientdbEstablished_sb() and clientdbFreeItem_sb() and flushes the particular record into the file.

  • clientdbFreeItem_sb()—local function that is called by the function clientdbFreeMemory_sb() and removes the single record from the table.

  • clientdbSheduledGC_sb(), clientdbGC_sb() and clientdbStartGC_sb()—local functions that implement the garbage records collecting procedure.
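To make the limit check performed by clientdbUpdate_sb() more concrete, here is a minimal sketch of per-user traffic accounting. It is not the authors' implementation. The sketch_ names are hypothetical, and the meanings given to the lmt fields (value as a quota in bytes with 0 meaning unlimited, cur as bytes consumed so far, lu as the time of the last update) are guesses based only on the field names in ClientInfo_sb. The type field is omitted because its semantics are not described.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for ClientInfo_sb; field meanings are assumed. */
struct sketch_user {
    const char *name;
    struct {
        long value;   /* assumed: traffic quota in bytes, 0 = unlimited */
        long cur;     /* assumed: bytes consumed so far */
        time_t lu;    /* assumed: time of the last update */
    } lmt;
    int counter;      /* HTTP request counter, unused in this sketch */
};

/* Account for `size` bytes just sent to the user. Returns 1 if the quota
 * is now reached or exceeded (the caller would disconnect the client),
 * 0 otherwise. */
int sketch_update_user(struct sketch_user *u, long size, time_t now)
{
    assert(u != NULL && size >= 0);
    u->lmt.cur += size;
    u->lmt.lu = now;
    return u->lmt.value > 0 && u->lmt.cur >= u->lmt.value;
}
```

Calling such a routine for each portion of data sent, rather than once per completed response, is what lets a quota take effect in the middle of a large download.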

The client database is initialized and released in the file src/main.c, in the same way as the original table. What distinguishes our code is the calls to the functions clientdbUpdate_sb() and clientdbEstablished_sb() in the client-side routines in the file src/client_side.c:

  • a call to the function clientdbUpdate_sb() from the auxiliary function clientWriteComplete(), which is responsible for sending portions of data to the client.

  • a call to the function clientdbEstablished_sb() from the function clientReadRequest(), which processes the client request.

Listing 1 shows the corresponding fragments of the functions clientWriteComplete() and clientReadRequest() from the file src/client_side.c.
