Creating a Centralized Syslog Server

A centralized syslog server was one of the first true SysAdmin tasks that I was given as a Linux Administrator way back in 1997. My boss at the time wanted to pull in log files from various appliances and have me use regexp to search them for certain key words. At the time Linux was still in its infancy, and I had just been dabbling with it in my free time. So, I jumped at the chance to introduce Linux to the company that I had worked for. Did it work? You bet it did! What this post is going to cover is not only how to setup a centralized syslog-ng server, but why you would go about setting one up in the first place.

So what is syslog? Syslog is used in Linux to log system messages (huh, another easy to guess name). Syslog-ng is just a rewrite of the original syslog, that was developed in 1998. Syslog-ng is still being actively developed (as of 2010) by BalaBit IT Security and comes with many more features, including better TCP handling, TLS encryption of messages, and sending messages to a database among other things. Some distributions allow you to install either syslog, rsyslog or syslog-ng. For this article, I'll be focusing on syslog-ng as this is more up to date, and if the reader wishes, can be 'supported' via the company that owns the syslog-ng software by going with their enterprise edition version at a later date.

Now that you've got an overview of syslog-ng, let's talk about just why you would use a centralized syslog-ng server. I am sure there are more than the two reasons that I will bring up, but I can think of at least 2 of them off the top of my head. The first is for security purposes. If you have your routers, firewalls, switches, Linux servers and/or other hardware pointing to a SECURED centralized syslog-ng server, when someone does attempt to attack one of the above devices log files can be safely off-site in a secure location. If syslog files are kept on the device this gives an attacker the ability to clean up their tracks. Granted, they can disable the ability to send log files to an external syslog-ng server, but any and all connections prior to that will be located on the centralized syslog server. The other reason is for convenience. For instance, if you have a server that crashed and is unresponsive, you can check the kernel error logs on your centralized syslog server. If you want to check syslog patterns between various dates over an extended time, regex the log files from the centralized syslog server.

So what do I do? I actually use both approaches at home. Not only do my devices and servers forward all their syslog files to a centralized location, but that location is locked down. The machine in question is a virtual machine with only 1 port open (syslog) and accessible only from the local machine, the syslog files are kept on an external drive. Is it paranoia? Probably a wee bit. But I do know that in my home environment, if my external drive fills up from too many syslog files it won't crash my virtual machine. If somehow something happens to my virtual machine, my host OS won't be affected, if someone does gain access to one of my devices then they can't gain access to my syslog server. Granted if something happened to my host OS then I would have issues with my guest VM's, but we can't always prepare for everything. Okay, I admit it's paranoia in the highest of levels, and for most people this is probably too far.

Before we get started, here's a quick disclaimer. First off, as with all of my previous posts, I do all of my blogpost testing in Debian. In this case I had a virtual machine setup for Debian 6.0.1, thus your mileage may vary. Also I won't be getting into how to properly secure your server, best practices on where to place syslog files, or how to setup anything other than syslog-ng. I leave that up to the reader. This blog post just covers the basics of a centralized syslog-ng server.

Installing and Configuring - Server Side

Installing syslog-ng isn't as hard as it looks especially if you're installing from packages. For Debian: apt-get install syslog-ng, for Redhat: yum install syslog-ng. For those of you that enjoy a good source install: http://www.balabit.com/downloads/files?path=/syslog-ng/sources/3.2.4/source/syslog-ng_3.2.4.tar.gz Download, unpackage, configure, make & make install. Once you have syslog-ng installed, we can get to configuring the server side.

Global Options

First thing you need to do is locate your syslog-ng configuration file. The default install (for Debian variants) is '/etc/syslog-ng/syslog-ng.conf'. Before editing any configuration files it is best practice to make a copy of the original configuration file prior to any changes. This is just in case something happens and you need to go back to the original configuration file. I tend to label my original configuration files with .orig (in this case: syslog-ng.conf.orig). Now that you have made a copy of your configuration file, let's open it up with your editor of choice and get started.

long_hostnames(default: off ) - For this post I'm using syslong-ng OSE version 3.1, and I actually can't find long_hostnames in the global configuration guide online. I'll go with long hostnames as a default of off, being fully qualified domain names.

flush_lines(default: 0 ) - Sets the number of lines flushed to a destination at a time. Setting to 0 sends messages as they are received, but keep in mind setting this number higher may increase message latency. This is useful on the client side of syslog-ng. You would keep xx messages on the client before flushing to the destination so that you are not flooding the main syslog-ng server if you have alot of traffic coming from a server.

use_dns(default: no ) - Options: yes, no, persist_only. This one is up to you and your environment. If your syslog-ng is behind a firewall and not accessible to the outside of the world then 'yes' would be appropriate. If accessible to the outside of the world, set to 'no' in order to stop possible DoS attacks. I set mine to 'persist_only' which checks my /etc/hosts file on my syslog-ng server to resolve hostnames, without relying on dns servers.

use_fqdn(default: no ) - Set the Fully Qualified Domain Name, your choice. As a home network I only have one internal domain name. So mine defaults to 'no'. Setting to 'yes' would have your clients hostname show up as: 'hostA.domain.com' instead of 'hostA'

owner(default: root ) - Owner of output files

group(default: adm ) - Group of output files

perm(default: 0640 ) - Permission of output files. Defaults to 640 - Owner Read-Write, Group Read, Other none.

stats_freq(default: 0 ) - Time (in seconds) between two STATS (statistics messages about dropped log messages) Messages. 0 disables STATS messaging.

bad_hostname(default: ^gconfd$ ) - Regex containing hostnames that should not be handled as hostnames..in this case gconfd. If you have more than a handful of servers than I woudl recommend hostnames, unless of course you remember every ip address in your domain..if you, I applaud you.

Now that's it for the 'Default' Global configuration options, but there are many more that you can use. I also use the following:

normalize_hostnames(yes) - This converts all hostnames to lowercase. Some of my devices have uppercase hostnames, and sometimes I get carried away with a new host and Uppercase the first letter of the hostname. This will just lowercase all characters for easier readability.

keep_hostname(yes) - This keeps the hostname if running through a relay or an external server, so that when the host finally reaches the central server the hostname comes with it instead of relying on DNS (or /etc/hosts). If you're using $HOST macro, this should be enabled.

In a bigger and more important environment (read: not soho) I would be setting stats_freq(600) and stats_level(2) in order to retrieve statistics messages from the server. In most soho environments you might be gathering syslog data from 3-5 devices, at which point the odds of actually losing data are pretty slim. In a larger enterprise environment of several hundred devices reporting to centeral syslog servers, enabling statistics allows the sys admin the ability to check on stats and possibly lost messages.

Your global configuration options (if you want it to mirror mine) would look like the following:


options {(off); 
flush_lines(0); 
use_dns(persist_only);
use_fqdn(no); 
owner("root"); 
group("adm"); 
perm(0640); 
stats_freq(0);
bad_hostname("^gconfd$"); 
normalize_hostnames(yes);
keep_hostname(yes); 
};

Setting up Listener

Setting up the listener for syslog-ng is actually only a few lines in the configuration file. A typical listener line looks like this:

source s_net { tcp((ip(127.0.0.1) port(1000) max-connections 5000)); udp (); };

source s_net = Network listener

tcp(ip(127.0.0.1) = Listen on localhost. If you have multiple NIC's, or want to specify an ip to bind this to, change 127.0.0.1 to the ip address of that specific network card

port (1000) = Listen to TCP port 1000

max connections = Allow 5000 simultaneous connections (stops the dreaded 'run away server' syndrome)

udp () = Some devices send their syslog messages via udp, so enable udp if you can't specify tcp and port number.

encrypt(allow) = This could be an entire blog post in itself. Syslog-ng allows for encrypted (TLS, certificate based) syslog messages

Mine for example looks like this:


# Listen on TCP Port 1000 and UDP Port 514, Max 500 Connections source s_net {
tcp(port(1000) max-connections(500)); udp(););

Destination - What goes up must come down. In this case what gets sent out must get put somewhere. Once a message is received from the syslog-ng server it's got to go somewhere. Thus the destination section of the syslog-ng.conf file. As you can see, the default covers your *nix destination for server messages on the local machine. But what about incoming messages? Where do they go? Good question, by default they will send their syslog messages to the subsystem specified in syslog-ng. For instance if it's a message that would be classified as an authentication message (/var/log/auth) then it will dump the message into the syslog-ng's /var/log/auth.log file with the appended information (hostname, date/time, etc).

If that's actually what you want to accomplish, a bunch of servers dumping to the same file as your main server, then I guess the task is complete. But syslog-ng can do so much more than that. If I do much more on server side configuration though I fear this will end up being a chapter in a book. Destinations can be flat files, pipes into other applications, SQL Databases (mysql, MS SQL, Oracle, etc), Remote Log servers, and Terminal Windows. I'll be focusing on flat files and assume you are doing the same for now.

Now the way I setup my centralized syslog server might be different then the way you setup yours. In my case I have a folder that has each hostname and the syslogs from the hostname are located in the folder. For Example: /mount/syslog/macha, /mount/syslog/beag, and so on and so forth. Logrotate takes care of zipping, removing (old files are backed up to a remote server just in case) and cleaning up log files.

My Destination directive looks like this:


destination d_net_auth {
file("/var/log/syslog/remote/$HOSTNAME/auth.log"); }; destination d_net_cron {
file("/var/log/syslog/remote/$HOSTNAME/cron.log"); }; destination d_net_daemon
{ file("/var/log/syslog/remote/$HOSTNAME/daemon.log"); }; destination
d_net_kern { file("/var/log/syslog/remote/$HOSTNAME/kern.log"); }; destination
d_net_lpr { file("/var/log/syslog/remote/$HOSTNAME/lpr.log"); }; destination
d_net_mail { file("/var/log/syslog/remote/$HOSTNAME/mail.log"); }; destination
d_net_syslog { file("/var/log/syslog/remote/$HOSTNAME/syslog.log"); };
destination d_net_user { file("/var/log/syslog/remote/$HOSTNAME/user.log"); };
destination d_net_user { file("/var/log/syslog/remote/$HOSTNAME/uucp.log"); };
destination d_net_debug { file("/var/log/syslog/remote/$HOSTNAME/debug"); };
destination d_net_error { file("/var/log/syslog/remote/$HOSTNAME/error"); };
destination d_net_messages { file("/var/log/syslog/remote/$HOSTNAME/messages");
}; destination d_net_mailinfo {
file("/var/log/syslog/remote/$HOSTNAME/mail/mail.info"); }; destination
d_net_mailwarn { file("/var/log/syslog/remote/$HOSTNAME/mail/mail.warn"); };
destination d_net_mailerr {
file("/var/log/syslog/remote/$HOSTNAME/mail/mail.err"); }; 

Now in theory, the syslog-ng server is supposed to create the directories necessary for the files to drop into (as specified in the global policies) but sometimes I run into problems where the directories were not created properly and the errors in syslog-ng are reported in /var/log/errors. To alleviate future pain and suffering I tend to create the host and log files as I go, anything I'm missing will end up in /var/log/errors and I can create them later.

For those of you that are veteran syslog-ng users, you might wonder why I split my localhost destination and my remote(off-site clients) destinations when in theory I could have created a d_auth and had my regular localhost filter into a folder as well. The reason behind that was that I wanted to separate my localhost syslog traffic from remote traffic - more configuration lines, but easier on me. Also, I'm not messing with the Linux subsystem when it's out looking for where to put regular log files.

Filtering - The ability for Syslog-NG to filter its messages is what really seperates the 'men from the boys' in the syslog battle. The filtering is what really sets syslog-ng apart. Granted I separate my hosts in folders defined in $HOST variable, but filtering is the real meat and potatoes. With filtering I can (and do) the following: Filter Firewall logs looking for certain key words such as port scans, that get dumped into 1 folder, DDOS attacks that get filtered into another folder. My voip adaptor sends syslog events and I filter based on those messages into individual files instead of a single file. Filtering also allows you to specify multiple hosts to filter based on, and into multiple destinations. Not only that, but you can use regular expressions in filtering.

Filtering expressions are created like: filter <identifier> { expression; };

<identifier> is the name you give your filter. <expression> contains the function, and boolean operators (and,or,not).

An example for my firewall would be:


filter firewall_ddos_filter { host("10.1.1.1") and match("Denial of Service"
value("MESSAGE")); };

This filter is called 'firewall_ddos_filter, it listens for incoming syslog messages from 10.1.1.1 with a message of 'Denial of Service'. To complete the filter you need a log statement:

log firewall_ddos_filter { source(s_net); filter(firewall_ddos_filter); destination(d_net_firewall_ddos); };

In my above destination I would add a destination for firewall DDOS Attacks, port scanning, etc. This makes it easier to separate log files from servers/devices that do not use the standard *nix logging facilities, or easier for a system admin to filter logs coming out of a firewall (or many firewalls filtered into one log).

If you want to use multiple 'firewall' hosts (as an example) do NOT use just add them in and create a log/filter rule using a boolean operator of 'and'. It will not work, and you beat your head on the desk for many hours to come. Instead, use the 'or' boolean operator as such:

filter firewall_ddos_filter { host("10.1.1.1") or host ("10.1.1.2") and match("Denial of Service" value("MESSAGE")((; };

My 'Default' Filtering directive looks like this (Beautified for this post but they call fit in 'paragraph' form as long as there is a semi-colon seperating each case):


filter f_dbg { level(debug); }; 
filter f_info { level(info); }; 
filter f_notice{ level(notice); }; 
filter f_warn { level(warn); }; 
filter f_err { level(err);  }; 
filter f_crit { level(crit .. emerg); };

filter f_debug { level(debug) and not facility(auth, authpriv, news, mail); };
filter f_error { level(err .. emerg) ; }; 
filter f_messages { level(info,notice,warn) and not facility(auth,authpriv,cron,daemon,mail,news);
};

filter f_auth { facility(auth, authpriv) and not filter(f_debug); }; 
filter f_cron { facility(cron) and not filter(f_debug); }; 
filter f_daemon { facility(daemon) and not filter(f_debug); }; 
filter f_kern { facility(kern) and not filter(f_debug); }; 
filter f_lpr { facility(lpr) and not filter(f_debug);}; 
filter f_local { facility(local0, local1, local3, local4, local5, local6, local7) and not filter(f_debug); }; 
filter f_mail { facility(mail) and not filter(f_debug); }; 
filter f_news { facility(news) and not filter(f_debug); };
filter f_syslog3 { not facility(auth, authpriv, mail) and not filter(f_debug); }; 
filter f_user { facility(user) and not filter(f_debug); }; filter f_uucp { facility(uucp) and not filter(f_debug); };

filter f_cnews { level(notice, err, crit) and facility(news); }; 
filter f_cother { level(debug, info, notice, warn) or facility(daemon, mail); };

filter f_ppp { facility(local2) and not filter(f_debug); }; 
filter f_console { level(warn .. emerg); };

Statistics

There's nothing more I enjoy better than some good statistics. When I run any server or service, be it at the house or at work I want to see what my server has processed over time. Beginning with version 3.1, syslog-ng now has a syslog-ng-ctl stats utility which has greatly simplified grabbing log files. Prior to 3.1 to fetch statistic files you would run: echo STATS | nc -U /var/run/syslog-ng.ctl.

Because I'm a regex geek I'm not thrilled with the semi-colons in the output of syslog-ng-ctl stats thus I run: syslog-ng-ctl stats | sed 's|;|\t|g' to clean up the output.

What you have when you type the above command is 6 columns: SourceName, SourceID, SourceInstance, State, Type and Number.

SourceName - The name of the Source, for instance: destination, source, global, center

SourceID - The ID you gave the source (a previous example was firewall_ddos_filter, other examples would be: d_mail, d_net_user, etc)

SourceInstance - The destination of the Source Instance such as a filename, or the name of an application for a program source (sql) or destination

State: - Status of the object: a (Active - Currently active and receiving data), d (Dynamic - Not continuously available) o (Once active but stopped receiving messages such as an orphaned object)

Type - Type of Statistic such as: Processed: Number of Messages that reached their destination Dropped: Number of dropped messages Stored: Number of messaged stored in message Queue waiting to be sent to destination Suppressed (not sent): Number of Suppressed Messaged Stamp: Timestamp of Last message sent. These statistics are reset when the syslog-ng service is reset.

Number: Number of Messages

Log Rotate, Log Rotate, LOG ROTATE

Was that a clear enough message for you? Rotating your message logs will save your butt in the log run. Without rotating your logs your log disk space will just continue to grow and grow eventually filling up your hard drive. Not only will log rotate save space, but it will make searching for log files on specific dates easier than pulling up a 50MB log file that you didn't set into log rotate and searching for a specific date. Depending on your distro, logrotate is located in /etc/logrotate.conf. As this isn't a blogpost on logrotate, I'll leave your configuration up to your imagination and give you an example on how I rotate my log files:

/var/log/remote/*/ { rotate 5 weekly missingok create }

This goes through /var/log/remote/*/ every week and rotates my logs. Logs are rotated for 1 month at which point I have a cronjob that tar-zips my old logs and they are moved off to a backup location where they are kept for another month before being rotated off. In a business environment of course logs would be kept for however long management and legal dictates, but for a home environment I feel 2 months of logs is good enough to troubleshoot any problems that might have come up in that time.

Syslog Client

As each server and device is different in their setup, I won't get too in-depth into this. Syslog communicates on UDP port 514, but as I stated earlier above, I also set the main syslog server to communicate on TCP port 1000 for other devices. This allows the syslog-ng server to listen on two ports, 514 UDP for devices that can't change their ports, and TCP 1000 for servers that you can specify port numbers. Why did I put TCP 1000 and not TCP 514? Because Linux uses tcp 514 for rsh (remote shell) which would have caused some problems with my (and other's) host system. If you plan on running syslog-ng on the outside of the world (and I would assume your setting authentication, and using TLS encryption) then setting a TCP port that's not typical would be your best bet.

1. For devices all you should need to do is tell the device to point to the hostname and make sure either UDP 514 or TCP 1000 is the destination

2. For rsyslog clients add the following line:

For TCP:  *.* @@ipaddress:1000
For UDP:  *.* @ipaddress:514

3. For syslog-ng clients add the following line:

*New syslog Protocol* syslog(host tranport [options];

*old syslog protocol* destination d_tcp { syslog(ip("remoteip")
transport("tcp") port(1000) localport(999)}; };

destination d_udp { syslog(ip("remoteip") transport("udp") port(514)
localport(999)}; };

Conclusion

Well there you have it, a birds eye view of syslog-ng. There is plenty more that you can learn about syslog-ng, as I just went into the basics of getting started. From here you can get into macros, increased filtering, and TLS/Certificate based encryption of syslog messages (which I might cover in a later blog post). By sending your syslog messages to a centralized syslog server, and backing up said syslog server, you can rest assured that your system messages are secure and easy to get to when you need them.

As promised earlier, here are the links to get you started with syslog-ng:

The syslog-ng Open Source Edition 3.1 Administrator Guide (HTML) http://www.balabit.com/sites/default/files/documents/syslog-ng-ose-v3.1-guide-admin-en.html/bk01-toc.html

The syslog-ng Open Source Edition 3.1 Administrator Guide (PDF) http://www.balabit.com/support/documentation/syslog-ng-pe-v3.2-guide-admin-en_1.pdf

All Documentation: http://www.balabit.com/support/documentation

______________________

www.jaysonbroughton.com

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

nxlog

Anonymous's picture

For centralized log collection I also recommend nxlog. You can use it on both Linux and Windows, and has SSL support for forwarding logs in addition to the features syslog-ng has.
NXLOG is also open source.

Rather than using logrotate,

Anonymous's picture

Rather than using logrotate, just use Date macros in the destination definitions, and syslog-ng will start new logfiles intself at midnight. You also end up with nice directory structures where log files are grouped by date.

Use Splunk!! it's the best

Anonymous's picture

Use Splunk!! it's the best centralized syslog (and search/reporting) solution available. DL the file and you'll have a beautiful solution up and running in 10 minutes. - really.

Splunk can collect syslog traffic (and other network feeds) or a light weight agent can be used to run commands and collect(index) their output.

Splunk works great, depending

loadedmind's picture

Splunk works great, depending on the amount of data it has to parse/index. Then, you just about need a freaking supercomputer to handle that data.

Splunk

Jayson Broughton's picture

True, I'll agree with you on the Splunk. I've used Splunk in the past and I love it. But for home users with less than 10 servers/devices splunk can be over-kill, with IT departments on a thin budget (or those non-profits) that can't afford Spunks licensing for over xxx ammt of data/day it can put a stop to purchasing. I swear by splunk for the middle of the line setups (20-30 servers with less than 500MB of syslog traffic a day). But to setup and monitor a splunk server in a soho environment would probably be overkill when a small syslog-ng server with filtering would do the trick. Just my 2c. :-)

Alternatives to Splunk

Scott McCarty's picture

Another alternative is petit with syslog-ng, http://crunchtools.com/centralizing-log-files/

On the other hand, if you really need the power of a web gui, the is also Logzilla.

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState