Pagesat High Speed News
This article is a discussion of several aspects of my company's news system, namely the principles of operation and the hardware and software utilized, in order to guide the novice news seeker through a successful implementation of Pagesat's High Speed News feed. Our “wish” list for a news system included:
A low-cost solution that would not impact local network traffic or chew up our Internet bandwidth
A system that could be monitored and modified remotely
A system that could support more than just a few simultaneous news readers
A system that worked now
The news system that we have now set up consists of four major components:
Pagesat High Speed News antenna and receiver
News “receiver” machine connected to high-speed receiver, Ethernet and SLIP attached
Master news machine, Ethernet attached
Slave news machine, Ethernet attached
All three news machines run Slackware Linux 1.2.8 as the operating system. INN 1.4 processes the news feed. We are currently using the “hsdist2.0b.tar” software to decode data from the receiver. This software contains “Forward Error Correction” code, which eliminates or drastically reduces data loss caused by less-than-perfect satellite reception.
We have three Intel-based machines. The receiver machine is a 486-33 with 8MB RAM and a 500MB IDE hard disk. The master news machine is a little beefier: a P133 with 32MB RAM, a 1GB SCSI disk for the OS, a 4GB fast SCSI-II disk to hold the news hierarchy and data files, and another 4GB fast SCSI-II disk to contain INN and other toys. The relationship of news and the disk subsystem is simple: both should be big and fast. At the current rate, starting from scratch, my 4GB disk is full of news in 5.5 days. We use the Buslogic BT-456-C PCI SCSI controller because it is twice as fast as the 16-bit Adaptec 1542C. Having the Buslogic helps in particular on the daily expire, which went from 3+ hours to a mere 35 minutes. All the machines are Ethernet-attached, and one of the machines has a modem for remote control.
The slave machine is just that: a slave. It's identical to the master in hardware configuration, and it simply receives everything that the master “feeds” it, which is everything. Why do we have it? In a pinch it can be configured as the “master” if some disaster strikes the master, and it can also be used as a primary or secondary news reader machine for all our clients.
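The fill rate quoted above gives a quick way to size a spool disk. The following is a rough sketch of that arithmetic, assuming that 4GB means 4096MB and that the feed volume stays constant; the function names are illustrative only and are not part of any package.

```shell
#!/bin/sh
# Rough spool-sizing arithmetic (illustrative helper names).

# daily_mb DISK_MB DAYS_TO_FILL: average feed volume in MB per day.
daily_mb() {
    awk -v disk="$1" -v days="$2" 'BEGIN { printf "%.0f\n", disk / days }'
}

# spool_mb MB_PER_DAY KEEP_DAYS: space needed to hold KEEP_DAYS of news.
spool_mb() {
    awk -v rate="$1" -v keep="$2" 'BEGIN { printf "%.0f\n", rate * keep }'
}

rate=$(daily_mb 4096 5.5)     # 4GB disk (assumed 4096MB), full in 5.5 days
echo "feed volume: about ${rate} MB per day"
echo "3-day retention: about $(spool_mb "$rate" 3) MB"
```

Under those assumptions the feed works out to roughly 745MB a day, so the retention period chosen for expire largely determines how much of the 4GB disk stays free.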
The number one rule is to keep operation straightforward and simple. Cron-managed batch jobs were our choice: I can't write C code, but I can write simple shell scripts. I wanted a little more monitoring capability, so I added extra processing to the Pagesat software. We accumulate news in half-hour bursts, then process it into the news system on the hour and the half hour. It currently takes, on average, fifteen minutes to process the previous half hour's data. At 15 and 45 minutes past the hour, we “feed” the slave, sending everything we have received to that point. We run a nightly expire on the master and slave to get rid of the old news and prepare for the next day.
The “receiver” machine runs both the PSFRX and PSNEWS programs to receive the data and process it into the .gz data files. These files are stored on a disk that is NFS-mounted read/write on the master. The master copies the files at the specified intervals onto its own disks and deletes them from the receiver disk. The master then processes the news into the system. With this configuration we can take down the master machine for whatever reason and continue to accumulate news on the receiver, processing it whenever the master comes on-line again.
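The copy-then-delete step on the master can be sketched as a small shell function. This is not our production script. The function name and the directory paths are assumptions made for illustration, and the sketch only shows the idea of deleting a batch from the receiver after it has been copied successfully.

```shell
#!/bin/sh
# Sketch of the master's "pull" step. Paths and names here are assumptions.

# pull_batches SRC_DIR DEST_DIR
# Copy each .gz batch from the NFS-mounted receiver directory to a local
# directory, and delete the original only if the copy succeeded.
pull_batches() {
    src=$1
    dest=$2
    for f in "$src"/*.gz; do
        [ -f "$f" ] || continue          # no batches waiting
        if cp "$f" "$dest"/; then
            rm -f "$f"
        else
            echo "copy failed, leaving $f on the receiver" >&2
        fi
    done
}

# Example call (hypothetical mount point and local directory):
# pull_batches /mnt/receiver /var/tmp/satnews
```

A real script would also need to skip a batch that the receiver is still writing. How to detect that depends on the Pagesat software and is not covered by this sketch.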
Why build it this way? Three reasons: it's dirt cheap, it's efficient, it works. Add it up. The total cost for software is ZERO. Hardware costs are minimal, because PCs are a lot less expensive than workstations, and disk drive prices keep dropping every day. And RAM is really cheap these days: 16MB for $100 or less.
Have you priced a leased line lately? The cost is maybe $200 a month for a 64Kbps circuit to pipe your news to you, a circuit that can't even support a full feed. Want to slurp two or more days of old news from your provider across your 28.8Kbps modem AND try to surf AND do anything else at the same time?
If you don't already have your Linux system(s) up and running, do it now. Run down to your local software house and pick up a CD-ROM containing the latest version of Linux (I like Slackware), follow the instructions and get it installed. It's easy; all it takes is a little time. Take the extra time needed to customize your kernel in order to save RAM. Next, get the X Window System up and running, so that you can monitor several things simultaneously. Make sure your TCP/IP is working, be it LAN or SLIP/PPP, to allow posting capability. Now you're ready to set up the news system. We chose to obtain a source code version of INN off the Internet rather than use the distributed version. The key files to read are the FAQs in /usr/lib/news/tools.linux and the README files in the base directory. These files explore the configuration options and operating procedures.
Now it's time to build your news repository. First, fetch the latest “active” file from ftp.pagesat.net. Then write a simple script to strip out and retain the newsgroup name, and append “00000000 000000001 y” to each entry to reset the news article counters. Make your modified file the “active” file. Now run /usr/lib/news/bin/makehistory and watch a lot of your disk space be consumed by the directory structure being built to house the news data. Next, you will need to edit some of the INN control files in /usr/lib/news. The following examples are excerpts from our working files, with explanations. Feel free to copy and/or modify to suit your configurations.
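Before turning to those control files, here is a minimal sketch of the “simple script” for resetting the active file. It assumes that the newsgroup name is the first whitespace-separated field on each line, and it appends the counter string exactly as quoted above. The function name and the file names in the example are placeholders.

```shell
#!/bin/sh
# reset_active: read an active file on stdin, keep only the newsgroup name
# (the first whitespace-separated field) and append the reset counters
# quoted in the text. Blank lines are skipped.
reset_active() {
    awk 'NF > 0 { print $1 " 00000000 000000001 y" }'
}

# Example (file names are placeholders):
# reset_active < active.pagesat > active.new
```

Check the result by eye before installing it as the “active” file, since every group ends up with the same flag.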
## File: control.ctl
## mail notification to root for all control
## functions, and create new newsgroups.
all:*:*:mail
checkgroups:*:*:mail
ihave:*:*:mail
sendme:*:*:mail
newgroup:*:*:doit=mail
rmgroup:*:*:mail
sendsys:*:*:mail
senduuname:*:*:mail
version:*:*:mail
## File: expire.ctl
## expire control and junk after 1 day, keep
## 2 newsgroups for 90 days, keep biz.pagesat
## forever, expire all other news after 3 days.
## (the last matching line wins, so the catch-all comes first)
/remember/:1
*:A:3:3:3
control:A:1:1:1
junk:A:1:1:1
news.software.nntp:A:90:90:90
comp.os.linux*:A:90:90:90
biz.pagesat:A:never:never:never
## File: inn.conf
## our org, server and domain... please use your own.
organization: Webworks Internet Services
server: newsfeed.webworks.net
domain: webworks.net
## File: newsfeeds
## feed this machine and slave everything, output
## posts to slave and pagesat.
## exclude some posting from pagesat
ME:*::
slave:*:Tf,Wnm:
pagesat/jolt.pagesat.net,news.pagesat.net,pagesat.net,\
pubxfer.news.psi.net,psinntp,unknowna:*,\
!junk*,!local*,!control*:Tf,Wnm:
## File: nnrp.access
## allow/disallow newsreader/nntp access
*:: -no- : -no- :!*
*.webworks.net:Read Post:::*
## File: nntpsend.ctl
## the FQDN of all the machine names that we intend to feed
slave:slave.webworks.net:1m:-t300
pagesat:news.pagesat.net:1m:-t300
Once again, the main thing to remember is to follow the directions. Read the documentation that comes with the dish and receiver. Grab a compass and protractor, an extension cord and the tools necessary to assemble the dish. Don't forget a beer or two, a lawn chair and a friend with two hands to help. Go out into your yard and plug everything together. Then, using your compass and protractor, aim the dish in the general direction of the satellite to obtain a tone signal. This tone will help you orient the dish to the proper location, so that you can decide where to mount it permanently—a position that should be free of current and future obstructions. Once you get the antenna mounted, attach it to your computer, and start up PSFRX -v to see if you're pointing at the correct satellite. If you are, you should see a series of dots representing data blocks—it isn't a continuous flow, so be patient. If you see other characters like C and S, which represent errors, try re-aiming the dish a little, twisting the LNB for proper polarization. You really need a friend within earshot to fine-tune the aiming of the dish. If you're getting data, you're aimed at the right place. Now you can re-attach the receiver next to the dish for fine-tuning. Using the tone and meter, you can really zero in on the satellite. Once done, go back into the house, and re-attach the receiver to your PC: you're ready to start receiving the news!
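While re-aiming, it can help to tally good blocks against errors instead of eyeballing the stream. The sketch below assumes only what is described above: a “.” marks a data block, and “C” and “S” mark errors. It also assumes that the output has been captured to a file; whether and how PSFRX output can be redirected is not something this sketch verifies. The function name and the log file name are hypothetical.

```shell
#!/bin/sh
# count_reception: summarize a captured stretch of "PSFRX -v" style output
# read from stdin. Assumes "." is a good data block and "C"/"S" are errors.
count_reception() {
    awk '{
        line = $0
        good += gsub(/\./, "", line)     # gsub returns the number of matches
        bad  += gsub(/[CS]/, "", line)
    } END { printf "good=%d errors=%d\n", good, bad }'
}

# Example (psfrx.log is a hypothetical capture file):
# count_reception < psfrx.log
```

A falling error count between two captures suggests the last adjustment to the dish or LNB moved in the right direction.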
Start up INN with /usr/lib/news/etc/rc.news, then start the PSFRX and PSNEWS programs. You should start seeing news batches filling up your spool queue. Set cron to run the PROCESSSATNEWS program each half hour or so. Pick a time and set cron to run news.daily once a day. Watch your disk fill up with hundreds of megabytes of news daily. Now you're a member of the INN crowd; pick your favorite newsgroups (e.g., biz.pagesat, comp.os.linux.* and news.*) and start learning more about how to tailor your system to be exactly as you want it.
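The cron schedule described in this article can be written down as a handful of crontab entries. The sketch below prints them so that they can be reviewed before being installed. Every path in it is an assumption (including where PROCESSSATNEWS, nntpsend and news.daily live on your system), and the 03:00 time for the nightly run is an arbitrary choice.

```shell
#!/bin/sh
# news_crontab: print crontab entries matching the schedule described in
# this article. All paths and the 03:00 expire time are assumptions; adjust
# them to your installation before use.
news_crontab() {
    cat <<'EOF'
# process accumulated satellite news on the hour and half hour
0,30 * * * * /usr/lib/news/local/PROCESSSATNEWS
# feed the slave at 15 and 45 minutes past the hour
15,45 * * * * /usr/lib/news/bin/nntpsend
# nightly expire and housekeeping
0 3 * * * /usr/lib/news/bin/news.daily
EOF
}

news_crontab
```

Once the entries look right, they can be installed for the news user by piping the output into `crontab -`. Be aware that this replaces that user's existing crontab.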
Rich Myers (email@example.com) cut his programming teeth on IBM iron using 370 Assembler language back in 1980, when mainframes were “in”. In the last half of the 80s when the dinosaurs were starting to die off and PCs were sprouting everywhere, he was manager of a network of SUN workstations. Then came Corporate Takeovers, with countless changes to the almighty LANs and WANs, and late nights and weekends keeping up with the Joneses. And throughout all this, he bought not even one winning lottery ticket. Which brings us to now—a piece of Internet Pie for me, please.