Pagesat High Speed News
This article describes my company's news system: its principles of operation and the hardware and software it uses. The aim is to guide the novice news seeker through a successful implementation of Pagesat's High Speed News feed. Our “wish” list for a news system included:
A low-cost solution that would not impact local network traffic or chew up our Internet bandwidth
A system that could be monitored and modified remotely
A system that could support more than just a few simultaneous news readers
A system that worked now
The news system we have set up consists of four major components:
Pagesat High Speed News antennae and receiver
News “receiver” machine connected to high-speed receiver, Ethernet and SLIP attached
Master news machine, Ethernet attached
Slave news machine, Ethernet attached
All three news machines run Slackware Linux 1.2.8 as the operating system. INN 1.4 processes the news feed. We are currently using the “hsdist2.0b.tar” software to decode data from the receiver. This software contains “Forward Error Correction” code, which eliminates or drastically reduces data loss caused by less-than-perfect satellite reception.
We have three Intel-based machines. The receiver machine is a 486-33 with 8MB RAM and a 500MB IDE hard disk. The master news machine is a little beefier: a P133 with 32MB RAM, a 1GB SCSI disk for the OS, a 4GB fast SCSI-II disk to hold the news hierarchy and data files, and another 4GB fast SCSI-II disk for INN and other toys. The relationship between news and the disk subsystem is simple: both should be big and fast. At the current rate, starting from scratch, my 4GB disk fills with news in 5.5 days. We use the Buslogic BT-456-C PCI SCSI controller because it is twice as fast as the 16-bit Adaptec 1542C. The Buslogic helps in particular with the daily expire, which went from 3+ hours to a mere 35 minutes. All the machines are Ethernet-attached, and one of them has a modem for remote control.
The slave machine is just that: a slave. It is identical to the master in hardware configuration and receives everything the master “feeds” it, which is everything. Why do we have it? In a pinch it can be reconfigured as the “master” should some disaster strike the master, and it can also serve as a primary or secondary news reader machine for all our clients.
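The disk and expire figures quoted above can be turned into rough planning numbers. The following is only a back-of-the-envelope sketch: it assumes “4GB” means 4 × 10^9 bytes and that “3+ hours” is at least 180 minutes, and the variable names are illustrative.

```python
# Back-of-the-envelope figures derived from the numbers quoted in the article.
# Assumption: "4GB" is read as 4 * 10**9 bytes (decimal gigabytes).
DISK_BYTES = 4 * 10**9
DAYS_TO_FILL = 5.5
SECONDS_PER_DAY = 86_400

# Average volume of news arriving per day and per hour.
bytes_per_day = DISK_BYTES / DAYS_TO_FILL
bytes_per_hour = bytes_per_day / 24

# Sustained bit rate needed to carry that volume around the clock.
avg_bits_per_second = bytes_per_day * 8 / SECONDS_PER_DAY

# Expire run time: "3+ hours" is treated as a lower bound of 180 minutes,
# so the computed speedup is also a lower bound.
expire_before_min = 3 * 60
expire_after_min = 35
expire_speedup = expire_before_min / expire_after_min

print(f"news per day:     {bytes_per_day / 10**6:.0f} MB")
print(f"news per hour:    {bytes_per_hour / 10**6:.1f} MB")
print(f"average bit rate: {avg_bits_per_second / 1000:.1f} kbit/s")
print(f"expire speedup:   at least {expire_speedup:.1f}x")
```

Under these assumptions the feed averages about 727 MB per day, or roughly 67 kbit/s sustained, and the daily expire runs at least five times faster than before.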
The number one rule is to keep operation straightforward and simple. Cron-managed batch jobs were our choice: I can't write C code, but I can write simple shell scripts. I wanted a little more monitoring capability, so I added extra processing to the Pagesat software.
We accumulate news in half-hour batches, then process it into the news system on the hour and the half hour. It currently takes, on average, fifteen minutes to process the previous half hour's data. At 15 and 45 minutes past the hour, we “feed” the slave, sending everything we have received to that point. We run a nightly expire on the master and slave to get rid of the old news and prepare for the next day.
The “receiver” machine runs both the PSFRX and PSNEWS programs to receive the data and process it into the .gz data files. These files are stored on a disk that the master NFS-mounts read/write. At the specified intervals, the master copies the files onto its own disks and deletes them from the receiver disk. The master then processes the news into the system. With this configuration we can take down the master machine for whatever reason and continue to accumulate news on the receiver, processing it whenever the master comes back on-line.
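The schedule and the copy-then-delete step described above can be sketched as follows. This is an illustration in Python, not the shell scripts actually used on the system: the function names `job_for_minute` and `drain_spool`, the job labels, and the directory arguments are all hypothetical, and the sketch assumes that every `.gz` file found in the receiver directory is already complete.

```python
import shutil
from pathlib import Path

# Minutes past the hour at which each batch job starts, per the article.
# The job labels are illustrative, not names used by the real system.
SCHEDULE = {0: "process", 30: "process", 15: "feed-slave", 45: "feed-slave"}


def job_for_minute(minute):
    """Return the job that starts at this minute, or None if nothing starts."""
    return SCHEDULE.get(minute)


def drain_spool(receiver_dir, local_dir):
    """Copy each .gz batch to local_dir, then delete it from receiver_dir.

    Returns the local copies, oldest first. A file is removed from the
    receiver only after its copy has the same size as the original.
    """
    receiver_dir = Path(receiver_dir)
    local_dir = Path(local_dir)
    local_dir.mkdir(parents=True, exist_ok=True)

    # Oldest first (ties broken by name) so batches keep their arrival order.
    batches = sorted(receiver_dir.glob("*.gz"),
                     key=lambda p: (p.stat().st_mtime, p.name))
    moved = []
    for src in batches:
        dst = local_dir / src.name
        shutil.copy2(src, dst)
        if dst.stat().st_size == src.stat().st_size:
            src.unlink()  # safe to delete: the local copy is complete
            moved.append(dst)
    return moved
```

A cron entry firing at minutes 0 and 30 could call something like `drain_spool` on the NFS-mounted receiver directory and then hand the returned files to the news system. Because files are deleted only after a verified copy, anything left behind while the master is down is simply picked up on the next run.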
We chose this approach for three reasons: it's dirt cheap, it's efficient, and it works. Add it up. The total cost for software is ZERO. Hardware costs are minimal, because PCs are a lot less expensive than workstations, and disk drive prices keep dropping every day. RAM is really cheap these days, too: 16MB for $100 or less.
Have you priced a leased line lately? The cost is maybe $200 a month for a 64Kbps circuit to pipe your news to you, a circuit that can't even support a full feed. Do you want to slurp two or more days of old news from your provider across your 28.8 modem AND try to surf AND do anything else at the same time?
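The bandwidth argument can be checked with simple arithmetic. The sketch below makes several assumptions: the leased line is treated as 64 kbit/s and the modem as 28.8 kbit/s, the feed volume is taken from the "4GB in 5.5 days" figure with 4GB read as 4 × 10^9 bytes, and protocol overhead and compression are ignored. The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400

# Feed volume implied by "my 4GB disk is full of news in 5.5 days".
# Assumption: 4GB is read as 4 * 10**9 bytes.
FEED_BYTES_PER_DAY = 4 * 10**9 / 5.5

# Assumed raw link speeds in bits per second.
LEASED_LINE_BPS = 64_000
MODEM_BPS = 28_800


def link_bytes_per_day(bits_per_second):
    """Most a link can move in 24 hours, ignoring overhead and compression."""
    return bits_per_second / 8 * SECONDS_PER_DAY


def days_to_transfer(num_bytes, bits_per_second):
    """Days of continuous transfer needed to move num_bytes over the link."""
    return num_bytes / link_bytes_per_day(bits_per_second)


leased_capacity = link_bytes_per_day(LEASED_LINE_BPS)
shortfall = FEED_BYTES_PER_DAY - leased_capacity
backlog_days = days_to_transfer(2 * FEED_BYTES_PER_DAY, MODEM_BPS)

print(f"feed per day:        {FEED_BYTES_PER_DAY / 10**6:.0f} MB")
print(f"64 kbit/s per day:   {leased_capacity / 10**6:.0f} MB")
print(f"daily shortfall:     {shortfall / 10**6:.0f} MB")
print(f"2 days over 28.8:    {backlog_days:.1f} days")
```

On these assumptions a saturated 64 kbit/s line moves about 691 MB per day against roughly 727 MB of incoming news, and pulling a two-day backlog over a 28.8 modem would tie up the line for more than four and a half days, during which still more news arrives.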