Pagesat High Speed News
This article discusses several aspects of my company's news system, namely the principles of operation and the hardware and software used, in order to guide the novice news seeker through a successful implementation of Pagesat's High Speed News feed. Our “wish” list for a news system included:
A low-cost solution that would not impact local network traffic or chew up our Internet bandwidth
A system that could be monitored and modified remotely
A system that could support more than just a few simultaneous news readers
A system that worked now
The news system that we have now set up comprises four major components:
Pagesat High Speed News antenna and receiver
News “receiver” machine connected to high-speed receiver, Ethernet and SLIP attached
Master news machine, Ethernet attached
Slave news machine, Ethernet attached
All three news machines run Slackware Linux 1.2.8 as the operating system. INN 1.4 processes the news feed. We are currently using the “hsdist2.0b.tar” software to decode data from the receiver. This software contains “Forward Error Correction” code, which eliminates or drastically reduces data loss caused by less-than-perfect satellite reception.
We have three Intel-based machines. The receiver machine is a 486-33 with 8MB RAM and a 500MB IDE hard disk. The master news machine is a little beefier: a P133 with 32MB RAM, a 1GB SCSI disk for the OS, a 4GB fast SCSI-II disk to hold the news hierarchy and data files, and another 4GB fast SCSI-II disk for INN and other toys. The relationship between news and the disk subsystem is simple: both should be big and fast. At the current rate, starting from scratch, my 4GB disk is full of news in 5.5 days. We use the Buslogic BT-456-C PCI SCSI controller because it is twice as fast as the 16-bit Adaptec 1542C; having the Buslogic helps in particular with the daily expire, which went from more than three hours to a mere 35 minutes. All the machines are Ethernet-attached, and one has a modem for remote control.

The slave machine is just that: a slave. It is identical to the master in hardware configuration and simply receives everything the master “feeds” it, which is everything. Why do we have it? In a pinch it can be configured as the master, just in case some disaster strikes the real master, and it can also serve as a primary or secondary news-reader machine for all our clients.
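As a back-of-the-envelope check, the fill rate implied by those figures works out in a few lines of shell. The numbers come from the article; the script itself is purely illustrative:

```shell
#!/bin/sh
# Rough feed-volume arithmetic (illustrative only): a 4GB news spool
# that fills in 5.5 days implies the daily intake below. POSIX shell
# has no floating point, so scale by 10 to handle the half day.
DISK_MB=4096          # 4GB news spool
DAYS_TO_FULL_X10=55   # 5.5 days, times 10
MB_PER_DAY=$(( DISK_MB * 10 / DAYS_TO_FULL_X10 ))
echo "Approximate intake: ${MB_PER_DAY}MB of news per day"
```

That is roughly three-quarters of a gigabyte of news every day, which is why both spool disks need to be big and fast.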
The number one rule is to keep operation straightforward and simple. Cron-managed batch jobs were our choice: I can't write C code, but I can write simple shell scripts. I wanted a little more monitoring capability, so I added extra processing to the Pagesat software.

We accumulate news in half-hour bursts, then process it into the news system on the hour and the half hour. On average, it takes fifteen minutes to process the previous half hour's data. At 15 and 45 minutes past the hour, we “feed” the slave, sending everything we have received up to that point. A nightly expire on the master and slave gets rid of old news and prepares for the next day.

The receiver machine runs the PSFRX and PSNEWS programs to receive the data and process it into .gz data files. These files are stored on an NFS-mounted disk that the master can read and write. At the specified intervals, the master copies the files onto its own disks, deletes them from the receiver's disk, and then processes the news into the system. With this configuration we can take the master down for whatever reason and continue to accumulate news on the receiver, processing it whenever the master comes back on-line.
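The half-hourly batch job described above can be sketched as a simple shell script of the kind the article mentions. All names, paths and crontab times here are my own assumptions, not the author's actual scripts; the only INN interface used is rnews, which reads a news batch on standard input:

```shell
#!/bin/sh
# Sketch of the master's half-hourly batch job (hypothetical names
# and paths). A crontab on the master might schedule it like this:
#   0,30 * * * *  /usr/local/news/bin/batch-news.sh    # pull + process
#   15,45 * * * * /usr/local/news/bin/feed-slave.sh    # feed the slave
#   0 4 * * *     /usr/local/news/bin/news.daily       # nightly expire

SPOOL=${SPOOL:-/var/spool/psnews}   # NFS mount written by the receiver
WORK=${WORK:-/news/incoming}        # local disk on the master

# Copy each accumulated .gz batch onto local disk, then delete it from
# the receiver's disk so the receiver can keep accumulating even while
# the master is down.
fetch_batches() {
    for f in "$SPOOL"/*.gz; do
        [ -f "$f" ] || continue
        cp "$f" "$WORK/" && rm -f "$f"
    done
}

# Unpack each local batch and hand it to INN via rnews; remove the
# batch only after rnews accepts it.
process_batches() {
    for f in "$WORK"/*.gz; do
        [ -f "$f" ] || continue
        gunzip -c "$f" | rnews && rm -f "$f"
    done
}

fetch_batches
process_batches
```

Deleting a batch only after its processing step succeeds is what makes the design safe to interrupt: a failed run simply leaves the files where they are for the next cron cycle.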
Why did we do it this way? Three reasons: it's dirt cheap, it's efficient, it works. Add it up: the total cost for software is ZERO. Hardware costs are minimal, because PCs are far less expensive than workstations and disk drive prices keep dropping every day. And RAM is really cheap these days: 16MB for $100 or less.
Have you priced a leased line lately? Expect around $200 a month for a 64Kbps circuit to pipe your news to you, a circuit that can't even support a full feed. Want to slurp two or more days of old news from your provider over your 28.8Kbps modem AND try to surf AND do anything else at the same time?
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can locate a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as one that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality means UNIX system administrators always seem to have the right tool for the job.
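The find-plus-grep combination described above takes only one line of shell. The function name and search pattern here are illustrative, not a standard utility:

```shell
#!/bin/sh
# Compose find and grep into the tool described above: locate every
# .log file under a directory and print matching lines, prefixed with
# the filename. "search_logs" is a made-up name for this sketch.
search_logs() {
    find "$1" -type f -name '*.log' -exec grep -H "$2" {} +
}

# Example: search every .log file under /home for "disk full".
# search_logs /home 'disk full'
```

Using `-exec ... {} +` hands many filenames to a single grep invocation, which is both faster and safer with odd filenames than piping find's output through xargs.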
Cron traditionally has been considered another such tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to recognize when it might be time to upgrade your job-scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers. Register Now!
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn't consider the total cost of ownership, and it doesn't consider the advantages of real processing power, high availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here, just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future. Get the Guide