Letters to the Editor
I was a SPARC user until I encountered a hardware problem. I found it difficult to get service for the SPARC hardware; small site end users don't get much support from SUN, it seems.
The Linux/x86 world is just the opposite. There is a very competitive market offering a wide variety of options in all price/performance ranges. It creates a truly affordable Unix computing system.
I wish the Linux community would stay with x86 instead of making Linux just another piece of software on those expensive proprietary boxes.
—Siuki Chan siuki.chan@xilinx.com
When Unix was first written, it was for a PDP-7 (and written in assembly language). Subsequent ports to additional hardware helped turn Unix into the machine-independent system it is today. That 25-year history and machine independence have brought additional overhead with them.
Linux offers a fresh start. Porting Linux to other architectures can bring a new machine-independent operating system into general use. While many people today see the x86 as the right answer, I am sure people felt the same way about the PDP-11 (which was the second platform for Unix). Designing portability into Linux now means it will be much easier to get running on the hardware of the future, whether it be based on the SPARC, DEC Alpha, MIPS or something totally new.
—Haykel Ben Jemia haykel@cs.tu-berlin.de
I would like to say that I really like LJ and that through it I learn something new about Linux every month. In issue 25 (May 1996), I found the article Introduction to Gawk very interesting because I often use Gawk. But when trying things out, I noticed something wrong with the output of the FS statement. The article stated that
{
    FS=":"
    print $5
}
and
BEGIN {
    FS=":"
}
{
    print $5
}
are functionally identical, except that the first one is slower.
But when I executed the first program, the first line was blank. With the second program, everything was okay. So I asked on the Red Hat mailing list (because I use the Red Hat distribution) if someone could help me. Marc Ewing provided the real answer to the problem:
The line is split into fields before the rule is evaluated, so when the FS=":" is evaluated the first time, the line has already been split up, and either no fifth field exists, or in some situations the fields are wrong. So the two awk programs are not functionally identical; the first program is incorrect.
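The behaviour described here can be reproduced with a short shell sketch. It assumes a POSIX-conforming awk (gawk or mawk, for example) is installed as `awk`; the sample record and the variable names `record`, `in_rule` and `in_begin` are made up for illustration and do not come from the article.

```sh
# Hypothetical /etc/passwd-style record; the name field contains a space.
record='alice:x:1000:1000:Alice Example:/home/alice:/bin/sh'

# FS assigned inside the main rule: the first record has already been
# split on whitespace by the time the assignment runs.
in_rule=$(printf '%s\n%s\n' "$record" "$record" | awk '{ FS = ":"; print $5 }')

# FS assigned in BEGIN: every record, including the first, is split on ":".
in_begin=$(printf '%s\n%s\n' "$record" "$record" | awk 'BEGIN { FS = ":" } { print $5 }')

printf 'FS set in rule:\n%s\n' "$in_rule"
printf 'FS set in BEGIN:\n%s\n' "$in_begin"
```

Under those assumptions, the first program prints a blank line for the first record and the name only from the second record onward, while the second program prints the name for both records.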
I hope this information will be useful for someone.
That was tested before being put in the magazine, but the bug was hard to notice (and was missed) because the /etc/passwd file on the machine used to test the script ran the output off the top of the screen. Thanks for bringing this to our attention.
In Issue 25 of Linux Journal, the lcc compiler was reviewed and the FTP site for lcc was listed as ftp://ftp.cs.princeton.edu/pub/lcc/. However, I can only find a.out ports of lcc to Linux. Is anybody working on ELF support in lcc?
—Arthur D. Jerijian
The file ftp://ftp.cs.princeton.edu/pub/lcc/contrib/linux-elf.tar.gz is dated November 14th, 1995, so Linux ELF support for lcc has been around for a while.
To the editor:
I would like to comment on the article on gawk [An Introduction to Awk] by Ian Gordon in the May Linux Journal. Overall it is a nice introduction to the joys of awk programming, but I wish you had let me review it first.
There are a number of minor and not so minor errors in the article. In order of appearance:
1. Brian Kernighan wasn't one of the original designers of C; he “merely” wrote the book on it with Dennis Ritchie, who designed and implemented C. (Not to diminish his stature in any way; Brian is still a very important and seminal figure in the Unix and C world.)
2. The article says, “gawk also defines several special patterns which do not match any input at all, the most commonly used being BEGIN and END”. This is incorrect. Only BEGIN and END are defined in awk; there are no others.
3. The statement “If you try to refer to fields beyond NF, their value will be NULL”, if read literally, is misleading. The value is the null or empty string, often denoted "". Granted, most programmer types would understand the statement at face value; maybe I'm just being pedantic.
4. There is a major error in the part that describes using a colon as the field separator.
{
    FS = ":"
    print $5
}
In gawk, field splitting occurs using the value of FS at the time the record was read. Thus, $5 will already have been determined, based on the previous value of FS (presumably a space, " "). Unix versions of awk delay field splitting until a field is needed, and then split using whatever value of FS is current. This is incorrect, and the POSIX standard for awk mandates that field splitting happen the way gawk (and mawk, see below) do it. In fact, my book (cited in the RESOURCES sidebar, thanks!) describes this exact issue.
The correct way to get the desired behaviour is to set FS either using the -F option, or using an assignment inside the BEGIN block, as mentioned later.
5. Some typos: “does not contain a seven field” should be “seventh”, and “modifing” should be “modifying”. And a nit. Calling the Info file a “page” is misleading. When printed, the current documentation is over 330 pages...
6. When talking about variable initialization, the article says “... setting it to 0 for an integer or "" for an integer or a string, respectively.” Not quite. Variables are initialized to 0 for their numeric value and "" for their string value. All numbers in awk are maintained internally as C doubles. Numbers that look like integers are still stored as doubles. This can lead to confusion for C programmers:
x = 5 / 4 # x is now 1.25, not 1, no integer truncation
(I've been bitten by this one, myself!)
7. The discussion of the array “for” loop is incomplete.
for (i in theArray) print i
prints each index in theArray. To get both the indices and the corresponding values, you would need something like
for (i in theArray) print i, theArray[i]
8. A word about implementation speed and comparisons to Perl. There are three freely available awk implementations: the Bell Labs version, gawk, and mawk. Gawk is much much faster than the Bell Labs version. Mawk (ftp from oxy.edu), by Michael Brennan, is a very nice implementation that is (generally) even faster than gawk. Although I haven't done any timings, I'm willing to bet that an awk program run with mawk will give a comparable Perl program a really good run for its money, every time. Gawk's advantages over mawk are its additional features, its ports to more systems, and its comprehensive documentation. Mawk's advantages are its speed and rock solidness.
9. In the resource block, the title of the gawk documentation from the FSF is now The GNU Awk User's Guide, with just myself listed as the author. Because the manual changed significantly (it's now about double the previous size), we changed the title, and I am listed as the author because of all the new and heavily revised material in the guide. The title page does give credit where credit is due, saying “Based on The Gawk Manual, by Close, Robbins, Rubin and Stallman.”
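Several of the numbered points above (3, 4, 6 and 7) can be checked from a shell prompt. The following is a minimal sketch, assuming a POSIX-conforming awk such as gawk or mawk is installed as `awk`; the sample record, the array contents and the shell variable names are invented for illustration.

```sh
# Hypothetical /etc/passwd-style record used for the -F example.
record='alice:x:1000:1000:Alice Example:/home/alice:/bin/sh'

# Point 3: a field beyond NF is the empty string, so the comparison yields 1.
beyond_nf=$(printf 'a b c\n' | awk '{ print NF, ($5 == "") }')

# Point 4: the -F option sets FS before the first record is read.
with_dash_f=$(printf '%s\n' "$record" | awk -F: '{ print $5 }')

# Point 6: numbers are doubles, so division does not truncate.
quotient=$(awk 'BEGIN { x = 5 / 4; print x }')

# Point 7: print each index with its value. The order of a for-in loop is
# unspecified, so the output is piped through sort to make it stable.
pairs=$(awk 'BEGIN { a["x"] = 1; a["y"] = 2; for (i in a) print i, a[i] }' | sort)

printf '%s\n' "$beyond_nf" "$with_dash_f" "$quotient" "$pairs"
```

With gawk or mawk this should print `3 1`, `Alice Example`, `1.25`, and then the two index and value pairs, one per line.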
Finally, I would like to point out that many Linux distributions apparently don't yet have the latest version, 3.0.0; this should be gotten from a GNU mirror. There are a large number of nifty new features and bug fixes over the previous version, as well as the revised manual.
Please accept this note as constructive comments on an otherwise enjoyable article, one that I wish I had had time to write...
Thanks,
—Arnold Robbins gawk maintainer and documenter arnold@gnu.ai.mit.edu