Linux Programming Hints
For several good reasons, the Linux standard libary implements standard I/O (stdio) in a somewhat strange way. Unfortunately, many programs make unwarranted assumptions about how stdio is implemented that cause the programs not to compile properly under Linux. I have alluded to this problem before in this column; this month I will explain how to fix such source code to compile under any operating system, including Linux.
by Michael K. Johnson
Linux stdio is not exactly non-standard; that would imply that there is a real standard for how standard I/O is supposed to be implemented. Theoretically, all I/O operations that use the stdio library should only use the “published” FILE mechanisms, which are abstract, and should not pay any attention to details. Unfortunately, many stdio implementations are rather slow, and do not provide functionality that programs need.
Instead of writing a working replacement for stdio, many programmers chose to abuse the stdio interface by directly accessing “private” members of the FILE structure that are not guaranteed to be the same from system to system. In practice, this worked very well from system to system, because almost all the systems came from the same source and a prototype using the same variable names was widely available.
Programmers learned, for instance, that the _cnt member of the FILE structure contained the number of bytes which had been read by the library but not yet read by the application, and that the _ptr member contained a pointer to the buffer in which the characters that had been pre-fetched by the library were stored. It was general knowlege that behind the scenes, the _filbuf() (sometimes called _ _filbuf() ) macro was called to cause the stdio library to read more characters.
This worked as long as everyone used similar stdio implementations. Many well-respected applications used these methods to get around stdio; GNU emacs and the Rand MH mail handler are among them.
Linux is different.
The Linux stdio is based on the GNU libg++ iostream I/O. The FILE structure looks, in part, like this (from libio.h):
int _flags; /* High-order word is _IO_MAGIC; rest is flags. */ #define _IO_file_flags _flags char* _IO_read_ptr; /* Current read pointer */ char* _IO_read_end; /* End of get area. */ char* _IO_read_base; /* Start of putback+get area. */ char* _IO_write_base; /* Start of put area. */ char* _IO_write_ptr; /* Current put pointer. */ char* _IO_write_end; /* End of put area. */ char* _IO_buf_base; /* Start of reserve area. */ char* _IO_buf_end; /* End of reserve area. */
This isn't at all the same. It is better optimized: instead of having one _ptr element, it has one pointer for reading, and one for writing, and a buffer for each as well. Instead of keeping track of the number of characters in the buffer, a pointer to the end of each buffer is kept, as well as the curernt pointer. It makes it easier to use all sorts of things as files, including shared memory, SYSV IPC, and anything else that fits the paradigm; it is dynamically extensible. It is also shared between C++ and C, and makes the C++ iostream implementation more robust because of the extra testing it gets as a standard io package.
If you have worked with the Linux or GNU C libraries in the past, you will notice that the names have changed. They used to be shorter names like _pbase and _pptr that looked like they were related to the old stdio names. In November 1993, the names were changed to what you see above. It is not anticipated that these will change again in the foreseeable future. See the sidebar “Old Names to New” for a listing of how the names changed.
Replacing direct access to the members of the FILE structure with abstract macros can make it possible to compile offending source on any system. Since the Linux stdio makes a distinction between reading and writing, the first thing to determine is whether each code fragment is reading or writing. Then you replace the direct use of the members of the FILE structure with macros; ones that are specific to reading an writing. Finally, you write the macros; one set for Linux, and one set for “standard” stdio. Here are some of mine:
Under Linux or other similar stdio implementation:
#ifdef _STDIO_USES_IOSTREAM /* defined in libio.h */ #define FWptr(f) ((f)->_IO_write_ptr) #define FRptr(f) ((f)->_IO_read_ptr) #define Fptr(f) (((f)->_IO_file_flags && \ _IO_CURRENTLY_PUTTING) ? \ FWptr(f) : \ FRptr(f)) #define FWcnt(f) (((f)->_IO_write_end - \ (f)->_IO-write_ptr) > 0 ? 0 : \ (f)->_IO_write_end - (f)-> \ _IO_write_ctr) #define FRcnt(f) (((f)->_IO_read_end - \ (f)->_IO_read_ptr) > 0 ? 0 : \ (f)->_IO_read_end - (f)-> \ _IO_read_ctr) #define Fcnt(f) (((f)->_IO_file_flags && \ _IO_CURRENTLY_PUTTING) ? \ FWcnt(f) : \ FRcnt(f)) #define Ffill(f) __underflow(f) #define Fflsh(f) __overflow(f)
Under “standard” stdio:
#else /* standard stdio */ #define Fptr(f) ((f)->_ptr) #define FWptr(f) Fptr(f) #define FRptr(f) Fptr(f) #define Fcnt(f) ((f)->_cnt) #define FWcnt(f) Fcnt(f) #define FRcnt(f) Fcnt(f) #define Ffill(f) _filbuf(f) #define Fflsh(f) _flsbuf(f) #endif
Note that some code may use f->_cnt as an lvalue (a variable to which something is assigned). In these cases, f->_ptr will always also be assigned to; both need to be updated at the same time in the standard stdio library. Since the “count” values in these abstraction macros are calculations for iostream-based stdio, they cannot be lvalues. However, since they depend on the “pointer” values and the “end” values, and the “pointer” values are updated, and the end" values don't change, they do not need to have the updated values assigned to them. Therefore,
f->_ptr++; f->_cnt; ;
becomes (assuming that the code is reading using this pointer):
FRptr(f)++; #ifndef _STDIO_USES_IOSTREAM FRcnt(f); ; #endif
Fptr(f)++; #ifndef _STDIO_USES_IOSTREAM Fcnt(f); ; #endif
if you are not sure whether the code is reading or writing.
I will warn you: trying to apply these instructions and macros to code you are porting without understanding the code you are working on is likely to be disastrous. Your application may compile, but quietly lose data if you put the wrong macros in. It is most important not to use the FR*() macros when the library is writing, and not to use the FW*() macros when the library is reading. If you can't tell which is being done, you are far better off using the generic versions, Fptr() and Fcnt(), than you are guessing.
There are other mistakes waiting to be made, and I can't cover them all, because I don't know what they all are. The source code to the Linux C libary is available via ftp from tsx-11.mit.edu and sunsite.unc.edu, and is distributed with many Linux distributions. Reading the libc source code (usually found in /usr/src/libc-linux/libio/), and understanding what it is doing, is the safest route to knowing what to do when porting code that makes assump-tions about stdio. This article alone can only help you along your way; you will still have to understand the program you are porting and the Linux stdio to achieve success.
Michael K. Johnson may be reached by e-mail at: (email@example.com)
Practical Task Scheduling Deployment
July 20, 2016 12:00 pm CDT
One of the best things about the UNIX environment (aside from being stable and efficient) is the vast array of software tools available to help you do your job. Traditionally, a UNIX tool does only one thing, but does that one thing very well. For example, grep is very easy to use and can search vast amounts of data quickly. The find tool can find a particular file or files based on all kinds of criteria. It's pretty easy to string these tools together to build even more powerful tools, such as a tool that finds all of the .log files in the /home directory and searches each one for a particular entry. This erector-set mentality allows UNIX system administrators to seem to always have the right tool for the job.
Cron traditionally has been considered another such a tool for job scheduling, but is it enough? This webinar considers that very question. The first part builds on a previous Geek Guide, Beyond Cron, and briefly describes how to know when it might be time to consider upgrading your job scheduling infrastructure. The second part presents an actual planning and implementation framework.
Join Linux Journal's Mike Diehl and Pat Cameron of Help Systems.
Free to Linux Journal readers.Register Now!
- SUSE LLC's SUSE Manager
- My +1 Sword of Productivity
- Managing Linux Using Puppet
- Murat Yener and Onur Dundar's Expert Android Studio (Wrox)
- Non-Linux FOSS: Caffeine!
- Doing for User Space What We Did for Kernel Space
- Google's SwiftShader Released
- SuperTuxKart 0.9.2 Released
- Parsing an RSS News Feed with a Bash Script
- LiveCode Ltd.'s LiveCode
With all the industry talk about the benefits of Linux on Power and all the performance advantages offered by its open architecture, you may be considering a move in that direction. If you are thinking about analytics, big data and cloud computing, you would be right to evaluate Power. The idea of using commodity x86 hardware and replacing it every three years is an outdated cost model. It doesn’t consider the total cost of ownership, and it doesn’t consider the advantage of real processing power, high-availability and multithreading like a demon.
This ebook takes a look at some of the practical applications of the Linux on Power platform and ways you might bring all the performance power of this open architecture to bear for your organization. There are no smoke and mirrors here—just hard, cold, empirical evidence provided by independent sources. I also consider some innovative ways Linux on Power will be used in the future.Get the Guide