Somebody Still Uses Assembly Language?
In the core program for our computer science curricula we offer two assembly language courses as elements in that part of our sequence providing hardware emphasis. Although the students do learn to program in this arcane language, the emphasis is on using assembly language as a detective's tool to learn about the underlying hardware.
Both courses involve the omnipresent Intel 80x86 architecture. However, the first course treats the chip as an 8086/88 and works within the MS-DOS environment. Insofar as is practical within the existing time constraints, we pretend that MS-DOS is not present and try to simulate an embedded systems environment. The essential fact is that MS-DOS puts us in charge of the system resources, i.e., in real address mode. This first course is a prerequisite for our subsequent hardware courses.
The focus of the second course is to examine the architecture elements that support a multitasking, multiuser operating system. For this course we have chosen Linux as the environment. This second course is a prerequisite for subsequent hardware courses as well as for our operating systems sequence. Linux is typically used in the latter. Of course, in such a sophisticated multitasking, multiuser system we no longer have direct control over the hardware resources. It is of central interest to see how the operating system protects itself.
This article discusses the second course. The intended audience consists of those who have an interest in the features of the 80x86 (x >= 3) Intel architecture that support an operating system such as Linux. The two techniques we use for investigating are as follows:
Write our own assembly language code to probe the architecture.
Examine assembly language written by others.
These two approaches are discussed in their respective sections later in this article. This article is not an attempt to investigate the Intel architecture (a subject for a large volume), but to describe the tools and resources available to do so.
Virtually all textbooks on the Intel 80x86 architecture assume that the reader is working in a Microsoft environment, usually with the Microsoft Assembler, MASM. Because we are working in a Linux environment, we do not use such traditional textbooks; instead we use as the primary resource the Intel486 Processor Family: Programmer's Reference Manual (1995), Intel Order Number 240486-003. This is a large manual and of special interest are Parts I and II dealing with application and system programming, respectively. Other useful resources are the on-line Kernel Hacker's Guide (see http://www.ssc.com/linux/ldp.html), Brennan's Guide to Inline Assembly (see http://www.rt66.com/-brennan/djgpp/djgpp_asm.html), and the various man pages and info documents available within Linux itself. Using such a set of resources rather than a focused textbook is, of course, typically how a real world software engineer operates.
Linux is a natural choice for rather obvious reasons:
It is free.
It includes a complete set of development and detective tools.
The source code is available.
It is an evolving multitasking, multiuser environment making use of the advanced features of the underlying chip architecture.
I recommend Debian GNU/Linux to our students because:
It is quite stable.
It can be updated/upgraded nondestructively, in place.
Various libraries are in the standard locations.
It is non-commercial, so students can get more seriously involved with maintenance and development later in the curriculum.
The Debian users and developers are extremely responsive and helpful.
Other distributions such as Red Hat or Craftworks also meet most of these requirements quite well, except for the fourth item (being non-commercial), which is important for our students but perhaps not to others.
We have found it convenient and productive to write our assembly language in-line within C source code. Labels can be interjected in the source code at appropriate places to provide breakpoints for the debugger. The primary motivation for writing in-line assembly language is to examine architectural features. The assembly language statements are AT&T style rather than Intel style. The former seems to be the Unix custom.
As a simple example, we'll exhibit a short program, example1.c (see Listing 1), whose purpose is to examine the flags register, which has three types of flag bits: status flags (e.g., the Carry Flag), system flags (e.g., the two-bit combination giving the I/O Privilege Level), and a control flag, the Direction Flag. The program does the following:
Puts a copy of the flags register in the eax register for examination (breakpoint bp1).
Flips all the bits in that copy (breakpoint bp2).
Attempts to write that bit-flipped copy into the flags register and then puts a copy of the resulting flags register into eax for examination (breakpoint bp3).
Note how in-line assembly language is supported by the asm keyword, a GCC extension.
To compile this into the executable program example1.x, containing necessary information for subsequent use by the debugger, we use the -g switch in the following command:
gcc -g example1.c -o example1.x
The next step is to invoke the debugger. It is convenient to also get a log of the debugger activities via a pipe to the tee command so the command line entry would be:
gdb -silent example1.x | tee example1.log
yielding the gdb prompt
(gdb)
Now gdb is ready to run example1.x, while tee will produce a record of our activity in example1.log. The latter can be printed or examined with an editor.
It is beyond the scope of this article to also be a tutorial on the use of gdb; such documentation is readily available in man page and info format. In addition, for use within a browser, one can find, in html format, the FSF document Debugging with gdb by Stallman and Pesch. One current URL for this is: http://funnelweb.utcc.utk/~harp/gnu/tars.
It might be more efficient to first look at the terse, readable introduction to gdb given in Getting to Know gdb by Loukides and Oram in Issue 29 of Linux Journal (September 1996).
Having said that, let's at least show a typical example1.log (see Listing 2) which shows setting breakpoints and then stopping at those breakpoints to examine registers of interest. Lines starting with the (gdb) prompt are commands entered by the user, whereas everything else is information volunteered by the debugger.
The log file tells the following:
The original value of the flags register was 0x246.
Our attempt to flip all the bits and write the flipped value back to the flags register was only partially successful, and that attempt generated an exception (signal SIGTRAP).
The investigator might go through a questioning process rather like this:
What does the original value of the flags register mean in terms of individual bits (e.g., what is the I/O Privilege Level)?
Which instruction generated an exception and why?
Which bits could be flipped and which could not? Why?
Interesting facts are then uncovered. For example, in the log file shown, the ID flag (bit 21) was successfully flipped. According to the Intel documentation, this indicates that this processor can execute the CPUID instruction. On the other hand, the bits giving the I/O Privilege Level (bits 12 and 13) could not be modified. That is to be expected: the casual user should not be able to change anything that might help get at the I/O hardware directly.