Porting DOS Applications to Linux

Trying to port a DOS application to Linux? Alan Cox gives you hints and practical help.

With a little care, the average DOS application can be easily ported to the Linux system. This article looks at some of the techniques involved, and tries to provide a small “builder's kit” of handy little DOS routines people always want under Linux.

Memory Models

DOS programs written in the C and C++ languages generally run in a variety of different memory models with their own segmentation semantics. The simplest is the “tiny” model, where all of the program and data are referenced off one segment. All three segment registers (CS, DS, and SS) point to the same place to suit the way the processor wishes to work. The Linux kernel executes programs in the 32-bit equivalent of tiny mode. Because offsets are 32-bit, not 16-bit, a program can utilise 4GB of address space before segmentation becomes an issue. Thus you get the simplicity of tiny model without the limitations.

As a result, the DOS keywords near, far, and huge have no meaning under Linux. They can be removed or, if you are trying to maintain a common source tree, you can add these lines instead:

#if defined(__linux__)
#define far
#define near
#define huge
#define register
#endif

gcc, the normal Linux C compiler, understands the register keyword, but its optimiser allocates registers well enough that using register by hand is normally a bad idea, which is why the last define above removes it.

Many DOS C compilers support an inline keyword. gcc also supports this.

C Types Supported

gcc supports all the ANSI C types you would expect and some extensions. The size of the normal types is, however, different from that of DOS compilers, and frequently causes problems when porting. Here is a summary for sizes on Linux/i386 (Linux on other architectures, such as the 64-bit Alpha, will differ in some respects):

Type Name       Linux           DOS tiny/small  DOS large       DOS huge
char            8 bits          8 bits          8 bits          8 bits
short           16 bits         16 bits         16 bits         16 bits
int             32 bits         16 bits         16 bits         16 bits
long            32 bits         32 bits         32 bits         32 bits
pointer         32 bits         16 bits         32 bits         32 bits
largest array   4GB*            64KB            64KB            640KB

* Actually, because some of the address space is reserved and used for other things, you can't get above about 2GB at the moment.

DOS programmers generally make good use of prototypes to avoid mysterious crashes caused by passing the wrong type. Mixing short and long under Linux normally just results in mysterious value changes in passed parameters, so the habit of prototyping is a good one to keep. Furthermore, you can tell gcc to warn you about any routine declared or defined without argument types by adding the compiler flag -Wstrict-prototypes. All of the C library and system calls have prototypes, provided the correct header files are included.

gcc: the GNU C Compiler

The GNU C compiler is an extremely flexible tool. Although it compiles much slower than most of the DOS compilers, and is (intentionally) without an Integrated Development Environment, it has a wide range of abilities and flexibility that few DOS compilers can touch. People who have used DJGPP to write 32-bit DOS extender programs will be familiar with gcc, although in its Linux and Unix form it is somewhat easier to work with.

It is worth knowing how to tell gcc how to cope with different “flavours” of code. It can become a traditional K&R C compiler, by using the -traditional option, a strict ANSI compiler, by using the -ansi option, or a GNU C compiler (ANSI plus GNU extensions), which is the default. In addition, you can ask it to perform a wide range of sanity checks with the -pedantic and -Wall options. For a typical program, the compiler will generate a lot of warnings, many of which will give insights into potential problems. For example, the compiler will check that the conversion specifications in the format strings of printf()/scanf() and their family of functions match the types of the arguments they will interpret.

The optimiser is controllable both by a general level of optimisations, using the -O1 or -O2 options, and on a per-optimisation basis for those speed-critical special cases. The optimiser performs a wide range of peephole and global optimisations, including intelligent allocation of registers, loop unrolling, and even instruction scheduling on RISC CPUs.

The GNU C compiler, linker, and debugger are all described in complete documents available from the Free Software Foundation, which you can either buy as books (the money goes to fund more free software work) or print yourself.

To cover the compiler, debugging tools, make, and other programs in full would require several more articles. If the documentation and documentation viewer are all installed, typing info gdb, info gcc, and info ld should give you a good start. (If the info program is not installed, the Emacs editor can also be used to read documentation in the info format.) Fans of graphical user interfaces may also like to pick up tgdb as a graphical front end for the gdb debugger, and xwpe, a look-alike of a well-known DOS C development environment, built on top of gcc, make, and gdb.

Error in listing 2

The second call to signal in term_ctrlz should be a call to kill, i.e. kill( getpid(), SIGSTOP );