The History of Compilers

Once upon a time, there was only machine language, and it was a tedious way to write code. It took inordinate amounts of time to do the simplest things. For example, just to set a value to 1 you had to type these easy-to-remember numbers:

A9 01
or
A9 01
8D 00 0C

In case you didn't recognize it, it is written in 6502 machine code, for the benefit of those as old as me :-). I still have a KIM-1 with the books and dug it up for fun today.

Then some genius came up with the idea of an assembler. It would (and still does) allow you to write this instead:
LDA #$01
or
LDA #$01
STA INDEX

where INDEX is the memory location 0C00. Yes, it was a little-endian processor, which is why the address shows up in the machine code as 00 0C. Some may remember the days when some processors were big-endian and some were little-endian. You can debate which you prefer. Then, later:
MOV AX,1
or
MOV INDEX,1

Some assemblers were written so that the destination in a move statement was on the right, and some so that it was on the left. You can also debate which you prefer, if it really matters to you :-)
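To come back to the 6502 example for a moment, here is a rough sketch in Python of what an assembler does with those two instructions. It is a toy, not a real assembler: the function name assemble and the SYMBOLS table are made up for this example, it only knows the two opcodes used above, and it assumes the usual 6502 notation #$01 for an immediate value.

```python
# Toy assembler for the two instructions used above (a sketch, not a real tool).
# SYMBOLS and assemble() are hypothetical names invented for this example.
SYMBOLS = {"INDEX": 0x0C00}

def assemble(lines):
    """Translate a few 6502 mnemonics into a list of byte values."""
    out = []
    for line in lines:
        mnemonic, operand = line.split()
        if mnemonic == "LDA" and operand.startswith("#$"):
            # LDA immediate: opcode A9 followed by the one-byte value
            out += [0xA9, int(operand[2:], 16)]
        elif mnemonic == "STA":
            # STA absolute: opcode 8D, then the address with the low byte first
            addr = SYMBOLS[operand]
            out += [0x8D, addr & 0xFF, addr >> 8]
        else:
            raise ValueError(f"unsupported instruction: {line}")
    return out

code = assemble(["LDA #$01", "STA INDEX"])
print(" ".join(f"{b:02X}" for b in code))  # A9 01 8D 00 0C
```

The low-byte-first order in the STA branch is the little-endian part: address 0C00 comes out as 00 0C.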

Then assemblers became nicer and someone developed a macro assembler. You could use simple string substitution (like doing a global find and replace in a text editor) to substitute abstractions into assembly code. And you could write:

if optimized
    index = AX
else
    index = @address
endif
mov index,1

Well, you couldn't really write exactly that (this is just an example), but those so inclined might show me how to do it in their favourite macro assembler :-)
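For those who prefer to see the idea run, the find-and-replace trick can be sketched in Python. This is only an illustration of plain text substitution, not how any particular macro assembler works; expand and macros_for are names I made up, and @address is the same placeholder as in the pseudo code above.

```python
import re

def macros_for(optimized):
    """Pick the definition of 'index', mirroring the if/else above."""
    return {"index": "AX"} if optimized else {"index": "@address"}

def expand(source, macros):
    """Replace whole-word macro names in the source text with their definitions."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, macros)) + r")\b")
    return pattern.sub(lambda m: macros[m.group(1)], source)

source = "mov index,1"
print(expand(source, macros_for(True)))   # mov AX,1
print(expand(source, macros_for(False)))  # mov @address,1
```

The same source line turns into a register move or a memory move depending on one switch, which is the whole point of the abstraction.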

Then people said we could write this instead:
index = 1

And a light came on, and then people (actually geeks) decided we could write a computer language. Some came up with some awful languages, which I won't mention, to be kind to those who still claim them. Some were quite useful, like BASIC, which is still fondly remembered by us older geeks. In BASIC, you could actually write:
INDEX = 1

This was about the time some geeks were shouting and some were not. The not shouting crowd invented C, which was a little odd because you had to shout it to say it!

In C, you could write:

int index = 1;

And you needed a semicolon at the end of each statement, which was done to make it easier for the compiler to “parse” your code. Funny that BASIC didn't need it, but then in the early BASICs you could only put one statement on each line.

Then somebody said, wouldn't it be cool if you could optimize things a lot by saying:
register int index = 1;

Because otherwise you didn't know whether the compiler was going to use memory or a CPU register. With “register”, you asked the compiler to apply the optimization given above in the assembler macro pseudo code. Those who were bright enough will have realized, back at that point, that my next story is about optimizing compilers, because in one case I used a register in the CPU and in the other case a memory location in, presumably, RAM. You can guess which one is faster :-)

So then we said, let's let the compiler figure out for itself whether “register” is needed or not. And optimizing compilers were born. This was a major step and it took some brilliance.

About the time optimizing compilers were being developed, other languages came and went: FORTRAN, LISP, APL, PL/1, PL/M, PASCAL, MODULA-2, SIMULA, Java. I have written some code in all of these languages at one time or another. There are many more than this, and I'm sure some of you have your favourites and ones you've hated.

Lately, scripting languages have been developed. I don't know the history of these languages as well as that of the “true” compilers mentioned above. I put “true” in quotes because most scripting languages have compilers themselves. They traditionally use a p-code (pseudo code) compiler, and an interpreter then executes the pseudo machine statements in a simulated box, like a virtual machine. Java is odd in this respect and sits apart as some kind of “switch hitter” between a scripting language and a compiled one. You may enjoy debating this. I am not an expert on Java.
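To make the “simulated box” a little more concrete, here is a tiny stack machine sketched in Python. The opcodes PUSH, ADD, STORE and LOAD and the function run are invented for this example and do not match the p-code of any real language; it only shows the general shape of a loop that executes pseudo machine statements.

```python
def run(program):
    """Execute a list of (opcode, argument) pairs on a tiny stack machine."""
    stack, variables = [], {}
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)              # put a constant on the stack
        elif op == "ADD":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)            # replace the top two values with their sum
        elif op == "STORE":
            variables[arg] = stack.pop()   # pop a value into a named variable
        elif op == "LOAD":
            stack.append(variables[arg])   # push a variable's current value
        else:
            raise ValueError(f"unknown opcode: {op}")
    return variables

# Made-up p-code for:  index = 1  followed by  index = index + 1
program = [
    ("PUSH", 1), ("STORE", "index"),
    ("LOAD", "index"), ("PUSH", 1), ("ADD", None), ("STORE", "index"),
]
print(run(program))  # {'index': 2}
```

The compiler half of a scripting language would produce a list like program from your source text, and a loop like this one would then run it.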

Scripting languages are, in my opinion, more advanced and more useful for writing most code than compiled languages. Only when you need highly optimized code should you use a compiler. I still use Bash all the time, but when I find myself reaching for awk it's time to use a more advanced scripting language. Before Bash I used C shell, for the first time on a VAX 11/750 running a nice BSD variant of UNIX (I think it was 4.2). This was when I first learned C. Then later I was forced to learn DOS Batch, which was horrible compared to C shell. Later again, I switched to Bash when I revisited UNIX on a Pentium 1 running Slackware. This was when I discovered the beautiful gcc, the GNU compiler, and fell in love with open source.

There are some good scripting languages and some bad. Tcl/Tk is cool because you can write portable GUIs in a scripting language, but it doesn't check your code syntax for mistakes until you run it (maybe it does now). Perl is bad. I do use it now and then, but only for short programs. I have seen some nicely coded Perl, but in the real world the vast majority of people who write it produce very hard to read code. My apologies to those who like Perl. Personally I like Python, and I have read a couple of books on Ruby and thought it looked nice too. Python is a beautifully designed language. It makes object oriented code easier to read than what I have seen in SIMULA, C++, Perl or Java.

Which brings me to my final thought: object oriented programming. It looks both great and confusing when you study it (I was getting older by the time I finally had to learn it and use it at work). I tried to learn it in SIMULA when I was in college, but I honestly cannot remember a thing about the class! Needless to say, I grudgingly learned object oriented programming and used it sparingly. I still find it difficult to read and can get lost in all the inheritance. I'm not saying it shouldn't be used; all I'm saying is that it should only be used for those tasks where it makes sense to model things with objects. Python is beautiful because you actually write object oriented code without having to think too much about classes unless you really do need to model something.

Some languages along the way have required that variables be “declared” and some do not. In Python I can write:
index = 1
In C, I have to write “int” in front of it to tell the compiler it's an integer and I also have to remember that pesky semicolon.

Personally, I don't like to declare variables. I find it a waste of my time. If the language is well designed, it should be able to figure out what I want from how the variable is used. Some have done a good job of this and some haven't. Some languages get all confused (or should I say the user did) and the compiler does something unexpected (from the user's perspective) because the user made a simple coding error. The user in this case being an average geek.
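Here is a small Python sketch of both sides of that coin: the language working out the type from usage, and a simple typo doing something the user probably did not intend. The misspelled name indx is deliberate and is only there for illustration.

```python
index = 1
print(type(index).__name__)  # int, worked out from the value

index = index + 0.5
print(type(index).__name__)  # float, the same name now holds a different type

# A simple typo: this quietly creates a new variable instead of updating index.
indx = index + 1
print(index, indx)  # 1.5 2.5
```

No declaration was needed anywhere, and nothing complained about the typo either.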

Well, that wraps up my History of Compilers. I hope you enjoyed reading it as much as I enjoyed writing it.