Computer Archeology

Recovering files from an Atari ST's hard drive.

The term "computer archeology" has two meanings. One meaning is the art of using computers in archeology. But that's only because they haven't found a computer in a dig--yet. Here, we're using the term in the way such science fiction writers as James P. Hogan and Vernor Vinge use it: the art of recovering data from defunct, possibly alien, computers.

This article explains how to recover data from an Atari ST hard drive, using PC hardware, Linux and a bit of care. The effort benefits from two design decisions the Atari ST designers made that show the benefits of open standards. First, the Atari ST can use a standard SCSI I hard drive with an Atari-specific host adapter. Second, the ST uses 12-bit FAT filesystems, so I did not have to develop a filesystem driver. Linux's open architecture, open-source nature and excellent native development tools would have made it the ideal platform for such a project.

The target hardware consists of a 234MB Quantum ProDrive LPS hard drive (large in those days) originally connected to the Atari by an ICD host adapter.

Recovering the Hard Drive

Recovering the data from the hard drive was easy. I already had installed a SCSI host adapter on a PC. All I needed to do was check the SCSI address of the hard drive, add it to the SCSI cable and provide power. Upon firing up Linux, I found the hard drive associated with a device file in /dev. As I already had two other SCSI drives on the computer, the Atari drive showed up as /dev/sdc. So far, so good. The first thing to do was read the master boot record to make sure I was accessing the correct drive. In UNIX, everything is a file, even memory (/proc/kcore), so I was able to get the data from the hard drive with a one-line command:



dd of=table if=/dev/sdc bs=512 count=1


Examining the file table in Emacs' hexl mode (Listing 1) convinced me I was in the right place. The first several bytes are 68,000 instructions which make an operating system call, not the Intel 808x CLI and branch instructions one would expect in a PC MBR. (See the LILO and/or GRUB documentation for the gory details of a PC MBR.) Listing 2 shows a partial Forth disassembly (in Forth's Reverse Polish notation) I later made of the MBR. Second, the word GEM, the name of the Atari graphical user interface, or GUI, shows up in the region where the partition table should be.

Convinced I was in the right place, I copied the entire hard drive to a file:



dd of=quantum if=/dev/sdc bs=512


That gave me a file image of the hard drive. Now I could shut down, remove the Quantum hard drive from the PC, put it back in its static-free bag and boot to Linux.

Accessing the Partitions

One problem with the Atari ST is that hard-drive partitioning, if done at all, had to be done by a driver specific to the host adapter. Unfortunately, this means that every host adapter provided a different partitioning scheme. Fortunately, back when this Atari ST was my active machine, Mike Yantis had reverse-engineered the ICD partition table and had written code in Forth to access the table. I had ported his code to my Atari, and it was in a file on the hard drive. So, I had to recover the partition table so I could read the file that has the code I need in order to recover the partition table. Great.

There is, however, a workaround. I tried reading the entire hard-drive image into that great programmers' Swiss Army Knife, Emacs, but Emacs balked at the size of the file. So I made a copy of the first 64MB:



dd if=quantum of=test bs=1024 count=61440


I loaded that into Emacs, switched to hexl mode (M-x hexl-mode) and searched for the word "partition". That didn't work. I tried the same thing with the next 64MB:



dd if=quantum of=test bs=1024 count=61440 skip=61440


That file contained the code I needed. I now knew the layout of the partition table: each entry was twelve bytes long. The first one, for C:, started at an offset of 454 (0x1C6) bytes. It was followed by three more. The entry for G: started at 342 (0x156), and it was followed by five more partitions. The ICD partition scheme could support up to 13 partitions, but I never used more than 10, so that was as far as my code went.

There are no extended or logical partitions here as there are in PC MBRs; thanks, Murphy. Whoever named them logical partitions had a vicious sense of humor; they aren't.

Each partition table entry consists of the following:

  • A byte of binary data. At a guess, bit 7 may indicate this partition is to be booted, while bit 0 says it has a valid filesystem.

  • Three characters of text indicating, possibly, the filesystem or OS that uses the partition. The Atari used the string "GEM". There is no zero termination in this field.

  • A four-byte value (in 68000 byte order, which also happens to be in network byte order) indicating the location of the first sector of the partition. As SCSI drives are addressed in a strictly linear fashion, a simple sector number is sufficient. As with IDE's LBA addressing mode, cylinders, heads and sector manipulations are internal to the drive.

  • Another four-byte value indicating the size in sectors of the partition.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Hatari

Anonymous's picture

Supposedly, Hatari (and maybe Steem, I think) can mount a complete HD image file as normal inside the emulator.

They can also mount local directories as 'virtual' hard drives on the emulated machine.. so by mounting the image and a local dir, you can copy files over, which is also a solution. (perhaps for users who don't have root access)

Hatari can read disk images

Michael Baltaks's picture

I can confirm that a SCSI drive imaged using dd worked fine with hatari, although the drive in question was not bootable, so I also needed a floppy image with the ahdi hard disk drivers.

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState