Archaeology and GIS—The Linux Way
Since the days of Heinrich Schliemann's search for Troy, archaeologists have been confronted with the dilemma of how to record the spatial characteristics of archaeological data, and once recorded, how to analyze those data. This spatial information has historically been recorded on paper maps of varying accuracy and scales. When researchers wanted to perform analyses of these data, they were required to spend hours, if not days, transposing this information to new paper maps and making arduous measurements by hand. This time-consuming process is nearing its end as researchers have moved to take advantage of geographic information systems (GIS) for spatial analysis. A GIS is best thought of as a dynamic database for spatial data. Fortunately for budget- and quality-conscious researchers, one of the oldest and most robust GIS packages, GRASS, is available for free on Linux.
In the late 1970s, the method of storing archaeological data began to slowly change from paper to digital. As access to affordable computers became more common, archaeologists could see the advantages of digitally storing their spatial data for analysis. During this era, several commercial vendors began to develop primitive GIS systems in an attempt to meet the needs of researchers for storing spatial data. Unfortunately, these products worked only with vector data, completely ignoring the utility of raster data for analysis. The Construction Engineering Research Laboratory (CERL) of the U.S. Army Corps of Engineers found this oversight in the commercial packages too great and decided the solution was to build their own UNIX-based GIS which would meld vector and raster analysis capabilities. Because this code was written with federal dollars, it was made freely available to the public. From this project, the GIS package Geographical Resource Analysis Support System, more commonly known by the acronym “GRASS”, was born. From its beginnings in 1982, GRASS has grown into a powerful GIS package that rivals expensive commercial packages, yet remains free to the public.
The McGregor Guided Missile Range, located on Ft. Bliss in western Texas and southern New Mexico, is just a bit smaller than the state of Delaware. As part of the army's long-standing commitment to effective management and research, Ft. Bliss hired the Archaeological Research Center (ARC) at the University of Texas-El Paso to do a 100% pedestrian survey of 180 square kilometers of the McGregor Range. Principal Investigator Tim Church and Ft. Bliss Lead Archaeologist Galen Burget realized that utilizing old-fashioned survey methods, which are primarily paper-based, would not be feasible for a survey of this size. Furthermore, the research team was keenly interested in utilizing spatial environmental and ecological information collected by other scientists at the post for a landscape analysis.
Archaeologists/computer jockeys Trevor Kludt and R. Joe Brandon were hired to help develop the most efficient method of dealing with the anticipated high volume of data, and to insure this data would be compatible with existing ecological data. R. Joe also directed the archaeological field crew during the project to monitor the feasibility of the data acquisition methodologies being developed. This hands-on approach allowed Trevor and R. Joe to tweak the methodology for collecting data so that all aspects of the project were integrated seamlessly.
The research team decided from the outset that a flexible and powerful GIS package was critical if we were to accomplish our goals. Considering that virtually no money was in the budget for additional hardware or software and the whole show would have to be run on a single mid-level Pentium PC (Gateway P-133 with 2GB hard disk and 32MB memory), we were in a bind. After considering a number of commercial GIS packages, we determined that GRASS running under Linux was the answer. GRASS offered the full suite of sophisticated GIS functions this project demanded, while Linux provided us with a wide range of tools to tackle the myriad customizations we knew were inevitable. Once the decision was made, we had Linux (Red Hat 4.0) up and running in no time, then downloaded the compiled binaries of GRASS (v4.2). This was all done for about what the field crews would spend on refreshments after a day surveying in the desert heat.
Knowing from the outset that our field data were to be integrated with raster (cell)-based environmental and ecological data, the research team decided to develop a raster-based field methodology for the survey. At first blush, this decision seemed very practical. In practice, it turned out to be the biggest challenge of the project. Traditional field methods focus primarily on defining the boundaries of archaeological concentrations (sites) and are therefore vector-based. The methods we were developing, focusing on the distribution of materials within the grid units or “cells”, were raster-based. How could we put these individual raster units together to define our sites?
For speed, flexibility and consistency, we decided to develop computer routines to do this monotonous work for us. As we needed these routines quickly, initial prototyping was done using Qbasic while Linux and GRASS were being set up. Eventually, we translated these routines into a comprehensive Tcl script suite and a number of Bash shell scripts. These scripts allowed us to automate the importation of the field-collected data, dynamically define site boundaries, provide reports on site inventories, import data into GRASS, run a suite of spatial tests and comparisons, and output the finished site and project maps. Without the power and stability of Linux and the interoperability of GRASS, these tasks would have taken a significant amount of time.
Fieldwork proceeded rapidly. Each of the 180 square kilometer survey areas was divided into a grid of 16m2 “cells” which matched the size of the analysis raster cells within GRASS. Figure 1 shows the results of a single km2 survey unit, displaying artifact distribution, site boundaries and a 100-meter grid. The site boundaries were created using the GRASS module r.to.vect. The map output was created in a script file that can be sent to the screen or the printer. The size of the cells, 16m on a side, was a trade-off between the desire for precision (smaller units) and efficiency (larger units). Field crews would traverse these survey areas using 1:3,000-scale aerial photographs with the 16m2 grid superimposed on the photo. The detail of these air photos allowed the crew chief to navigate to near sub-meter precision quickly and consistently. As the crews walked through the survey areas, crew members would call out when they had spotted an artifact or feature, and the crew chief would then call back the corresponding row and column number for that “cell”. This information was then noted on a data recording form. In the lab, this information was checked for accuracy and errors, entered into a DOS-based data file (on a 386 with 40MB hard disk and 2MB memory salvaged from the scrap heap), then migrated to the Linux environment.
As the project unfolded, it became clear that the sheer volume of incoming data would give us no rest. Each day, the six field crews were covering thousands of cells on the ground and sending in mountains of data. Figure 2 shows the display results generated by GRASS for a small portion of the study area. This displays the variety of survey units, roads, site boundaries, artifacts and features. The map output was created in a script file that can be sent to the screen or the printer. Since development of the data processing routines coincided with the start of fieldwork, we quickly developed a backlog of data to process. And since each of the 180 survey areas was treated the same, we counted on the power of scripting redundant tasks to increase our efficiency in keeping up with the data stream. Once our routines were tested, debugged and on-line, we caught up quickly and were able to keep pace with the field crews. In the end, we accomplished what had at times seemed impossible. We are convinced the powerful scripting and developmental tools available under Linux, coupled with the sophisticated GIS routines available under GRASS, enabled us to successfully tackle the difficult and interesting challenges encountered during our project. Equally important, we were able to complete this project on time and under budget while creating one of the most dynamic spatial archaeological databases anywhere. In retrospect, we realized that even with a greater budget, we could not have fared better than we did utilizing the strengths of Linux and GRASS.
|Containers—Not Virtual Machines—Are the Future Cloud||Jun 17, 2013|
|Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer||Jun 12, 2013|
|Weechat, Irssi's Little Brother||Jun 11, 2013|
|One Tail Just Isn't Enough||Jun 07, 2013|
|Introduction to MapReduce with Hadoop on Linux||Jun 05, 2013|
|Android's Limits||Jun 04, 2013|
- Containers—Not Virtual Machines—Are the Future Cloud
- Lock-Free Multi-Producer Multi-Consumer Queue on Ring Buffer
- Linux Systems Administrator
- Introduction to MapReduce with Hadoop on Linux
- Senior Perl Developer
- Technical Support Rep
- Weechat, Irssi's Little Brother
- UX Designer
- One Tail Just Isn't Enough
- Android's Limits
- Free is costly
1 hour 8 min ago
- Bought photoshop CS5 for developing a website :(
1 hour 25 min ago
- Reply to comment | Linux Journal
2 hours 12 min ago
- Reply to comment | Linux Journal
2 hours 13 min ago
- Replica Watches
4 hours 38 min ago
- Reply to comment | Linux Journal
8 hours 48 min ago
- on the path to understanding
8 hours 52 min ago
- As a fisher,we know that a
1 day 4 hours ago
- All I Say Is Worth Share!
1 day 5 hours ago
1 day 5 hours ago
Free Webinar: Hadoop
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers
Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.
Some of key questions to be discussed are:
- What is the “typical” Hadoop cluster and what should be installed on the different machine types?
- Why should you consider the typical workload patterns when making your hardware decisions?
- Are all microservers created equal for Hadoop deployments?
- How do I plan for expansion if I require more compute, memory, storage or networking?