Quantum GIS: the Open-Source Geographic Information System

Exploring Quantum GIS (QGIS) using an example of real-estate planning.
How a GIS Formats Data: Vector vs. Raster

The hefty challenge for a GIS is to portray our lovely yet complex world accurately yet rapidly—and without the need for a cluster! There are two tricks, or methods, a GIS uses to create a digital representation of Earth's features on your desktop.

The first method is using vector data (the type used later in this article). As complicated as the world can be, a GIS can represent any geographical object using three geometric elements—namely points, lines and polygons. Small stuff like community centers and traffic lights can be portrayed as points. Features such as rivers and pipelines are really just glorified lines, so they can be shown as such. Finally, nearly everything else, such as a state park, though it might be oddly shaped, is finite and contained in boundaries, making it a polygon at the end of the day. Broadly speaking, the vector format is analogous to traditional maps, where the world is abstracted with symbology, and precision is very important.
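
As a rough illustration of the three vector elements, here is a minimal Python sketch. The class names, feature names and coordinates are made up for this example and are not part of QGIS or any GIS library; coordinates are plain (x, y) pairs in arbitrary map units.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coord = Tuple[float, float]  # (x, y) in arbitrary map units (assumption)


@dataclass
class Point:
    name: str
    coord: Coord


@dataclass
class Line:
    name: str
    coords: List[Coord]  # ordered vertices along the line

    def length(self) -> float:
        # Sum the straight-line distance between consecutive vertices.
        total = 0.0
        for (x1, y1), (x2, y2) in zip(self.coords, self.coords[1:]):
            total += ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        return total


@dataclass
class Polygon:
    name: str
    ring: List[Coord]  # boundary vertices; the last connects back to the first

    def area(self) -> float:
        # Shoelace formula for a simple (non-self-intersecting) polygon.
        twice_area = 0.0
        n = len(self.ring)
        for i in range(n):
            x1, y1 = self.ring[i]
            x2, y2 = self.ring[(i + 1) % n]
            twice_area += x1 * y2 - x2 * y1
        return abs(twice_area) / 2.0


# Hypothetical features, one per geometry type.
traffic_light = Point("traffic light", (2.0, 3.0))
river = Line("river", [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)])
park = Polygon("state park", [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)])

if __name__ == "__main__":
    print(traffic_light.name, traffic_light.coord)
    print(river.name, "length:", river.length())
    print(park.name, "area:", park.area())
```

Real vector formats also attach attribute data to each feature, and polygons may have holes; the sketch leaves both out for brevity.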

The second method is raster data. A raster divides an area into a regular grid of cells, each holding a single value. Raster data is used to portray characteristics of the Earth that have no distinct visual shape, including measurements like ocean depth, forest-cover type, elevation and annual rainfall. Some raster file formats you will encounter include GeoTIFFs, Erdas Imagine Images, GRASS AIGs and USGS Digital Elevation Models. Common examples of raster-based imagery are satellite images and aerial photos. In these two types of raster imagery, the value of each cell is a measurement of light that is reflected off the Earth's surface. Particular ranges of these values can signify specific land-cover or vegetation types.
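
The idea that ranges of cell values map to land-cover types can be sketched in a few lines of Python. The grid values, class breaks and labels below are invented for illustration only; real classification thresholds depend on the sensor and the dataset.

```python
# A hypothetical 3x4 raster: each cell holds one reflectance-like value (0-100).
grid = [
    [12, 40, 85, 90],
    [10, 45, 80, 95],
    [15, 50, 55, 88],
]

# Hypothetical class breaks: (exclusive upper bound, label).
CLASSES = [(30, "water"), (60, "cropland"), (101, "forest")]


def classify(value):
    """Return the label of the first class whose upper bound exceeds value."""
    for upper, label in CLASSES:
        if value < upper:
            return label
    raise ValueError(f"value {value} is outside the expected range")


def count_cells(rows):
    """Count how many cells fall into each class."""
    counts = {}
    for row in rows:
        for label in row:
            counts[label] = counts.get(label, 0) + 1
    return counts


# Classify every cell, keeping the grid layout.
classified = [[classify(v) for v in row] for row in grid]

if __name__ == "__main__":
    for row in classified:
        print(row)
    print(count_cells(classified))
```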

Vector-Based Data Formats in GIS

As you splash around in the world of GIS, you also will encounter a plethora of vector-based spatial file formats. If you have ever used the application ArcGIS from ESRI, you probably are familiar with geodatabases and coverages, two of the most common spatial file formats in proprietary GIS. Of these two more-advanced spatial data formats, QGIS can use coverages but not geodatabases. In addition, QGIS can use ESRI shapefiles, which are plentiful in on-line data repositories and have become a sort of standard, as they have been around a long time. In fact, shapefiles are the standard format for ESRI's ArcView, which is the company's previous generation of GIS applications. Essentially, a shapefile is a set of files with vector-based location and attribute data, which can be represented in a GIS application.
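
Because a shapefile is a set of files rather than a single file, a download that is missing one of them will not load properly. The sketch below checks for the three components that every shapefile needs (.shp for geometry, .shx for the index and .dbf for attributes). The function name and the example path are hypothetical, and the check is case-sensitive on Linux.

```python
from pathlib import Path

# Components every shapefile needs: geometry, index and attribute table.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path):
    """Return the required extensions that have no file next to shp_path."""
    base = Path(shp_path)
    return [
        ext for ext in REQUIRED_EXTENSIONS
        if not base.with_suffix(ext).exists()
    ]


if __name__ == "__main__":
    # "data/roads.shp" is a placeholder path; substitute a real shapefile.
    missing = missing_shapefile_parts("data/roads.shp")
    if missing:
        print("Missing components:", ", ".join(missing))
    else:
        print("All required components are present.")
```

Other optional companions, such as a .prj file describing the projection, often ship alongside these three but are not checked here.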

QGIS also supports some other file formats, such as MapInfo and PostGIS. PostGIS is especially interesting, as it is an open-source spatial database technology. PostGIS “spatially enables” the PostgreSQL server, allowing it to be used as a back-end spatial database for GIS. For those who are familiar with GIS technologies, this makes it similar to ESRI's SDE or Oracle's Spatial extension.

Some Hard-Core Cartography: Projections and Coordinate Systems

Two other important concepts critical to any cartographic endeavor are map projections and coordinate systems.

Remember the big, flat world map you had in your fourth-grade classroom? The one with Greenland bigger than Africa? That map is an ideal illustration of what happens when you depict a round object such as the Earth on a flat map. The method used to transfer the 3-D globe to a 2-D map is called a map projection.

In a GIS, you need to consider the projection, because any map you view or create is essentially flat like a paper map. Thus, the same concept applies to both situations.
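
To make the distortion concrete, here is a small Python sketch. The article does not say which projection the classroom map used; the sketch assumes the spherical Mercator projection, a common projection that shows this kind of high-latitude exaggeration. The function names and the Earth radius constant are choices made for this example.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius; spherical approximation


def mercator(lon_deg, lat_deg, radius=EARTH_RADIUS_M):
    """Project longitude/latitude in degrees to spherical Mercator x, y in meters."""
    lon = math.radians(lon_deg)
    lat = math.radians(lat_deg)
    x = radius * lon
    y = radius * math.log(math.tan(math.pi / 4 + lat / 2))
    return x, y


def area_exaggeration(lat_deg):
    """Factor by which Mercator inflates small areas at a given latitude.

    Lengths are stretched by 1/cos(latitude) in both directions,
    so areas are stretched by the square of that factor.
    """
    return 1.0 / math.cos(math.radians(lat_deg)) ** 2


if __name__ == "__main__":
    # 0 = equator; 72 is roughly the latitude of central Greenland.
    for lat in (0, 60, 72):
        print(f"latitude {lat}: areas drawn {area_exaggeration(lat):.1f}x too large")
```

At the equator the factor is 1, and at 60 degrees it is already about 4, which is why high-latitude landmasses look so much larger than equatorial ones on such a map.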

Just as important as the map projection is the coordinate system. A coordinate system is the Cartesian system of x and y axes that a GIS uses to define locations on a map. This is opposed to the latitude and longitude system that defines location on a sphere.

In larger projects, knowledge of projections and coordinate systems is very important, and if a mismatch exists among different parts of a project, life can get frustrating quickly. Fortunately, this project is simple enough to avoid much concern, as I am working at the county level and all my shapefiles come from the same data source. However, when working with larger areas and multiple data sources, it is important to be familiar with these concepts and standardize your projection and coordinate system project-wide.

Enough Theory, Let's Get Some Data!

At this point, we have enough GIS theory to understand what we're doing and start the real-estate planning project. The first step is to track down the requisite data.

This project involves finding a parcel of land in Washtenaw County, Michigan, where I can build a cluster of homes in a natural setting. I am looking for a suitable land parcel that was once a wetland but today is agricultural and suitable for conversion back to a wetland. The ideal site will be close to a river or lake, have good road access and be as close to the city of Ann Arbor as possible.

When you embark on a GIS-based project, it's wise to specify all of the elements you need, because each one likely will correspond to a layer you must acquire. Thus, for this project, we need layers that depict, respectively, land use, areas with potential for wetland restoration, roads and hydrography (rivers and lakes). In general, each layer is most commonly available as a shapefile, which QGIS can handle without a hitch.
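
The site requirements can be thought of as a filter followed by a ranking. The Python sketch below expresses that logic on made-up parcel records. The field names, distance thresholds and sample values are all hypothetical; in a real project these attributes would be derived from the layers listed above.

```python
from dataclasses import dataclass


@dataclass
class Parcel:
    parcel_id: str
    land_use: str             # from the land-use layer
    restorable_wetland: bool  # from the wetland-restoration layer
    km_to_water: float        # from the hydrography layer
    km_to_road: float         # from the roads layer
    km_to_ann_arbor: float


# Hypothetical thresholds for "close to water" and "good road access".
MAX_KM_TO_WATER = 1.0
MAX_KM_TO_ROAD = 0.5


def is_candidate(p):
    """Agricultural today, restorable to wetland, near water and a road."""
    return (
        p.land_use == "agricultural"
        and p.restorable_wetland
        and p.km_to_water <= MAX_KM_TO_WATER
        and p.km_to_road <= MAX_KM_TO_ROAD
    )


def rank_candidates(parcels):
    """Keep suitable parcels and sort them by distance to Ann Arbor."""
    return sorted(
        (p for p in parcels if is_candidate(p)),
        key=lambda p: p.km_to_ann_arbor,
    )


# Made-up sample records for illustration.
parcels = [
    Parcel("A", "agricultural", True, 0.4, 0.2, 12.0),
    Parcel("B", "forest", True, 0.1, 0.1, 5.0),
    Parcel("C", "agricultural", True, 0.8, 0.3, 7.5),
    Parcel("D", "agricultural", False, 0.2, 0.1, 3.0),
]

if __name__ == "__main__":
    for p in rank_candidates(parcels):
        print(p.parcel_id, p.km_to_ann_arbor)
```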

So where can I obtain these shapefiles? Fortunately, a plethora of excellent repositories of free, downloadable geospatial data exist. An excellent example is the public Michigan Geographic Data Library (MGDL), which offers a vast collection of vector- and raster-based data at the watershed, county and state levels. The available datasets include those I am looking for, as well as aerial photos of the entire state, federal census information, geology, soil types, public land ownership and topography. In the MGDL, the default format for vector-based data is the shapefile.

From the MGDL, I can download the following datasets at the county extent:

  • Michigan Geographic Framework Hydrography (lakes and rivers).

  • 1992 National Land Cover Dataset.

  • Michigan Geographic Framework Transportation (roads).

  • Potential Wetland Restoration.

______________________

James Gray is Products Editor for Linux Journal
