Keeping Up with Python: the 2.2 Release

Python 2.2 resolves some well known deficiencies of the language and introduces some new powerful constructs that are key strengths of other object-oriented languages.
Dictionaries

Dictionaries and files are two other Python data types that received the iteration makeover. A dictionary's iterator traverses its keys. The idiom “for eachKey in myDict.keys()” can be shortened to “for eachKey in myDict”, as shown in Listing 1.

Listing 1. Looping through a Dictionary

In addition, three new built-in dictionary methods have been introduced to define the iteration: myDict.iterkeys() (iterate through the keys), myDict.itervalues() (iterate through the values) and myDict.iteritems() (iterate through key/value pairs). Note that the “in” operator has been modified to check a dictionary's keys. This means the Boolean expression myDict.has_key(anyKey) can be simplified as “anyKey in myDict”.

Files

File objects produce an iterator that calls the readline() method. Thus, they loop through all lines of a text file, allowing the programmer to replace essentially “for eachLine in myFile.readlines()” with the more simplistic “or eachLine in myFile”:

>>> myFile = open('config-win.txt')
>>> for eachLine in myFile:
...     print eachLine,   # comma suppresses extra \n
...
[EditorWindow]
font-name: courier new
font-size: 10
>>> myFile.close()
Classes

You can also create custom iterators for your own classes. This allows you to avoid the hack of overloading the __getitem__() special class method. Overloading __getitem__() implies the user can ask for any subscript in any order. But some objects do not logically allow this. Using an iterator rather than overloading __getitem__() makes explicit what the user can or cannot do.

To add iteration to your classes, override the __iter__() special method to return itself (making the object its own iterator). Then override the next() method:

def __iter__(self):
    return self
def next(self):
    # return next item or raise StopIteration

We can tweak our code for a similar example. This time, we choose to return a random element from the sequence (Listing 2). This example demonstrates some unusual things we can do with custom class iterations. One is infinite iteration. Because we read the sequence nondestructively, we never run out of elements, so we never need to raise StopIteration.

Listing 2. Custom Class Iterations

In Listing 3, we create an iterator object using our class, but rather than iterating through one item at a time, we give the next() method an argument telling how many items to return.

Listing 3. Creating an Iterator Object Using Our Class

Now let's try it out:

>>> a = AnyIter(range(10))
>>> i = iter(a)
>>> for j in range(1,5):
>>>     print j, ':', i.next(j)
1 : [0]
2 : [1, 2]
3 : [3, 4, 5]
4 : [6, 7, 8, 9]
Mutable Objects and Iterators

Before we move on to generators, remember that interfering with mutable objects while you are iterating them is not a good idea. This was a problem before iterators appeared. One popular example of this is to loop through a list and remove items from it if certain criteria are met (or not):

for eachURL in allURLs:
    if not eachURL.startswith('http://'):
        allURLs.remove(eachURL)            # YIKES!!

All sequences are immutable except lists, so the danger occurs only there. A sequence's iterator only keeps track of the Nth element you are on, so if you change elements around during iteration, those updates will be reflected as you traverse through the items. If you run out, then StopIteration will be raised, but you can continue with the iteration if you add items to the end and resume, as shown in Listing 4.

Listing 4. Iteration Example

When iterating through keys of a dictionary, you must not modify the dictionary. Using a dictionary's keys() method is okay because keys() returns a list that is independent of the dictionary.

But iterators are tied much more intimately with the actual object and will not let us play that game anymore:

>>> myDict = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> for eachKey in myDict:
...   print eachKey, myDict[eachKey]
...   del myDict[eachKey]
...
a 1
Traceback (most recent call last):
  File "", line 1, in ?
RuntimeError: dictionary changed size during
              iteration

This will help prevent buggy code. For full details on iterators, see PEP 234.

______________________

Webcast
How to Build an Optimal Hadoop Cluster to Store and Maintain Unlimited Amounts of Data Using Microservers

Realizing the promise of Apache® Hadoop® requires the effective deployment of compute, memory, storage and networking to achieve optimal results. With its flexibility and multitude of options, it is easy to over or under provision the server infrastructure, resulting in poor performance and high TCO. Join us for an in depth, technical discussion with industry experts from leading Hadoop and server companies who will provide insights into the key considerations for designing and deploying an optimal Hadoop cluster.

Learn More

Sponsored by AMD

White Paper
Red Hat White Paper: Using an Open Source Framework to Catch the Bad Guy

Built-in forensics, incident response, and security with Red Hat Enterprise Linux 6

Every security policy provides guidance and requirements for ensuring adequate protection of information and data, as well as high-level technical and administrative security requirements for a system in a given environment. Traditionally, providing security for a system focuses on the confidentiality of the information on it. However, protecting the data integrity and system and data availability is just as important. For example, when processing United States intelligence information, there are three attributes that require protection: confidentiality, integrity, and availability.

Learn more about catching the bad guy in this free white paper.

Learn More

Sponsored by DLT Solutions