Simplified Exception Identification in Python
One of the features that makes Python a great programming language is exceptions for error handling. Exceptions are convenient in many ways for handling errors and special conditions in a program. But, if several kinds of exceptions occur in short sections of code and in several parts of a program, it quickly becomes tedious and error-causing to recode the "except:" chunks of the code. Error recovery, especially, is a place where you want to have well-tested, clear and simple chunks of code. This article suggests an approach that helps users achieve this goal.
A second factor that makes Python so flexible is it does not reinvent the wheel when its API interfaces with the operating system. Any C or C++ programmer familiar with standard *NIX system calls and libraries can leverage his/her previous knowledge when moving into Python application development. On the other hand, the Python socket module generates exceptions that differ from one platform to another. As an example, the Connection refused error exception is assigned the number 111 under Linux, but it is 10061 in another popular operating system. Again, it quickly becomes boring to code multiple except: clauses for use on many different platforms.
Always searching for a better and easier way to do things, let's look now at how we can identify and categorize exceptions in Python in order to simplify error recovery. In addition, let's try to do this in a way that can be applied to multiple operating systems.
There is more than one generic way to discover the identity of an exception. First, you need to code a catch-all except: clause, like this:
try: ...some statements here... except: ...exception handling...
In the exception handling code, we want to have as few lines of code as possible. Also, it is highly desirable to funnel both normal exceptions (disconnect, connection refused) alongside the weirder ones (Attribute error!), running both through a single execution path. You will, of course, need more statements to do the precise action required by the condition. But, if you can do the first four or five steps in a generic way, it will making testing things later as easier task.
In our example, Python offers two ways to access the exception information. For both, the Python script first must have import sys before the try: .. except: portion of the code. With the first method, the function sys.exc_type gives the name of the exception, and sys.exc_value gives more details about the exception. For example, in a NameError exception the sys.exc_value command might return "There is no variable named 'x'", when x was referenced without first having been assigned a value. This method, however, is not thread-safe. As a result, it is not that useful, because most network applications are multithreaded.
The second way to access exception information is with sys.exc_info(). This function is thread-safe and also is more flexible, although it might look intimidating at first. If you run the following code:
import sys try: x = x + 1 except: print sys.exc_info()
You will see this message:
(<class exceptions.NameError at 007C5B2C>, <exceptions.NameError instance at 007F5E3C>, <traceback object at 007F5E10>)
How's that for cryptic! But, with a few lines of code we can unravel this into rather useful chunks of information. Suppose that you run the following code instead:
import sys import traceback def formatExceptionInfo(maxTBlevel=5): cla, exc, trbk = sys.exc_info() excName = cla.__name__ try: excArgs = exc.__dict__["args"] except KeyError: excArgs = "<no args>" excTb = traceback.format_tb(trbk, maxTBlevel) return (excName, excArgs, excTb) try: x = x + 1 except: print formatExceptionInfo()
This will display:
('NameError', ("There is no variable named 'x'",), [' File "<stdin>", line 14, in ?\n'])
The function formatExceptionInfo() takes the three-element tuple returned by sys.exc_info() and transforms each element into a more convenient form, a string. cla.__name__ gives the name of the exception class, while exc.__dict__["args"] gives other details about the exception. In the case of socket exceptions, these details will be in a two-element tuple, like ("error", (32, 'Broken pipe'). Lastly, traceback.format_tb() formats the traceback information into a string. The optional argument (maxTBlevel> in the sample code) allows users to control the depth of the traceback that will be formatted. The traceback information is not essential to identify or categorize exceptions, but if you want to log all the spurious unknown exceptions your program encounters, it is useful to write that traceback string in the log.
With the first two elements--the name of the exception class and the exception details--we can now try to identify the exception and reduce it to a well-known one from a set of previously recognized exception patterns.