Transitioning to Python 3

The Python language, which is not new but continues to gain momentum and users as if it were, has changed remarkably little since it first was released. I don't mean to say that Python hasn't changed; it has grown, gaining functionality and speed, and it's now a hot language in a variety of domains, from data science to test automation to education. But, those who last used Python 15 or 20 years ago would feel that the latest versions of the language are a natural extension and evolution of what they already know.

At the same time, changes to the language—and particularly changes made in Python 3.x—mean that Python 2 programs won't run unmodified in Python 3. This is a known issue, and it was part of the process that Python's BDFL (Benevolent Dictator for Life) Guido van Rossum announced back when the "Python 3000" project was launched years ago. Guido expected it would take time for organizations to move from Python 2 to Python 3, but he also felt that the improvements to the language were necessary.

The good news is that Python 3, which at the time of this writing exists in version 3.5, is indeed better than Python 2. The bad news is that there still are a lot of companies (including many of my training and consulting clients) that still use Python 2.

Why don't they just upgrade? For the most part, it's because the time and effort needed to do so aren't seen as a worthwhile investment of developer resources. Most differences between Python 2 and 3 are easily expressed and understood by people, but the upgrades aren't completely automatic. Moving a large code base from Python 2 to 3 might take days, but it also might take weeks or months.

That said, companies will soon be forced to upgrade, because as of the year 2020, there will be no more support for Python 2. That's a risk many companies aren't going to want to take.

If you have to upgrade, but can't upgrade, that puts you in a terrible spot. However, there is another option: upgrade incrementally, modifying just 1–2 files each week so that they work with both Python 2 and 3. After a number of months of such incremental changes, you'll be able to switch completely to Python 3 with relatively little investment.

How can you make your code compatible with both? In this article, I provide a number of suggestions on how to do this, using both an understanding of Python 3's changes and the tools that have been developed to make this transition easier. Don't wait until 2019 to start making these changes; if you're a Python developer, you already (in mid-2016) should be thinking about how to change your code to be Python 3-compatible.

What Has Changed?

The first thing to ask is this: what exactly changed in Python 3? And, how easily can you move from Python 2 to Python 3? Or, how can you modify your Python 2 programs so they'll continue to work in Python 2, but then also work unmodified in Python 3? This last question is probably the most important one for my clients, and possibly for your business as well, during this transition period.

On the face of things, not very much actually changed in Python 3. It's a cleaner, more efficient and modern language that works like more modern Python developers want and expect. Things that Python developers were doing for years, but that weren't defaults in the language, are now indeed defaults. Sure, there are things I'm still getting used to after years of bad habits, such as failing to use parentheses around the arguments passed to print, but on the whole, the language has stayed the same.

However, this doesn't mean that nothing has changed or that you can get away with not changing your code.

For example, you almost certainly never wanted to use Python 2's input built-in function to get user input. Rather, you wanted to use the raw_input built-in function. So in Python 3, there is no equivalent to Python 2's input; the Python 3 input function is the same as Python 2's raw_input.

A more profound change is the switch in the behavior of strings. No longer do strings contain bytes; now they contain Unicode characters, encoded using UTF-8. If 100% of your work uses ASCII, you're in luck; nothing in your programs will really need to change. But if you use non-ASCII characters, and if you do so in the same program as you work with the contents of binary files, you'll have to make some adjustments. Python 2's str class is now a bytes class, and Python 2's unicode class is now the str class.