Programming Tools: Refactoring
Refactoring is the process of modifying code without affecting
either its input or output; that is, the code's interface remains the
same. How do we know that refactoring works? The only objective way to
find the answer is by testing the code before and after refactoring
occurs. The results should match exactly. This makes testing an integral
part of any refactoring effort--you cannot successfully separate the two.
Some confusion exists between what it means to refactor code as opposed
to reworking it. Refactoring is a subset of reworking. If you are making no changes
to the interface of a program, you are refactoring. If you add or change
the interface, you are rewriting the code. Strictly speaking, when you
do both together, you are rewriting. However, for political reasons,
you still might want to call it refactoring.
When it comes to methodology, refactoring usually is done in small
steps. After each step, you run those unit tests originally written
for the existing code. Repeat these two steps as needed. Also, a number
of refactoring patterns can be followed. Martin Fowler's book (see
Resources) contains some basic ones, and Joshua Kerievsky's more recent
text (see Resources) contains some more advanced ones. As a side issue,
it is interesting to note that most books and products dealing with
refactoring use Java as the language of choice. Is that because Java is
the most appropriate language, or does Java code need more refactoring?
All the refactoring tools I have seen so far require heavy human
intervention. They don't do pattern recognition, so they aren't able
to see commonality in code or design.
Java Tools
As noted above, a lot of refactoring tools are out there, especially for
Java. Open-source tools are available, some of which are listed in
Resources. Commercial refactoring products can be found in the usual
way. In this article, I discuss two open-source tools, Eclipse
and Bicycle Repair Man. Eclipse is designed for use with Java, while
Bicycle Repair Man is designed to be used with Python.
Let's start with Eclipse and Java. Eclipse offers an impressive set
of refactoring facilities, including support for:
- Copying and moving Java elements
- Extracting a method
- Extracting a local variable
- Extracting a constant
- Renaming a package
- Renaming a compilation unit
- Renaming a class or interface
- Renaming a method
- Renaming a field
- Renaming a local variable
- Renaming method parameters
- Changing method signature
- Inlining a local variable
- Inlining a method
- Inlining a constant
- Self encapsulating a field
- Replacing a local variable with a
query - Pulling members up to superclass
- Pushing members down to subclasses
- Moving static members between types
- Moving an instance method to a
component - Converting a local variable to a field
- Converting an anonymous inner class to a nested
class - Converting a nested type to a top level
type - Extracting an interface from a type
- Replacing references to a type with references to one of
its subtypes
To give you some idea of what Eclipse's refactoring can do, I chose at
random the Java program shown in Listing 1. It outputs an About dialog
box.
Listing 1. Random Java
Program
package junit.awtui;
import java.awt.*;
import java.awt.event.*;
import junit.runner.Version;
class AboutDialog extends Dialog {
public AboutDialog(Frame parent) {
super(parent);
setResizable(false);
setLayout(new GridBagLayout());
setSize(330, 138);
setTitle("About");
Button button= new Button("Close");
button.addActionListener(
new ActionListener() {
public void actionPerformed(ActionEvent e) {
dispose();
}
}
);
Label label1= new Label("JUnit");
label1.setFont(new Font("dialog", Font.PLAIN, 36));
Label label2= new Label("JUnit "+
Version.id()+
" by Kent Beck and Erich Gamma");
label2.setFont(new Font("dialog", Font.PLAIN, 14));
Logo logo= new Logo();
GridBagConstraints constraintsLabel1= new GridBagConstraints();
constraintsLabel1.gridx = 3; constraintsLabel1.gridy = 0;
constraintsLabel1.gridwidth = 1; constraintsLabel1.gridheight = 1;
constraintsLabel1.anchor = GridBagConstraints.CENTER;
add(label1, constraintsLabel1);
GridBagConstraints constraintsLabel2= new GridBagConstraints();
constraintsLabel2.gridx = 2; constraintsLabel2.gridy = 1;
constraintsLabel2.gridwidth = 2; constraintsLabel2.gridheight = 1;
constraintsLabel2.anchor = GridBagConstraints.CENTER;
add(label2, constraintsLabel2);
GridBagConstraints constraintsButton1= new GridBagConstraints();
constraintsButton1.gridx = 2; constraintsButton1.gridy = 2;
constraintsButton1.gridwidth = 2; constraintsButton1.gridheight = 1;
constraintsButton1.anchor = GridBagConstraints.CENTER;
constraintsButton1.insets= new Insets(8, 0, 8, 0);
add(button, constraintsButton1);
GridBagConstraints constraintsLogo1= new GridBagConstraints();
constraintsLogo1.gridx = 2; constraintsLogo1.gridy = 0;
constraintsLogo1.gridwidth = 1; constraintsLogo1.gridheight = 1;
constraintsLogo1.anchor = GridBagConstraints.CENTER;
add(logo, constraintsLogo1);
addWindowListener(
new WindowAdapter() {
public void windowClosing(WindowEvent e) {
dispose();
}
}
);
}
}
First, I scanned the code and saw that two similar fragments
of code dealt with labels. Second, I manually brought together all
the code to do with labels. Using similar logic, I brought together the
code dealing with the button and the logo. This step is shown in Listing
2.
Listing 2. Rearranging the Code
package junit.awtui;
import java.awt.*;
import java.awt.event.*;
import junit.runner.Version;
class AboutDialog extends Dialog {
public AboutDialog(Frame parent) {
super(parent);
setResizable(false);
setLayout(new GridBagLayout());
setSize(330, 138);
setTitle("About");
Label label1= new Label("JUnit");
label1.setFont(new Font("dialog", Font.PLAIN, 36));
Label label2= new Label("JUnit "+
Version.id()+
" by Kent Beck and Erich Gamma");
label2.setFont(new Font("dialog", Font.PLAIN, 14));
Logo logo= new Logo();
GridBagConstraints constraintsLabel1= new GridBagConstraints();
constraintsLabel1.gridx = 3;
constraintsLabel1.gridy = 0;
constraintsLabel1.gridwidth = 1;
constraintsLabel1.gridheight = 1;
constraintsLabel1.anchor = GridBagConstraints.CENTER;
add(label1, constraintsLabel1);
GridBagConstraints constraintsLabel2= new GridBagConstraints();
constraintsLabel2.gridx = 2;
constraintsLabel2.gridy = 1;
constraintsLabel2.gridwidth = 2;
constraintsLabel2.gridheight = 1;
constraintsLabel2.anchor = GridBagConstraints.CENTER;
add(label2, constraintsLabel2);
Button button= new Button("Close");
button.addActionListener(
new ActionListener() {
public void actionPerformed(ActionEvent e) {
dispose();
}
}
);
GridBagConstraints constraintsButton1=new GridBagConstraints();
constraintsButton1.gridx = 2;
constraintsButton1.gridy = 2;
constraintsButton1.gridwidth = 2;
constraintsButton1.gridheight = 1;
constraintsButton1.anchor = GridBagConstraints.CENTER;
constraintsButton1.insets= new Insets(8, 0, 8, 0);
add(button, constraintsButton1);
GridBagConstraints constraintsLogo1= new GridBagConstraints();
constraintsLogo1.gridx = 2;
constraintsLogo1.gridy = 0;
constraintsLogo1.gridwidth = 1;
constraintsLogo1.gridheight = 1;
constraintsLogo1.anchor = GridBagConstraints.CENTER;
add(logo, constraintsLogo1);
addWindowListener(
new WindowAdapter() {
public void windowClosing(WindowEvent e) {
dispose();
}
}
);
}
}
Next, I extracted each method by first highlighting each method and then
selecting Refactor | Extract Method. The dialog box is shown below.
The resulting code is shown in Listing 3.
Listing 3. Refactored Code
package junit.awtui;
import java.awt.*;
import java.awt.event.*;
import junit.runner.Version;
class AboutDialog extends Dialog {
public AboutDialog(Frame parent) {
super(parent);
setResizable(false);
setLayout(new GridBagLayout());
setSize(330, 138);
setTitle("About");
gbLabel1();
gbLabel2();
gbButton();
gbLogo();
addWindowListener(
new WindowAdapter() {
public void windowClosing(WindowEvent e) {
dispose();
}
}
);
}
/**
*
*/
private void gbLogo() {
Logo logo= new Logo();
GridBagConstraints constraintsLogo1= new GridBagConstraints();
constraintsLogo1.gridx = 2;
constraintsLogo1.gridy = 0;
constraintsLogo1.gridwidth = 1;
constraintsLogo1.gridheight = 1;
constraintsLogo1.anchor = GridBagConstraints.CENTER;
add(logo, constraintsLogo1);
}
/**
*
*/
private void gbButton() {
Button button= new Button("Close");
button.addActionListener(
new ActionListener() {
public void actionPerformed(ActionEvent e) {
dispose();
}
}
);
GridBagConstraints constraintsButton1= new GridBagConstraints();
constraintsButton1.gridx = 2;
constraintsButton1.gridy = 2;
constraintsButton1.gridwidth = 2;
constraintsButton1.gridheight = 1;
constraintsButton1.anchor = GridBagConstraints.CENTER;
constraintsButton1.insets= new Insets(8, 0, 8, 0);
add(button, constraintsButton1);
}
/**
*
*/
private void gbLabel2() {
Label label2= new Label("JUnit "+
Version.id()+
" by Kent Beck and Erich Gamma");
label2.setFont(new Font("dialog", Font.PLAIN, 14));
GridBagConstraints constraintsLabel2= new GridBagConstraints();
constraintsLabel2.gridx = 2;
constraintsLabel2.gridy = 1;
constraintsLabel2.gridwidth = 2;
constraintsLabel2.gridheight = 1;
constraintsLabel2.anchor = GridBagConstraints.CENTER;
add(label2, constraintsLabel2);
}
/**
*
*/
private void gbLabel1() {
Label label1= new Label("JUnit");
label1.setFont(new Font("dialog", Font.PLAIN, 36));
GridBagConstraints constraintsLabel1= new GridBagConstraints();
constraintsLabel1.gridx = 3;
constraintsLabel1.gridy = 0;
constraintsLabel1.gridwidth = 1;
constraintsLabel1.gridheight = 1;
constraintsLabel1.anchor = GridBagConstraints.CENTER;
add(label1, constraintsLabel1);
}
}
Looking at the code again, it seems that gbLabel1 and gbLabel2
share the same logic, but the values of some of the variables differ
slightly. Manually comparing the two, I tried to create a new method based
on gbLabel1 that had parameters on those variable values that differed
between gbLabel1 and gbLabel2. I did this by using the Introduce Parameter
refactoring wizard. Listing 4 shows the result of using this wizard to
replace the label's text and font values with method parameters.
Listing 4. Replacing Values with Parameters
package junit.awtui;
import java.awt.*;
import java.awt.event.*;
import junit.runner.Version;
class AboutDialog extends Dialog {
public AboutDialog(Frame parent) {
super(parent);
setResizable(false);
setLayout(new GridBagLayout());
setSize(330, 138);
setTitle("About");
gbLabel1(new Label("JUnit"),new Font("dialog",Font.PLAIN, 36));
gbLabel2();
gbButton();
gbLogo();
addWindowListener(
new WindowAdapter() {
public void windowClosing(WindowEvent e) {
dispose();
}
}
);
}
/**
*
*/
private void gbLogo() {
Logo logo= new Logo();
GridBagConstraints constraintsLogo1= new GridBagConstraints();
constraintsLogo1.gridx = 2;
constraintsLogo1.gridy = 0;
constraintsLogo1.gridwidth = 1;
constraintsLogo1.gridheight = 1;
constraintsLogo1.anchor = GridBagConstraints.CENTER;
add(logo, constraintsLogo1);
}
/**
*
*/
private void gbButton() {
Button button= new Button("Close");
button.addActionListener(
new ActionListener() {
public void actionPerformed(ActionEvent e) {
dispose();
}
}
);
GridBagConstraints constraintsButton1= new GridBagConstraints();
constraintsButton1.gridx = 2;
constraintsButton1.gridy = 2;
constraintsButton1.gridwidth = 2;
constraintsButton1.gridheight = 1;
constraintsButton1.anchor = GridBagConstraints.CENTER;
constraintsButton1.insets= new Insets(8, 0, 8, 0);
add(button, constraintsButton1);
}
/**
*
*/
private void gbLabel2() {
Label label2= new Label("JUnit "+
Version.id()+
" by Kent Beck and Erich Gamma");
label2.setFont(new Font("dialog", Font.PLAIN, 14));
GridBagConstraints constraintsLabel2= new GridBagConstraints();
constraintsLabel2.gridx = 2;
constraintsLabel2.gridy = 1;
constraintsLabel2.gridwidth = 2;
constraintsLabel2.gridheight = 1;
constraintsLabel2.anchor = GridBagConstraints.CENTER;
add(label2, constraintsLabel2);
}
/**
* @param lblText TODO
* @param lblFont TODO
*
*/
private void gbLabel1(Label lblText, Font lblFont) {
Label label1= lblText;
label1.setFont(lblFont);
GridBagConstraints constraintsLabel1= new GridBagConstraints();
constraintsLabel1.gridx = 3;
constraintsLabel1.gridy = 0;
constraintsLabel1.gridwidth = 1;
constraintsLabel1.gridheight = 1;
constraintsLabel1.anchor = GridBagConstraints.CENTER;
add(label1, constraintsLabel1);
}
}
Two things in Listing 4 are worth noting. First, in the comments before the newly modified method,
TODOs remind you to comment on the meaning of the parameters. Second, in
both gbLabel1 and gbLabel2 the placement of the label is specified using
a rectangle with its upper-left corner and height and width values. Eclipse cannot recognize these
as attributes of a rectangle; it must be done manually. I used the Change Method Signature
wizard to introduce the Rectangle parameter.
Besides changing the method signature, I also needed to change manually
the code where gbLabel1 is invoked. Listing 5 shows the results with
some minor formatting changes being added.
Listing 5. Changing gbLabel1 Invocation Code
package junit.awtui;
import java.awt.*;
import java.awt.event.*;
import junit.runner.Version;
class AboutDialog extends Dialog {
public AboutDialog(Frame parent) {
super(parent);
setResizable(false);
setLayout(new GridBagLayout());
setSize(330, 138);
setTitle("About");
gbLabel1(new Label("JUnit"),
new Font("dialog", Font.PLAIN, 36),
new Rectangle(3,0,1,1));
gbLabel2();
gbButton();
gbLogo();
addWindowListener(
new WindowAdapter() {
public void windowClosing(WindowEvent e) {
dispose();
}
}
);
}
/**
*
*/
private void gbLogo() {
Logo logo= new Logo();
GridBagConstraints constraintsLogo1= new GridBagConstraints();
constraintsLogo1.gridx = 2;
constraintsLogo1.gridy = 0;
constraintsLogo1.gridwidth = 1;
constraintsLogo1.gridheight = 1;
constraintsLogo1.anchor = GridBagConstraints.CENTER;
add(logo, constraintsLogo1);
}
/**
*
*/
private void gbButton() {
Button button= new Button("Close");
button.addActionListener(
new ActionListener() {
public void actionPerformed(ActionEvent e) {
dispose();
}
}
);
GridBagConstraints constraintsButton1= new GridBagConstraints();
constraintsButton1.gridx = 2;
constraintsButton1.gridy = 2;
constraintsButton1.gridwidth = 2;
constraintsButton1.gridheight = 1;
constraintsButton1.anchor = GridBagConstraints.CENTER;
constraintsButton1.insets= new Insets(8, 0, 8, 0);
add(button, constraintsButton1);
}
/**
*
*/
private void gbLabel2() {
Label label2= new Label("JUnit "+
Version.id()+
" by Kent Beck and Erich Gamma");
label2.setFont(new Font("dialog", Font.PLAIN, 14));
GridBagConstraints constraintsLabel2= new GridBagConstraints();
constraintsLabel2.gridx = 2;
constraintsLabel2.gridy = 1;
constraintsLabel2.gridwidth = 2;
constraintsLabel2.gridheight = 1;
constraintsLabel2.anchor = GridBagConstraints.CENTER;
add(label2, constraintsLabel2);
}
/**
* @param lblText Text to appear in label
* @param lblFont Font to use in showing text
* @param lblRect the size and position of the label
*
*/
private void gbLabel1(Label lblText, Font lblFont, Rectangle lblRect) {
Label label1= lblText;
label1.setFont(lblFont);
GridBagConstraints constraintsLabel1= new GridBagConstraints();
constraintsLabel1.gridx = lblRect.x;
constraintsLabel1.gridy = lblRect.y;
constraintsLabel1.gridwidth = lblRect.width;
constraintsLabel1.gridheight = lblRect.height;
constraintsLabel1.anchor = GridBagConstraints.CENTER;
add(label1, constraintsLabel1);
}
}
Next, I noticed that the variable label1 could be replaced by label,
because only one variable of this type is in this method. To do this,
I used the Rename... wizard. I highlighted the name to be changed and
invoked the wizard, as shown below.
Because the gbLabel1 method now has become quite generic, we also can
change its name simply to gbLabel. We start by highlighting an instance of
its name and invoking the Rename... wizard. Next, we replace the call to
gbLabel2 with an equivalent call to gbLabel. Again, this must be done manually.
Also, because gbLabel2 is no longer needed, we should delete it. Listing 6
shows the result of all this.
Listing 6. Cleaning Up the Code
package junit.awtui;
import java.awt.*;
import java.awt.event.*;
import junit.runner.Version;
class AboutDialog extends Dialog {
public AboutDialog(Frame parent) {
super(parent);
setResizable(false);
setLayout(new GridBagLayout());
setSize(330, 138);
setTitle("About");
gbLabel(new Label("JUnit"),
new Font("dialog", Font.PLAIN, 36),
new Rectangle(3,0,1,1));
gbLabel(new Label("JUnit "+
Version.id()+
" by Kent Beck and Erich Gamma"),
new Font("dialog", Font.PLAIN, 14),
new Rectangle(2,1,2,1));
gbButton();
gbLogo();
addWindowListener(
new WindowAdapter() {
public void windowClosing(WindowEvent e) {
dispose();
}
}
);
}
/**
*
*/
private void gbLogo() {
Logo logo= new Logo();
GridBagConstraints constraintsLogo1= new GridBagConstraints();
constraintsLogo1.gridx = 2;
constraintsLogo1.gridy = 0;
constraintsLogo1.gridwidth = 1;
constraintsLogo1.gridheight = 1;
constraintsLogo1.anchor = GridBagConstraints.CENTER;
add(logo, constraintsLogo1);
}
/**
*
*/
private void gbButton() {
Button button= new Button("Close");
button.addActionListener(
new ActionListener() {
public void actionPerformed(ActionEvent e) {
dispose();
}
}
);
GridBagConstraints constraintsButton1= new GridBagConstraints();
constraintsButton1.gridx = 2;
constraintsButton1.gridy = 2;
constraintsButton1.gridwidth = 2;
constraintsButton1.gridheight = 1;
constraintsButton1.anchor = GridBagConstraints.CENTER;
constraintsButton1.insets= new Insets(8, 0, 8, 0);
add(button, constraintsButton1);
}
/**
* @param lblText Text to appear in label
* @param lblFont Font to use in showing text
* @param lblRect the size and position of the label
*
*/
private void gbLabel(Label lblText, Font lblFont, Rectangle lblRect) {
Label lbl= lblText;
lbl.setFont(lblFont);
GridBagConstraints constraintsLabel1= new GridBagConstraints();
constraintsLabel1.gridx = lblRect.x;
constraintsLabel1.gridy = lblRect.y;
constraintsLabel1.gridwidth = lblRect.width;
constraintsLabel1.gridheight = lblRect.height;
constraintsLabel1.anchor = GridBagConstraints.CENTER;
add(lbl, constraintsLabel1);
}
}
If you wish, you can continue on the same track with both the button and
logo related methods. However, it is more of the same, so I won't bore
you further on using refactoring with Eclipse and Java.
Python Tools
The standard refactoring tool for Python is Bicycle Repair Man (BRM). Its
feature set is more limited and less mature than Eclipse's set, in part
because of the language differences and in part because of the number
of developers working on them. Bicycle Repair Man's current refactoring
wizards are Rename, Extract method, Inline variable, Extract variable,
Move class and Move function.
For this test, I chose a program I knew something about--the token
handler for PyMetrics. Last month I wrote about code metrics
and PyMetrics. I did not like how I coded the PyMetrics routine
shown below, and its McCabe metric of 52 for the __call__ method gave me reason to refactor and
rework the code. Recall that the McCabe Cyclomatic metric measures the
number of paths through your function or procedure. At 52, this method
is considered extremely high risk and untestable. Listing 7 shows the __call__ method that I wanted refactored.
Listing 7. PyMetric __call__ Method
""" TokenHandler readies tokens to passing to metric routines.
$Id: tokenhandler.py,v 1.2 2005/01/19 06:13:33 rcharney Exp $
"""
__version__ = "$Revision: 1.2 $"[11:-2]
#import token
from globals import *
class TokenHandler( object ):
""" Class to handle processing of tokens."""
def __call__( self, tokCount, tok ):
""" Common code for handling tokens."""
tokCount += 1
self.metrics['tokenCount'] = tokCount
if not tok.type in [WS,INDENT,DEDENT,EMPTY,ENDMARKER]:
prevTok = self.saveTok
self.saveTok = tok
self.context['blockNum'] = self.metrics.get('blockCount',0)
self.context['blockDepth'] = self.metrics.get('blockDepth',0)
self.context['parenDepth'] = self.parenDepth
self.context['bracketDepth'] = self.bracketDepth
self.context['braceDepth'] = self.braceDepth
self.classDepth += self.classDepthIncr
self.context['classDepth'] = self.classDepth
self.classDepthIncr = 0
self.fcnDepth += self.fcnDepthIncr # incr at start of fcn body
self.context['fcnDepth'] = self.fcnDepth
self.fcnDepthIncr = 0
# start testing for types that change in context
if self.docString and tok.type == token.STRING:
self.docString = False
tok.semtype = DOCSTRING
if self.checkForModuleDocString: # found the module's doc string
self.metrics['numModuleDocStrings'] += 1
self.checkForModuleDocString = False
self.__postToken( tok )
return
if tok.type == COMMENT:
# compensate for older tokenize including newline
# obals......in token when only thing on line is comment
# this patch makes all connents consistent
self.metrics['numComments'] += 1
if tok.text[-1] == '\n':
tok.text = tok.text[:-1]
if prevTok and prevTok.type != NEWLINE and prevTok.type != EMPTY:
tok.semtype = INLINE
self.metrics['numCommentsInline'] += 1
self.__postToken( tok )
return
if self.defFunction:
if tok.type == NAME:
self.__incr( 'numFunctions')
self.fqnName.append( tok.text )
self.currentFcn = self.__genFQN( self.fqnName )
self.fqnFunction.append( (self.currentFcn,self.blockDepth,FCNNAME) )
self.defFunction = False
self.findFcnHdrEnd = True
if self.pa.verbose > 0: print "fqnFunction=%s" % self.fqnFunction
self.__postToken( tok )
return
elif tok.type == ERRORTOKEN:
# we must compensate for the simple scanner mishandling errors
self.findFcnHdrEnd = True
self.invalidToken = mytoken.MyToken(type=NAME,
semtype=FCNNAME,
text=tok.text,
row=tok.row,
col=tok.col,
line=tok.line)
self.skipUntil = '(:\n'
return
elif self.defClass and tok.type == NAME:
self.__incr( 'numClasses' )
self.fqnName.append( tok.text )
self.currentClass = self.__genFQN( self.fqnName )
self.fqnClass.append( (self.currentClass,self.blockDepth,CLASSNAME) )
self.defClass = False
self.findClassHdrEnd = True
if self.pa.verbose > 0: print "fqnClass=%s" % self.fqnClass
self.__postToken( tok )
return
# return with types that don't change in context
self.__postToken( tok )
if tok.type == EMPTY:
self.endOfStmt()
return
if tok.type == WS:
return
if tok.type == NEWLINE:
self.endOfStmt()
return
# at this point, we have encountered a non-white space token
# if a module doc string has not been found yet, it never will be.
numModDocStrings = self.metrics['numModuleDocStrings']
if self.checkForModuleDocString and numModDocStrings == 0:
self.checkForModuleDocString = False
msg = ( "Module %s is missing a module doc string. Detected at line %d"
% (self.context['inFile'],tok.row) )
if not self.pa.quietSw:
sys.stderr.writelines( msg )
if tok.type == OP and tok.text == ':':
if self.findFcnHdrEnd:
self.findFcnHdrEnd = False
self.docString = True
self.fcnDepthIncr = 1
return
elif self.findClassHdrEnd:
self.findClassHdrEnd = False
self.docString = True
self.classDepthIncr = 1
return
if tok.type == OP:
if tok.text == '(':
self.parenDepth += 1
elif tok.text == ')':
self.parenDepth -= 1
elif tok.text == '[':
self.bracketDepth += 1
elif tok.text == ']':
self.bracketDepth -= 1
elif tok.text == '{':
self.braceDepth += 1
elif tok.text == '}':
self.braceDepth -= 1
return
if tok.type == INDENT:
self.__incr( 'blockCount')
self.blockDepth += 1
self.metrics['blockDepth'] = self.blockDepth
if self.metrics.get('maxBlockDepth',0) < self.blockDepth:
self.metrics['maxBlockDepth'] = self.blockDepth
return
elif tok.type == DEDENT:
assert self.fcnDepth == len( self.fqnFunction )
self.blockDepth -= 1
self.metrics['blockDepth'] = self.blockDepth
while len(self.fqnFunction) and self.blockDepth<=self.fqnFunction[-1][1]:
msg = self.endOfFunction( self.fcnExits )
if msg and not self.pa.quietSw:
print msg
if self.pa.verbose > 0:
print "Removing function %s" % self.__extractFQN( self.fqnFunction )
del self.fqnFunction[-1]
del self.fqnName[-1]
self.fcnDepth -= 1
assert self.fcnDepth == len( self.fqnFunction )
if len( self.fqnFunction ) > 0:
self.currentFcn = self.fqnFunction[-1][0]
else:
self.currentFcn = None
while len(self.fqnClass) and self.blockDepth <= self.fqnClass[-1][1]:
self.endOfClass()
self.classDepth -= 1
if self.pa.verbose > 0:
print "Removing class %s" % self.__extractFQN( self.fqnClass )
del self.fqnClass[-1]
del self.fqnName[-1]
if len( self.fqnClass ) > 0:
self.currentClass = self.fqnClass[-1][0]
else:
self.currentClass = None
return
self.docString = False
if tok.semtype == KEYWORD:
self.__incr( 'numKeywords')
if tok.text == 'def':
self.defFunction = True
elif tok.text == 'class':
self.defClass = True
elif tok.text == 'return':
assert self.fcnDepth == len( self.fqnFunction )
if self.fcnDepth == 0: # then we are not in any function
if not self.pa.quietSw: # then report on these types of errors
msg = ("Module %s contains the return statement" +
" at line %d that is outside any function" %
(self.context['inFile'],tok.row) )
print msg
if prevTok.text == ':':
# this return on same line as conditional,
# so it must be an extra return
self.fcnExits.append( tok.row )
elif self.blockDepth > 1:
# Let fcnBlockDepth be the block depth of the function body.
# We are trying to count the number of return statements
# in this function. Only one is allowed at the fcnBlockDepth for the
# function. If the self.blockDepth is greater than fcnBlockDepth,
# then this is a conditional return - i.e., an additional return
fcnBlockDepth = self.fqnFunction[-1][1] + 1
if self.blockDepth > fcnBlockDepth:
self.fcnExits.append( tok.row )
return
To produce Listing 8, I first highlighted lines 35-54 in Listing 7 and then used the
Extract Method wizard to move them into the doString() method.
Listing 8. Moving Code with the Extract
Method Wizard
""" TokenHandler readies tokens to passing to metric routines.
$Id: tokenhandler.py,v 1.2 2005/01/19 06:13:33 rcharney Exp $
"""
__version__ = "$Revision: 1.2 $"[11:-2]
#import token
from globals import *
class TokenHandler( object ):
""" Class to handle processing of tokens."""
def __call__( self, tokCount, tok ):
""" Common code for handling tokens."""
tokCount += 1
self.metrics['tokenCount'] = tokCount
if not tok.type in [WS,INDENT,DEDENT,EMPTY,ENDMARKER]:
prevTok = self.saveTok
self.saveTok = tok
self.context['blockNum'] = self.metrics.get('blockCount',0)
self.context['blockDepth'] = self.metrics.get('blockDepth',0)
self.context['parenDepth'] = self.parenDepth
self.context['bracketDepth'] = self.bracketDepth
self.context['braceDepth'] = self.braceDepth
self.classDepth += self.classDepthIncr
self.context['classDepth'] = self.classDepth
self.classDepthIncr = 0
self.fcnDepth += self.fcnDepthIncr # incr at start of fcn body
self.context['fcnDepth'] = self.fcnDepth
self.fcnDepthIncr = 0
# start testing for types that change in context
self.doStrings(tok, prevTok)
if self.defFunction:
if tok.type == NAME:
self.__incr( 'numFunctions')
self.fqnName.append( tok.text )
self.currentFcn = self.__genFQN( self.fqnName )
self.fqnFunction.append( (self.currentFcn,self.blockDepth,FCNNAME) )
self.defFunction = False
self.findFcnHdrEnd = True
if self.pa.verbose > 0: print "fqnFunction=%s" % self.fqnFunction
self.__postToken( tok )
return
elif tok.type == ERRORTOKEN:
# we must compensate for the simple scanner mishandling errors
self.findFcnHdrEnd = True
self.invalidToken = mytoken.MyToken(type=NAME,
semtype=FCNNAME,
text=tok.text,
row=tok.row,
col=tok.col,
line=tok.line)
self.skipUntil = '(:\n'
return
elif self.defClass and tok.type == NAME:
self.__incr( 'numClasses' )
self.fqnName.append( tok.text )
self.currentClass = self.__genFQN( self.fqnName )
self.fqnClass.append( (self.currentClass,self.blockDepth,CLASSNAME) )
self.defClass = False
self.findClassHdrEnd = True
if self.pa.verbose > 0: print "fqnClass=%s" % self.fqnClass
self.__postToken( tok )
return
# return with types that don't change in context
self.__postToken( tok )
if tok.type == EMPTY:
self.endOfStmt()
return
if tok.type == WS:
return
if tok.type == NEWLINE:
self.endOfStmt()
return
# at this point, we have encountered a non-white space token
# if a module doc string has not been found yet, it never will be.
numModDocStrings = self.metrics['numModuleDocStrings']
if self.checkForModuleDocString and numModDocStrings == 0:
self.checkForModuleDocString = False
msg = ( "Module %s is missing a module doc string. Detected at line %d"
% (self.context['inFile'],tok.row) )
if not self.pa.quietSw:
sys.stderr.writelines( msg )
if tok.type == OP and tok.text == ':':
if self.findFcnHdrEnd:
self.findFcnHdrEnd = False
self.docString = True
self.fcnDepthIncr = 1
return
elif self.findClassHdrEnd:
self.findClassHdrEnd = False
self.docString = True
self.classDepthIncr = 1
return
if tok.type == OP:
if tok.text == '(':
self.parenDepth += 1
elif tok.text == ')':
self.parenDepth -= 1
elif tok.text == '[':
self.bracketDepth += 1
elif tok.text == ']':
self.bracketDepth -= 1
elif tok.text == '{':
self.braceDepth += 1
elif tok.text == '}':
self.braceDepth -= 1
return
if tok.type == INDENT:
self.__incr( 'blockCount')
self.blockDepth += 1
self.metrics['blockDepth'] = self.blockDepth
if self.metrics.get('maxBlockDepth',0) < self.blockDepth:
self.metrics['maxBlockDepth'] = self.blockDepth
return
elif tok.type == DEDENT:
assert self.fcnDepth == len( self.fqnFunction )
self.blockDepth -= 1
self.metrics['blockDepth'] = self.blockDepth
while len(self.fqnFunction) and self.blockDepth<=self.fqnFunction[-1][1]:
msg = self.endOfFunction( self.fcnExits )
if msg and not self.pa.quietSw:
print msg
if self.pa.verbose > 0:
print "Removing function %s" % self.__extractFQN( self.fqnFunction )
del self.fqnFunction[-1]
del self.fqnName[-1]
self.fcnDepth -= 1
assert self.fcnDepth == len( self.fqnFunction )
if len( self.fqnFunction ) > 0:
self.currentFcn = self.fqnFunction[-1][0]
else:
self.currentFcn = None
while len(self.fqnClass) and self.blockDepth <= self.fqnClass[-1][1]:
self.endOfClass()
self.classDepth -= 1
if self.pa.verbose > 0:
print "Removing class %s" % self.__extractFQN( self.fqnClass )
del self.fqnClass[-1]
del self.fqnName[-1]
if len( self.fqnClass ) > 0:
self.currentClass = self.fqnClass[-1][0]
else:
self.currentClass = None
return
self.docString = False
if tok.semtype == KEYWORD:
self.__incr( 'numKeywords')
if tok.text == 'def':
self.defFunction = True
elif tok.text == 'class':
self.defClass = True
elif tok.text == 'return':
assert self.fcnDepth == len( self.fqnFunction )
if self.fcnDepth == 0: # then we are not in any function
if not self.pa.quietSw: # then report on these types of errors
msg = ( "Module %s contains the return statement at line %d " +
"that is outside any function"
% (self.context['inFile'],tok.row) )
if prevTok.text == ':':
# this return on same line as conditional,
# so it must be an extra return
self.fcnExits.append( tok.row )
elif self.blockDepth > 1:
# Let fcnBlockDepth be the block depth of the function body.
# We are trying to count the number of return statements
# in this function. Only one is allowed at the fcnBlockDepth for the
# function. If the self.blockDepth is greater than fcnBlockDepth,
# then this is a conditional return - i.e., an additional return
fcnBlockDepth = self.fqnFunction[-1][1] + 1
if self.blockDepth > fcnBlockDepth:
self.fcnExits.append( tok.row )
return
def doStrings(self, tok, prevTok):
if self.docString and tok.type == token.STRING:
self.docString = False
tok.semtype = DOCSTRING
if self.checkForModuleDocString: # we have found the module's doc string
self.metrics['numModuleDocStrings'] += 1
self.checkForModuleDocString = False
self.__postToken( tok )
return
if tok.type == COMMENT:
# compensate for older tokenize including newline
# obals......in token when only thing on line is comment
# this patch makes all connents consistent
self.metrics['numComments'] += 1
if tok.text[-1] == '\n':
tok.text = tok.text[:-1]
if prevTok and prevTok.type != NEWLINE and prevTok.type != EMPTY:
tok.semtype = INLINE
self.metrics['numCommentsInline'] += 1
self.__postToken( tok )
return
Notice that BRM is not as clever as Eclipse. It did not detect the use of
return statements; it simply moved them into
the extracted method. This would cause a problem when running, because the return would
be returning from the extracted method--not from the original __call__
method. I fixed this manually by modifying the new method to return True
if one of its returns was used; otherwise, it returned False. The __call__
method then tested the return value and exited itself if True. I also
realized that the method name doStrings was a slight misnomer because
the method also handled comments. So, I changed the method name to
doStringsOrComments using BRM's Rename feature. Listing 9 shows these
changes.
Listing 9. Renaming with BRM's Rename
Feature
""" TokenHandler readies tokens to passing to metric routines.
$Id: tokenhandler.py,v 1.2 2005/01/19 06:13:33 rcharney Exp $
"""
__version__ = "$Revision: 1.2 $"[11:-2]
#import token
from globals import *
class TokenHandler( object ):
""" Class to handle processing of tokens."""
def __call__( self, tokCount, tok ):
""" Common code for handling tokens."""
tokCount += 1
self.metrics['tokenCount'] = tokCount
if not tok.type in [WS,INDENT,DEDENT,EMPTY,ENDMARKER]:
prevTok = self.saveTok
self.saveTok = tok
self.context['blockNum'] = self.metrics.get('blockCount',0)
self.context['blockDepth'] = self.metrics.get('blockDepth',0)
self.context['parenDepth'] = self.parenDepth
self.context['bracketDepth'] = self.bracketDepth
self.context['braceDepth'] = self.braceDepth
self.classDepth += self.classDepthIncr
self.context['classDepth'] = self.classDepth
self.classDepthIncr = 0
self.fcnDepth += self.fcnDepthIncr # incr at start of fcn body
self.context['fcnDepth'] = self.fcnDepth
self.fcnDepthIncr = 0
# start testing for types that change in context
if self.doStringsOrComments(tok, prevTok):
return
if self.doFunctionName(tok):
return
elif self.doClassName(tok):
return
# return with types that don't change in context
self.__postToken( tok )
if tok.type == EMPTY:
self.endOfStmt()
return
if tok.type == WS:
return
if tok.type == NEWLINE:
self.endOfStmt()
return
# at this point, we have encountered a non-white space token
# if a module doc string has not been found yet, it never will be.
numModDocStrings = self.metrics['numModuleDocStrings']
if self.checkForModuleDocString and numModDocStrings == 0:
self.checkForModuleDocString = False
msg = ( "Module %s is missing a module doc string. Detected at line %d"
% (self.context['inFile'],tok.row) )
print msg
if not self.pa.quietSw:
sys.stderr.writelines( msg )
if tok.type == OP and tok.text == ':':
if self.findFcnHdrEnd:
self.findFcnHdrEnd = False
self.docString = True
self.fcnDepthIncr = 1
return
elif self.findClassHdrEnd:
self.findClassHdrEnd = False
self.docString = True
self.classDepthIncr = 1
return
if tok.type == OP:
if tok.text == '(':
self.parenDepth += 1
elif tok.text == ')':
self.parenDepth -= 1
elif tok.text == '[':
self.bracketDepth += 1
elif tok.text == ']':
self.bracketDepth -= 1
elif tok.text == '{':
self.braceDepth += 1
elif tok.text == '}':
self.braceDepth -= 1
return
if tok.type == INDENT:
self.__incr( 'blockCount')
self.blockDepth += 1
self.metrics['blockDepth'] = self.blockDepth
if self.metrics.get('maxBlockDepth',0) < self.blockDepth:
self.metrics['maxBlockDepth'] = self.blockDepth
return
elif tok.type == DEDENT:
assert self.fcnDepth == len( self.fqnFunction )
self.blockDepth -= 1
self.metrics['blockDepth'] = self.blockDepth
while len(self.fqnFunction) and self.blockDepth<=self.fqnFunction[-1][1]:
msg = self.endOfFunction( self.fcnExits )
if msg and not self.pa.quietSw:
print msg
if self.pa.verbose > 0:
print "Removing function %s" % self.__extractFQN( self.fqnFunction )
del self.fqnFunction[-1]
del self.fqnName[-1]
self.fcnDepth -= 1
assert self.fcnDepth == len( self.fqnFunction )
if len( self.fqnFunction ) > 0:
self.currentFcn = self.fqnFunction[-1][0]
else:
self.currentFcn = None
while len(self.fqnClass) and self.blockDepth <= self.fqnClass[-1][1]:
self.endOfClass()
self.classDepth -= 1
if self.pa.verbose > 0:
print "Removing class %s" % self.__extractFQN( self.fqnClass )
del self.fqnClass[-1]
del self.fqnName[-1]
if len( self.fqnClass ) > 0:
self.currentClass = self.fqnClass[-1][0]
else:
self.currentClass = None
return
self.docString = False
if tok.semtype == KEYWORD:
self.__incr( 'numKeywords')
if tok.text == 'def':
self.defFunction = True
elif tok.text == 'class':
self.defClass = True
elif tok.text == 'return':
assert self.fcnDepth == len( self.fqnFunction )
if self.fcnDepth == 0: # then we are not in any function
if not self.pa.quietSw: # then report on these types of errors
msg = ( "Module %s contains the return statement at line %d " +
"that is outside any function" %
(self.context['inFile'],tok.row) )
if prevTok.text == ':':
# this return on same line as conditional,
# so it must be an extra return
self.fcnExits.append( tok.row )
elif self.blockDepth > 1:
# Let fcnBlockDepth be the block depth of the function body.
# We are trying to count the number of return statements
# in this function. Only one is allowed at the fcnBlockDepth for the
# function. If the self.blockDepth is greater than fcnBlockDepth,
# then this is a conditional return - i.e., an additional return
fcnBlockDepth = self.fqnFunction[-1][1] + 1
if self.blockDepth > fcnBlockDepth:
self.fcnExits.append( tok.row )
return
def doClassName(self, tok):
""" Return True if current token is class name."""
if self.defClass and tok.type == NAME:
self.__incr( 'numClasses' )
self.fqnName.append( tok.text )
self.currentClass = self.__genFQN( self.fqnName )
self.fqnClass.append( (self.currentClass,self.blockDepth,CLASSNAME) )
self.defClass = False
self.findClassHdrEnd = True
if self.pa.verbose > 0: print "fqnClass=%s" % self.fqnClass
self.__postToken( tok )
return True
return False
def doFunctionName(self, tok):
""" Return True if this is a function name."""
if self.defFunction:
if tok.type == NAME:
self.__incr( 'numFunctions')
self.fqnName.append( tok.text )
self.currentFcn = self.__genFQN( self.fqnName )
self.fqnFunction.append( (self.currentFcn,self.blockDepth,FCNNAME) )
self.defFunction = False
self.findFcnHdrEnd = True
if self.pa.verbose > 0: print "fqnFunction=%s" % self.fqnFunction
self.__postToken( tok )
return True
elif tok.type == ERRORTOKEN:
# we must compensate for the simple scanner mishandling errors
self.findFcnHdrEnd = True
self.invalidToken = mytoken.MyToken(type=NAME,
semtype=FCNNAME,
text=tok.text,
row=tok.row,
col=tok.col,
line=tok.line)
self.skipUntil = '(:\n'
return True
return False
def doStringsOrComments(self, tok, prevTok):
""" Return True if this is a docstring or comment."""
if self.docString and tok.type == token.STRING:
self.docString = False
tok.semtype = DOCSTRING
if self.checkForModuleDocString: # we have found the module's doc string
self.metrics['numModuleDocStrings'] += 1
self.checkForModuleDocString = False
self.__postToken( tok )
return True
if tok.type == COMMENT:
# compensate for older tokenize including newline
# obals......in token when only thing on line is comment
# this patch makes all connents consistent
self.metrics['numComments'] += 1
if tok.text[-1] == '\n':
tok.text = tok.text[:-1]
if prevTok and prevTok.type != NEWLINE and prevTok.type != EMPTY:
tok.semtype = INLINE
self.metrics['numCommentsInline'] += 1
self.__postToken( tok )
return True
return False
An early refactoring program, BRM has a number of other issues of which you
should be aware. They include:
- If a variable is used before an extracted method and
later is redefined in the extracted method, the variable is passed as an
argument to the extracted method, even when the passed value is never
used. - If an argument is a simple variable and it is modified
in the extracted method, the name in the extracted method is rebound to
a different object. (For example, the simple parameter is passed as if it were
a value.) This leaves unchanged the variable passed into the
extracted method when the method returns.
Far more refactoring can be done on both the Java and Python code
samples. But, I believe I have given you a brief overview of what
current refactoring tools can do.
Conclusion
The benefits of using tools to do code refactoring include:
- They take a fair amount of drudgery out of refactoring
code. - They calculate which variables need to appear in the
parameter list of the extracted methods, barring the problem
above. - They make renaming variables/classes/functions/methods
almost painless.
I realize I haven't discussed why you should refactor your code, or anyone
else's for that matter. Often, there are both technical and managerial reasons to do
refactoring. Using prototypes for production purposes is one of the
bugaboos of programming, and we know the problems this causes. However, refactoring may
be a way out of this cul-de-sac. Perhaps, you can use the benefits of
refactoring to justify a rewrite to management.
Postscript
I am puzzled by the differences in the interest between refactoring and
metrics. I have found no objective way of measuring the effect of code
refactoring. It seems that we simply know that the refactored code is
"better".
I am also bothered by my response to one comment on last month's
column. The writer mentioned that Function Points are an accepted
metric for measuring complexity. They also provide good, consistent
metrics for estimating project costs. I made an error of omission in
not mentioning function points. On further reflection, I still don't
see them as effective code metrics. Function points most often are used
during specification and design of systems. They also are labor-intensive.
I know of no product out there that can do a proper function point analysis
on an existing code base. Please correct me if I am wrong.
Resources
Sources for Information about
Refactoring
An Older Java
Refactoring Tool
Refactoring: Improving the Design of Existing Code
by Martin Fowler, Don Roberts, John Brant, William Opdyke (ISBN
0-201-48567-2), Addison-Wesley.
Refactoring to Patterns by Joshua Kerievsky, ISBN 0-321-21335-1, Addison-Wesley










This week 5 lucky Members will receive a copy of The Official Ubuntu Server Book by Benjamin Mako Hill and Linux Journal's very own Kyle Rankin. No entry necessary. Check back here early next week to find out who the lucky Online Members are.




Comments
Bicycle Repair Man and Eclipse
It is also worth mentioning that Bicycle Repair Man is integrated into Eclipse through the excellent PyDev plug-in: http://pydev.sourceforge.net/
PyDev tries to offer all the great relevant tools available when working in Java to Python developers.
I also agree with the author that while the definition of refactoring can be abstracted to a very high architecture level most programmers use the term "refactor" to mean removing all of the hacks and shortcuts in code that were inserted because of a short timeline for a software demonstration, release, or for testing with components created by other developers on a team.
Bicycle Repair Man and Eclipse
It is also worth mentioning that Bicycle Repair Man is integrated into Eclipse through the excellent PyDev plug-in: http://pydev.sourceforge.net/
PyDev tries to offer all the great relevant tools available when working in Java to Python developers.
I also agree with the author that while the definition of refactoring can be abstracted to a very high architecture level most programmers use the term "refactor" to mean removing all of the hacks and shortcuts in code that were inserted because of a short timeline for a software demonstration, release, or for testing with components created by other developers on a team.
Definition of refactoring
Here is another instance where I think computer science is at risk of being obscured instead of aided by technology.
The article addresses refactoring as if it were concerned exclusively with code. It's not. Refactoring is primarily about architecture and only secondarily about code. In other words, the focus is on design more than implementation.
Peter Deutsch says, and I agree, that:
Interface design and functional factoring constitute the key intellectual content of software and are far more difficult to create or re-create than code.
Most of the work of refactoring a system consists of revising the relationships between its functional components, indeed adding and eliminating components as well. These are changes to structure, not details of implementation.
It needn't be a software system. It would be perfectly reasonable to refactor an automobile engine, for example, so that the timing chain might be driven off the clutch assembly rather than the generator assembly, or in a more extreme case the cylinder layout may change from inline to vee. In a broad sense, the engine is functionally unchanged. That is, it still meets its original design requirements. However, its structure may have changed radically.
When you refactor a system, you're taking a fresh look at its design tradeoffs. It's really a "big picture" exercise in which basic assumptions of modularity, flow of control, and data abstraction are reexamined. This may answer the author's misgivings about how we're supposed to "know that the refactored code is 'better'." It wouldn't be the code that's better but the design, and of course that is a matter best debated at the design level.
What then is the benefit of "refactoring" tools, since indeed many only pertain to code and not to design? Well, to belabor the engine analogy, a boring machine would be in a comparable sense an "engine refactoring" tool. It doesn't actually do any refactoring itself, but it allows you a way to implement the design changes which are the result of your refactoring efforts.
Definition of refactoring
Your point is well taken. However, this column is about programming tools. My inclination is to be as concrete as I can. Since the title of this column is programming as opposed to design, and tools as in software that I can use to improve my programming experience, I chose to use the much narrower definition.
That said, if you are aware of any refactoring tools for the design portion of a system's life cycle, please let me know. I would be really interested.
Reg. Charney
Definition of refactoring
I think your best bet for that approach would be to look at UML refactoring tools. A good overview can be found in Martin Fowler, Refactoring: Improving the Design of Existing Code.
This is not an endorsement of UML as the basis of refactoring, by the way, because I believe that misses the point. Refactoring is substantially a conceptual exercise. Like optimization, the biggest gains require an understanding of the problem domain. On the other hand, if UML is already part of your process, or if you're willing to make it one, then refactoring at that level would be completely appropriate. It certainly gets you closer to understanding the design issues of a project than if you were to go straight to the code.
Again, you have a valid point
Again, you have a valid point. Do you or do any of our readers know of tools that would work on the UML description of a system?
Reg. Charney
M-x tags-query-replace
enough said :-)
Post new comment