An Introduction to Application Development with Catalyst and Perl

on May 14, 2012

Catalyst is the latest in the evolution of open-source Web development frameworks. Written in modern Perl and inspired by many of the projects that came before it, including Ruby on Rails, Catalyst is elegant, powerful and refined. It's a great choice for creating any Web-based application from the simple to the very complex.

Like many other popular Perl-based projects, Catalyst has a strong focus on flexibility and choice. Catalyst is especially powerful because it provides an abundance of features and the core environment, structure and interfaces on which virtually anything can be built without forcing you to do things in any particular way.

Writing applications in Catalyst is fast too. Just because you can tackle any aspect of application design yourself doesn't mean you have to. Catalyst provides a wide array of refined, high-level, drop-in solutions to all kinds of problems and needs without limiting access to the nuts and bolts. Templating, ORM, authentication, automatic session management and all the other high-level features you'd want from a Web framework are available in Catalyst, and more.

Catalyst's approach is to provide these high-level features as optional plugins and modules. This is one of the greatest strengths of Perl—a tremendous number of refined modules and libraries are available. So, instead of re-inventing all this functionality, Catalyst provides a framework to bring together seamlessly what already exists.

Catalyst is bigger than itself: it is also everything that's available in CPAN. That alone makes it one of the most feature-rich frameworks available.

In this article, I provide an introduction to Catalyst and how to use it for rapid application development. I cover the basics of how to create and lay out a new application as well as how to write the actions that will handle requests. I explain how to define flexible URL dispatch logic and some of the APIs that are available. I focus on the fundamentals, but I cover some of the popular available components as well, such as Template::Toolkit. I also talk about how you can extend Catalyst itself, and how you can deploy an application with Apache.

Background Knowledge and the MVC Architecture

Catalyst and Catalyst applications are written in Perl, so some basic Perl knowledge is necessary to use Catalyst effectively. You also should have some experience with object-oriented programming concepts, such as classes, methods, inheritance and so on.

Like Rails, Django, CakePHP and many other Web frameworks, Catalyst follows the venerable Model-View-Controller architectural pattern. MVC is a proven approach to structuring and segmenting application code for efficiency, flexibility and maintainability.

Plenty of tutorials and resources are available for MVC, so I won't spend too much time covering it here. If you've worked with other Web frameworks, chances are you're already familiar with MVC. If not, the most important thing to understand is that it is more about best practices than anything else.

The focus of this article is to explain the core details of how Catalyst operates, but since Catalyst made most of its layout decisions according to MVC, you'll still see it along the way.

Getting Catalyst

Before you can install Catalyst on your system, you obviously need Perl. Most Linux distros already have Perl installed out of the box, but if not, install it with your package manager.

Catalyst itself is a Perl library that you can install with cpan:


cpan Catalyst::Devel

The previous command installs Catalyst with development tools along with its many dependencies. For production/hosting systems that will run only applications without the need for development tools, you can install the smaller Catalyst::Runtime bundle instead.

Because Catalyst has so many dependencies, it can take quite a while to install on a fresh system. By default, CPAN asks if it should install each dependency individually, which can become tedious really quickly. You can configure CPAN not to ask, but instead, I usually just cheat by holding down Enter for a few seconds to queue up a bunch of default ("yes, install the module!") keystrokes/answers.

If the install fails on the first attempt, don't fret. Whatever the problem may be, it probably will be explained in the scrollback along with what to do to solve it. Typically, this involves nothing more than installing/upgrading another module that wasn't automatically in the dependency tree for whatever reason, or just running the cpan command a second time.

Catalyst Application Layout

Every Catalyst application is a Perl module/library/bundle—exactly like the modules on CPAN. This consists of a package/class namespace and standard structure of files and directories. The Catalyst::Devel package comes with a helper script to create new "skeleton" applications and to initialize the files and directories for you. For example, to create a new application called KillerApp, run the following:


catalyst.pl KillerApp

This creates a new application structure at KillerApp/ with the following subdirectories:

lib/: this is the Perl include directory that stores all the Perl classes (aka packages or modules) for the application. This is added to the Perl lib path at runtime, and the directory structure corresponds to the package/class namespaces. For example, the two classes that are initially created have the following namespaces and corresponding file paths:

KillerApp — lib/KillerApp.pm
KillerApp::Controller::Root — lib/KillerApp/Controller/Root.pm

These directories also are created but initially are empty:

lib/KillerApp/Model/
lib/KillerApp/View/

root/: this is where other kinds of application-specific files are stored. Static Web files, such as images, CSS and JavaScript, go in the subdirectory static, which usually is exposed as the URL /static. Other kinds of files go in here too, such as templates.

script/: this contains application-specific scripts, including the development server (killerapp_server.pl) that you can use to run the application in its own standalone Web server, as well as scripts to deploy the application in a "real" Web server. The helper script killerapp_create.pl creates new model, view and controller component classes.

t/: this is where "tests" go. If you follow a test-driven development process, for every new feature you write, you also will write an automated test case. Tests let you quickly catch regressions that may be introduced in the future. Writing them is a good habit to get into, but that's beyond the scope of this article.
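As a small illustration of what lives in t/, here is a minimal sketch of a test file, assuming the KillerApp skeleton created above. The filename t/front_page.t is hypothetical; Catalyst::Test and Test::More are the real modules involved. It loads the application in-process and checks that the front page responds successfully:

```perl
# t/front_page.t (hypothetical filename)
use strict;
use warnings;
use Test::More;

# Load the application in-process; no running server is needed.
use Catalyst::Test 'KillerApp';

# request() returns an HTTP::Response object for the given path.
ok( request('/')->is_success, 'front page request should succeed' );

done_testing();
```

You would typically run it from the application directory with prove -l t/.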

The created skeleton application is already fully functional, and you can run it using the built-in test server:


cd KillerApp/
script/killerapp_server.pl

This fires up the app in its own dedicated Web server on port 3000. Open http://localhost:3000/ to see the default front page, which initially displays the Catalyst welcome message.

The Request/Response Cycle

All Web applications handle requests and generate responses. The fundamental job of any Web framework/platform/environment is to provide a useful structure to manage this process. Although there are different ways of going about this—from elegant MVC applications to ugly, monolithic CGI scripts—ultimately, they're all doing the same basic thing:

Decide what to call when a request comes in.

Supply an API for generating the response.

In Catalyst, this happens in special methods called "actions". On every request, Catalyst identifies one or more actions and calls them with special arguments, including a reference to the "context" object that provides a convenient and practical API through which everything else is accomplished.

Actions are contained within classes called "controllers", which live in a special path/namespace in the application (lib/KillerApp/Controller/). The skeleton application sets up one controller ("Root"), but you can create more with the helper script. For example, this creates a new controller class KillerApp::Controller::Something:


script/killerapp_create.pl controller Something

The only reason to have more than one controller is for organization; you can put all your actions in the Root controller with no loss of features or ability. Controllers are just the containers for actions.
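For orientation, a controller class is an ordinary Perl package that extends Catalyst::Controller. The following is a sketch of roughly what such a class looks like; the exact text the helper generates varies between Catalyst versions, and the response text here is only a placeholder:

```perl
package KillerApp::Controller::Something;
use Moose;
use namespace::autoclean;

# The parent class must be set at compile time so that the
# action attributes (:Path, :Args and so on) are understood.
BEGIN { extends 'Catalyst::Controller' }

# A single action; how it maps to a URL is covered under Dispatch.
sub index :Path :Args(0) {
    my ( $self, $c ) = @_;
    $c->response->body('Hello from the Something controller');
}

__PACKAGE__->meta->make_immutable;

1;
```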

In the following sections, I describe how Catalyst decides which actions to call on each request ("dispatch") and then explain how to use the supplied context object within them.

Dispatch

Catalyst provides a particularly flexible and powerful mechanism for configuring dispatch rules. Rather than having a separate configuration to assign URLs to specific actions, Catalyst uses the actions themselves to determine URL mappings dynamically.

Each action definition (which is just a Perl subroutine) represents not only a block of code, but also what URL paths apply to it. This is specified in subroutine attributes—a lesser-known Perl feature that provides arbitrary labels that can be used for introspection.
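Because subroutine attributes are unfamiliar to many Perl programmers, here is a small standalone sketch of the underlying language feature, with no Catalyst involved. The package name Demo::Controller and the %attrs_for hash are made up for illustration, and this is not how Catalyst itself stores attributes; it only shows that attribute strings are handed to the package at compile time and can be read back later:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

package Demo::Controller;
use Scalar::Util qw(refaddr);

# Hypothetical storage: code ref address => list of attribute strings
our %attrs_for;

# Perl calls this hook at compile time for each sub in this package
# that is declared with attributes. Returning an empty list means
# every attribute was accepted.
sub MODIFY_CODE_ATTRIBUTES {
    my ( $class, $code, @attrs ) = @_;
    $attrs_for{ refaddr $code } = [@attrs];
    return;
}

# attributes::get() calls this hook to retrieve package-specific attributes.
sub FETCH_CODE_ATTRIBUTES {
    my ( $class, $code ) = @_;
    return @{ $attrs_for{ refaddr $code } || [] };
}

# The attributes are just labels; they do not change what the sub does.
sub place :Local :Args(0) { return 'place called' }

package main;
use attributes ();

# Introspect the labels attached to the subroutine.
my @found = attributes::get( \&Demo::Controller::place );
print "$_\n" for @found;
```

A framework can walk its controller classes in the same way at startup and build its URL table from whatever labels it finds.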

Catalyst supports a handful of parameterized attributes to determine the URL path to action mappings in a variety of ways. For example, the following action has an absolute path set using the :Path attribute:


sub myaction :Path('/some/place') {
        my ( $self, $c, @args ) = @_;
        # do stuff...
}

Regardless of what controller you put it in, the above action would map to all URLs starting with /some/place (http://localhost:3000/some/place with the development server).

If you omitted the starting slash and used :Path('some/place'), the action would map to a path relative to the namespace of the controller. For example, if it were in KillerApp::Controller::Foobar, it would be mapped to URL paths starting with /foobar/some/place.

Instead of using :Path to set the path explicitly, you can set :Local to use the name of the controller and method. For instance, the following action, if contained in the controller KillerApp::Controller::Some, would also map to /some/place:


sub place :Local {
        my ( $self, $c, @args ) = @_;
        # do stuff...
}

If it were contained in the controller KillerApp::Controller::Some::Other, it would map to /some/other/place.

Actions include subpaths by default, so the above also would match /some/other/place/blah/foo/1. When this happens, the leftover parts of the path are supplied as arguments to the action method ('blah','foo','1'). You can use the :Args attribute to limit how deep the action will match subpaths, if at all. With an Args value of 0, this action would match only /some/place, but nothing below it:


sub myaction :Path('/some/place') :Args(0) {
        my ( $self, $c ) = @_;
        # do stuff...
}

Other attributes are available too. :Global works like :Local but ignores the controller name, and path pattern matching can be done with :Regex and :LocalRegex.

When a URL matches more than one action, Catalyst picks the one that matches best. However, there are a few built-in actions (method names "begin", "end" and "auto") that, if defined, are called at various stages of every request in addition to the matched action. Using the advanced :Chained attribute type, you can configure additional/multiple actions to be called with single requests in any order you like.
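To make the built-in and chained actions more concrete, here is a minimal sketch. It assumes a hypothetical controller named KillerApp::Controller::Item and a hypothetical /item/<id>/view URL layout. It uses the stash, a per-request hash on the context object, to pass the captured value from one link of the chain to the next:

```perl
package KillerApp::Controller::Item;
use Moose;
use namespace::autoclean;

BEGIN { extends 'Catalyst::Controller' }

# Called before the matched action for requests handled by this
# controller. Returning a false value stops the request from
# reaching the matched action.
sub auto :Private {
    my ( $self, $c ) = @_;
    $c->log->debug( 'Request for ' . $c->request->path );
    return 1;
}

# First link in the chain: matches /item/<id> and captures <id>.
sub base :Chained('/') :PathPart('item') :CaptureArgs(1) {
    my ( $self, $c, $id ) = @_;
    $c->stash->{item_id} = $id;
}

# End of the chain: matches /item/<id>/view with no further arguments.
sub view :Chained('base') :PathPart('view') :Args(0) {
    my ( $self, $c ) = @_;
    $c->response->content_type('text/plain');
    $c->response->body( 'Viewing item ' . $c->stash->{item_id} );
}

__PACKAGE__->meta->make_immutable;

1;
```

With this in place, a request for /item/42/view would run auto, then base (with 42 as its captured argument) and finally view.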

You also can programmatically dispatch to other action/paths from within the action code itself:


sub myaction :Path('/some/place') {
        my ( $self, $c, @args ) = @_;
        $c->forward('/some/other/place');
}

The Context Object ($c)

Controller actions serve as entry points to application code. A special per-request object called the "context" is supplied as an argument to every action when it is called by the dispatcher. The context object typically is read into a variable named $c, but it could be called anything.

The context provides interfaces and information about the application and its current state. It contains the details of the request currently being processed ($c->request) and access to what will become the response ($c->response).

At the beginning of the request, before any actions are called, the response object is created with empty/default data. Each of the actions that are called then has an opportunity to manipulate the response. At the end of the request, its final state is sent back to the client. This iterative approach to generating the response lends itself to a modular and dynamic structure.

The following action illustrates a few of the simple APIs that are available, such as inspecting the User-Agent and query/post parameters in the request, and setting the body and headers of the response:


sub myaction :Path('/some/place') {
     my ( $self, $c, @args ) = @_;

     # Query-string and POST parameters are merged into one hashref
     my $myparam = $c->request->params->{myparam};

     if (defined $myparam) {
          $c->response->body("myparam is $myparam");
     }
     else {
          $c->response->body("myparam was not supplied!!");
     }

     # Append any leftover path segments to the body set above
     $c->response->body(
          $c->response->body .
          "\n\nExtra path args: " . join('/', @args)
     ) if (@args > 0);

     $c->response->headers->header( 'Content-Type' => 'text/plain' );

     # user_agent is undefined when the client sends no User-Agent header
     $c->response->body("Bad command or file name")
          if (($c->request->user_agent || '') =~ /MSIE/);
}

Accessing the URL http://localhost:3000/some/place/boo/baz?myparam=foo would display the text that follows (except when using IE, in which case "Bad command or file name" is displayed instead):


myparam is foo

Extra path args: boo/baz

Within the action code, you can write any logic you like to build your application. Because the context object is just a variable, you can pass it as an argument into other functions. Following normal Perl programming rules, you can use other classes and libraries, instantiate objects and so on.

This is the extent of what you have to do—write controller actions and use the context object—but it's only the beginning of what you can do.

Catalyst Components

Over and above the core functionality, Catalyst provides a robust MVC structure to build your application. This includes useful base classes, sensible default behaviors and helpful sugar functions. You can leverage a tremendous amount of turnkey functionality by creating your classes within the supplied framework as models, views and controllers.

All these are considered "components" within Catalyst, and there aren't actually many functional differences between them. The Model-View-Controller monikers primarily are used for the purpose of categorization. Models are meant to contain data and business logic; views are supposed to handle rendering and display; and controllers tie everything together.

Operationally, components are essentially application classes with some extra, Catalyst-specific functionality. They are loaded automatically at startup as static object instances. Any component can be accessed throughout the application via the context object:


sub myaction :Path('/some/place') {
        my ( $self, $c, @args ) = @_;
        $c->model('MyModel')->do_something;
        $c->forward( $c->view('MyView') );
}

In the above example action, you simply call the method do_something in a model named MyModel (KillerApp::Model::MyModel), and then forward to the view named MyView (KillerApp::View::MyView).

Earlier, I showed how to use forward to dispatch to another action by supplying a path. When you pass a component to forward, the process method of the supplied component is called as if it were a controller action, which is roughly equivalent to this:


$c->view('MyView')->process($c,@args);

These are just a few examples of the available conventions and shortcuts. The important thing to understand is that all these sugar functions just boil down to calling methods and normal program flow.

Template::Toolkit Views

One of the most common needs of a Web application is a templating system for rendering content. Templates probably are the best all-around approach to rendering text-based content, especially for markup like HTML.

Catalyst can use multiple Perl template systems, but the most popular is Template::Toolkit, a powerful, general-purpose template-processing system that is fast and feature-rich. It has a versatile and robust syntax that is simple and easy to use, but it also supports advanced capabilities, such as control structures, stream processing and extensibility. Template::Toolkit is a whole programming language in its own right.

Catalyst provides a drop-in interface to Template::Toolkit via the view/component class Catalyst::View::TT. You can create a view within your application that extends this class using the helper script. Run this to create a new view named "HTML":


script/killerapp_create.pl view HTML TT

The new view is fully functional out of the box. As a general-purpose wrapper around Template::Toolkit, it provides a simple API to select templates and supply input data. The rest of the view-specific code goes in the templates themselves, which are stored within "root" in your application directory.

Here is an example of a simple Template::Toolkit template to render an HTML page:


<html><head>
<title>[% title %]</title>
</head><body>
<h1>Message: [% message %]</h1>
</body>
</html>

The character sequences within [% %] are "directives"—snippets of code that get replaced when the template is processed. The directives above are simple variable substitutions, which are the most basic kind. In this case, the values supplied for title and message will be inserted when the template is rendered.

If you saved the above template in a file named main.tt within root/templates, for example, you could use it with an action like this:


sub myaction :Path('/some/place') :Args(0) {
        my ( $self, $c ) = @_;
	
        $c->stash->{template} = 'templates/main.tt';
        $c->stash->{title} = 'TT rendered page';
        $c->stash->{message} = 'A cool message!';
	
        $c->forward( $c->view('HTML') );
}

The stash object above is another built-in feature of Catalyst that I haven't covered so far. It isn't very complicated; it's simply a hashref within the context object. It provides a standard per-request place to share data across components, similar to request and response, but for general use.

Catalyst::View::TT-based views use the content of the stash to determine operation. The value of template identifies the template to call, and the stash as a whole is used as the input data—each key in the stash becomes a variable in the template. The content generated from processing the template is used to set the body of the response.

The data in a real application probably will be more complex than the simple key/values in the previous example. One of Template::Toolkit's great features is its ability to handle Perl data structures directly. Consider the following action:


sub myaction :Path('/some/place') :Args(0) {
        my ( $self, $c ) = @_;
	
        $c->stash->{template} = 'templates/main.tt';
	
        $c->stash->{data} = {
                title => 'TT rendered page',
                subhash => {
                        alpha => 'foo',
                        bravo => 'abdc',
                        charlie => 'xyz'
                },
                thinglist => [
                        'Thing 1',
                        'Thing 2',
                        'Big thing',
                        'Small thing'
                ]
        };
	
        $c->forward( $c->view('HTML') );
}

This would work with a template like this:


<html><head>
<title>[% data.title %]</title>
</head><body>
<b>Alpha:</b> [% data.subhash.alpha %]<br>
<b>Bravo:</b> [% data.subhash.bravo %]<br>
<b>Charlie:</b> [% data.subhash.charlie %]<br>
<br>
<b>List of Things:</b><br>
[% FOREACH item IN data.thinglist %]
        [% item %]<br>
[% END %]
</body>
</html>

Objects also can be supplied and accessed in the same manner as hashes. In fact, the context object is supplied automatically in "c". For example, if you want to display the client's IP address, instead of separately putting $c->request->address in the stash, you can just access it directly within the template like this:


[% c.request.address %]

Template::Toolkit has many more features and abilities, including wrappers, conditional statements, filters, function calls and so on. Catalyst::View::TT also has additional defaults and configuration options that I didn't cover here (see the documentation for more details).
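To give a feel for conditionals and filters, here is a minimal sketch that drives Template::Toolkit directly from a standalone Perl script, outside Catalyst. The variable name things and the sample data are made up for illustration; IF, FOREACH and the upper and html filters are standard Template::Toolkit features:

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Template;

# An inline template. The "-%]" form also removes the newline after the tag.
my $template = <<'END_TT';
[% IF things.size > 0 -%]
[% FOREACH item IN things -%]
* [% item | upper | html %]
[% END -%]
[% ELSE -%]
No things to show.
[% END -%]
END_TT

my $tt     = Template->new;
my $output = '';

# process() accepts a reference to the template text, a hashref of
# variables and a reference to a scalar that receives the output.
$tt->process( \$template, { things => [ 'salt & pepper', 'tea' ] }, \$output )
    or die $tt->error;

print $output;
```

The html filter escapes characters such as & so the values are safe to place in a page. Inside a Catalyst view, the same directives would simply go in the template file.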

It is totally up to you how to balance logic between the templates and the rest of your application. Depending on what you are trying to achieve, your application easily could be written more in Template::Toolkit than in Perl!

DBIx::Class Models

Another of the most common needs of an application is a database. DBIx::Class (often shortened to DBIC) has emerged as the most popular ORM (Object Relational Mapper) library available for Perl. It is an exceptionally powerful, robust, object-oriented interface to many relational database servers (including MySQL, PostgreSQL, Oracle, MSSQL and many others).

As with Template::Toolkit, but to an even greater degree, Catalyst provides refined, drop-in component wrappers to interface with DBIx::Class (Catalyst::Model::DBIC::Schema).

Using DBIx::Class is a whole topic in and of itself that I don't have space to cover here, but it is a must-have if you plan to integrate your application with a database. See Resources for information on where to go to start learning about this wonderful library.

Plugins and Application-Wide Settings

Besides pre-built component classes for drop-in functionality, many plugins are available to modify the behavior and extend the functionality of Catalyst itself. A few of the most common are the optional authentication, authorization and session plugins.

These plugins provide a consistent API for handling these tasks with a variety of available back ends. Like the core request/response object interfaces, they are available as application-wide features that are accessed and controlled through methods in the context object, which become available once these plugins have been loaded.

You can authenticate a user (within an action handling a login form post, for example) like this:


$c->authenticate({
   username => $c->request->params->{username},
   password => $c->request->params->{password}
});

If this succeeds, the $c->user object is available in subsequent requests to control and direct application flow based on the authenticated user. This is accomplished using sessions (usually cookie-based) that are handled for you automatically. You also have access to $c->session to persist any additional per-session data across requests.
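Putting those pieces together, a login action might look something like the sketch below. It assumes the authentication and session plugins are loaded and configured. The /login path, the logged_in_at session key and the redirect target are all hypothetical:

```perl
sub login :Path('/login') :Args(0) {
    my ( $self, $c ) = @_;

    # authenticate() returns a true value only when the credentials check out
    my $ok = $c->authenticate({
        username => $c->request->params->{username},
        password => $c->request->params->{password},
    });

    if ($ok) {
        # Persist extra per-session data alongside the login
        $c->session->{logged_in_at} = time;
        $c->response->redirect( $c->uri_for('/') );
    }
    else {
        $c->response->status(401);
        $c->response->content_type('text/plain');
        $c->response->body('Login failed');
    }
}
```

Other actions can then test $c->user_exists before deciding whether to serve a request.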

The API of this framework is agnostic to the back end, and many are available. You can handle authentication and user storage via databases (DBIC), system accounts, PAM and LDAP, to name a few. There also are multiple ways to handle session data to support different application needs, such as distributed server deployments and so on. (See the documentation for Catalyst::Plugin::Authentication, Catalyst::Plugin::Authorization and Catalyst::Plugin::Session for more information.)

Plugins and application-wide settings are configured within the main/core class (lib/KillerApp.pm). Within this file, you can specify global configuration parameters, load Plugins and even add your own code to override and extend core functionality.
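As a rough sketch of what this can look like, the following version of lib/KillerApp.pm loads the authentication and session plugins. It assumes those plugin distributions are installed from CPAN. The single in-memory user (demo) and the clear-text password are placeholders for illustration only, and a real application would use a proper user store:

```perl
package KillerApp;
use Moose;
use namespace::autoclean;

use Catalyst::Runtime 5.80;

# Plugins are listed by name, without the Catalyst::Plugin:: prefix.
use Catalyst qw/
    ConfigLoader
    Static::Simple
    Authentication
    Session
    Session::Store::File
    Session::State::Cookie
/;

extends 'Catalyst';

our $VERSION = '0.01';

__PACKAGE__->config(
    name => 'KillerApp',

    # Placeholder realm: one hard-coded user checked against a
    # clear-text password. Not suitable for production use.
    'Plugin::Authentication' => {
        default => {
            credential => {
                class          => 'Password',
                password_field => 'password',
                password_type  => 'clear',
            },
            store => {
                class => 'Minimal',
                users => { demo => { password => 'changeme' } },
            },
        },
    },
);

# Start the application: loads plugins and components.
__PACKAGE__->setup();

1;
```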

The top-level "KillerApp" class actually is the application—it programmatically loads and integrates the other components and classes throughout the system. Like any derived class, its behaviors can be selectively altered from that of its parent class ("Catalyst"). Since it uses the powerful "Moose" object system, in addition to adding and replacing methods, you also can take advantage of additional powerful features like method modifiers and Roles (in fact, Plugins are essentially Moose Roles applied to this class).

Catalyst was written with customization and extensibility in mind. It's structured to allow its functions and behaviors to be modified easily in a fine-grained manner.

For example, you could configure every response to be set with "no-cache" across the whole application simply by adding a method modifier like this to lib/KillerApp.pm:


before 'finalize_headers' => sub {
        my $c = shift;
        $c->response->headers->header( 'Cache-control' => 'no-cache' );
};

Catalyst calls methods with meaningful names (such as 'finalize_headers') throughout the various stages of processing that you are free to hook into or override.

Deploying Your Application

Like most things in Catalyst, many options are available when you're ready to deploy your application to a real Web server. Apache with FastCGI is one of the best choices available, and I briefly cover it below.

If you put your application in /var/www, for example, you can deploy with an Apache virtual host configuration like this:


<VirtualHost *:80>
    ServerName www.example.com
    ServerAdmin webmaster@example.com

    Alias /static/ /var/www/KillerApp/root/static/

    FastCgiServer /var/www/KillerApp/script/killerapp_fastcgi.pl \
        -processes 5 

    Alias / /var/www/KillerApp/script/killerapp_fastcgi.pl/

    ErrorLog /var/www/logs/error_log
    CustomLog /var/www/logs/access_log combined

</VirtualHost>

FastCGI is a proven, language-independent interface for running Web applications. It is essentially plain CGI, except that it keeps programs running in the background instead of starting them anew for every request. That per-request startup cost is the major limitation of CGI that FastCGI overcomes. FastCGI has been around for a long time. The use of this efficient protocol is another example of how Catalyst leverages existing solutions.

FastCGI allows you to specify the number of processes to run (five in the example above), so your application can handle several requests concurrently. Incoming requests are distributed among the processes, which are maintained in a pool.

The alias for /static/ above tells Apache to serve the files in this directory (images, CSS, JavaScript files and so on) directly. This is more efficient than serving these files through the application, which isn't necessary.

Conclusion

This article is meant to provide only a taste of Catalyst and its capabilities. As I hope you have seen, Catalyst is a viable platform for any Web development project. With a flexible design and many available mature features, you can use Catalyst to build robust applications quickly and conveniently.

Catalyst is actively developed and is getting better all the time, as is its ever-growing body of quality documentation. It also has a very active user community with top experts available via IRC.

When you're ready to start writing an application, you should be able to find the information and support you need to hit the ground running. See the Resources for this article for important links and where to get started.

Resources

Catalyst Home Page: http://www.catalystframework.org

Catalyst::Manual: http://search.cpan.org/perldoc?Catalyst::Manual

Template Toolkit Home Page: http://www.template-toolkit.org

DBIx::Class::Manual: http://search.cpan.org/perldoc?DBIx::Class::Manual

Catalyst IRC Channel: #catalyst on http://irc.perl.org

"Moose" by Henry Van Styn, LJ, September 2011: http://www.linuxjournal.com/content/moose