Emacs for Science

I typically cover software packages that do actual calculations to advance scientific knowledge, but here I'm exploring a slightly stranger tool in the arsenal of scientific computation.

Emacs is a text editor that has almost all the functionality of an operating system. A collection of enhancements and configuration settings are available bundled under the name of scimax. Being an Emacs user myself, I was surprised I'd never heard of it before now. This project has been in development for some time, but it recently has started to find wider attention.

Before installing it, however, you need to install the latest version of Emacs to get all of the needed functionality. As with most of my articles, I am assuming that you are using a Debian-based distribution. You can install Emacs by using the daily snapshot package, available at the official launchpad archive. Install it with the following commands:


sudo add-apt-repository ppa:ubuntu-elisp/ppa
sudo update
sudo install emacs-snapshot

This will ensure that you have the latest and greatest version.

Once this is installed, go ahead and install the scimax code itself. You can get it from the main GitHub repository with the command:


git clone https://github.com/jkitchin/scimax.git

Start it with the command:


emacs-snapshot -q -l path/to/scimax/init.el

The first time you do this, there will be a lot of activity while Emacs downloads and installs the full suite of extra packages you need in order for the scimax code to have all of the required dependencies.

When you finally have everything installed and start scimax, you will see several new menu items in your Emacs session.

Figure 1. You will see several new menu item entries at the top of your Emacs window.

The real driving need behind all of the work that has gone into scimax is to make research more easily reproducible and to handle documentation of your research with minimal extra overhead.

Since you want to work with organized documents within Emacs, the base of scimax is built on top of org-mode. Org-mode, by itself, is probably something you will want to look into as a potent tool. Scimax takes this powerful core and makes it easier to use with a series of shortcuts.

Because of org-mode's power, it would be time well spent if you learned at least the basics of how to use the main parts of this package. Org-mode has a markup syntax of its own, and scimax adds a layer of shortcuts that make it easier to write.

Along with the regular org-mode markup syntax, scimax makes it easier to include LaTeX sections for more advanced textual layout. Many people in scientific fields are familiar with LaTeX. For those who aren't, LaTeX is document layout program where you write your text and include layout instructions for the LaTeX engine. The idea is that you separate out the text from the formatting of that text.

If you have graphical images as part of your research, scimax has added some extra functionality to make image rescaling and presentation better than the org-mode defaults by using external programs from the imagemagick package.

Because org-mode is designed to be a document structuring package for Emacs, it allows for exporting your text into a great many other formats. Also, since it separates out the formatting from the actual text, it can be applied to many different document structures, such as articles, books or web pages.

Scimax uses the ox-manuscript Emacs package to handle exporting to high-quality document formats. This is useful when you need to produce final versions of your scientific reports or articles in a format like PDF.

Bibliographic references within your document are handled through bibtex entries.

So far, I've covered a quick overview of the document management, organization and formatting tools that are provided through scimax, but Emacs and org-mode have much more functionality. You can interact with the outside world in a few different ways. The first is through email. You can grab selections of your text, or an entire buffer, and issue an org-mime command within Emacs to tell it to send an HTML-based email. Depending on your system, you may need additional configuration in order for this to work as expected.

The other way to interact with the outside world is through Google searches. As someone who writes a fair bit myself, I cannot understate the need for a Google window to be open to be able to verify some fact or statement as I am writing. With scimax, the google-this Emacs package gets installed and is available as you are working. This allows you to fire up Google searches based on either specific text selections or the contents of entire buffers immediately from the document that you are working on.

Along with communicating with the outside world, the other powerful interaction with external tools is through org-mode's ability to run external programs and have their output inserted into sections of your document. This one piece of functionality makes the dream of reproducible research a real possibility. You do need to be diligent and put it into practice, but you no longer have the excuse of saying that it isn't possible. The idea is that, from within your org-mode document, you can define a block of code that makes some calculation or generates some graph. You then can have org-mode fire this block so that it can be run through an external engine and have the results pulled back in and inserted as the displayed text in the original location.

The default engine configured in scimax is Python, which is definitely a good starting point. With more configuration, you can add support for several other languages. The powerful idea here is that you always can go back to the original code that generated some result or some graph and re-create it. More and more scientific journals are demanding this level of reproducibility, so having it as part of your article contents directly means you never will lose track of it.

The last thing I want to cover is how to organize all of the work that scimax is helping you do. The highest level of organization is the ability to set up projects. A project is essentially a directory with all of the associated files for that given project. These projects are handled by the Emacs projectile package. This package allows you to move between projects, find files within projects or do full searches through a given project.

Projectile assumes that these project directories are under some kind of version control system, such as Git. Luckily, scimax includes the magit Emacs package, which adds lots of extra functions that allow you to interact with the Git repository that the current file belongs to directly from Emacs. You can create or clone repositories, stage and commit changes, manage diffs between versions, and even handle pushes to and pulls from remote repositories. Along with the explicit control over a Git repository, scimax includes extensions to org-mode to handle track changes, as well as to insert edit marks within your org-mode document.

Putting all of this organizational work together, scimax provides the ability to create and use scientific notebooks. A series of commands starting with nb- allow you to wrap all of the organizational functionality to create, manage and archive these notebooks. Now, you have no reason not to start documenting all of your scientific research in a reproducible way—except maybe the learning curve. But, as the old saying goes, nothing worth doing is easy, and I think this is definitely worth doing, at least for some people.

Load Disqus comments