The Grand Unified Desktop
In December 2002, I wrote an article for the Linux Journal web site (www.linuxjournal.com/article/6476) about the new Bluecurve desktop for Red Hat 8.0, which includes programs from both GNOME and KDE. I received a lot of feedback, including many suggestions on how to write desktop applications that would cooperate with applications written using other toolkits and work well in any desktop environment.
An integrated desktop is one whose components work together, wherever they come from, and one that never forces the end user to do the same thing many times. The current default solutions, like GNOME and KDE, often present limits if one tries to mix pieces.
Communication between graphical clients, from simple drag-and-drop to the handling of icons, menus, metadata such as URLs and, in general, window-level interactions, work out of the box only for some subsets of clients and window managers. The same applies to centrally managed and good-looking fonts.
And what about localization? How many programs are still a pain to use with a non-English language or keyboard?
KDE and GNOME are two collections of applications that seem to solve many of these problems. But are toolkit-centric, all-or-nothing approaches the easiest to maintain? What if somebody creates an even better toolkit? What about applications from other toolkits or the whole best-of-breed philosophy so natural among free software users?
In other words, what is the best way to develop that killer application for your favorite desktop, other people's favorite desktops and the desktop of the future?
Re-inventing the wheel may be silly, but it's much less silly than inventing a wheel that needs a new road. Consequently, even if there already are 20 chat clients for Linux, code another one if it feels good; however, make sure that it interoperates with what already exists. Start with common sense, and use the right free standards and protocols. Afterward, check that whatever toolkit or library you choose respects them too. Many top developers already are fully aware that this is the winning approach in the long term. GNOME developer Havoc Pennington wrote, “Interactions between applications and the runtime environment really need to take place via documented, toolkit-independent protocols and file formats.” Let's look at how to do this in practice.
The most interesting thing in the office formats field, but also for many other types of structured data storage, probably is the work of the Oasis Group. The Oasis file formats are being developed starting with the OpenOffice.org ones, so OpenOffice.org developers will have an easier time. But every program, from Emacs to KOffice to the next word processor, can use them.
Along these lines, there's no need to create yet another bookmark format. XBEL (XML Bookmark Exchange Language) is a bookmark interchange format that Galeon and Konqueror already use.
The idea of an interchange format is valid for other configuration files too. UNIX and Linux still are following the toolbox approach: no mammoths, but a lot of little pieces, each doing one thing well. One practical consequence of this is that all the applications doing the same thing will need to know and store the same set of parameters, or very similar ones.
Sticking to the chat client example, the really silly thing would be to have one configuration file for Qt_chat_app, one for GTK_chat_app, one for MyToolKit_chat_app and so on. There is a fundamental difference between a file and the way it is created or updated. Using different editing methods are okay (vi, a control panel, etc.), but in the long term, what really matters is having a single ~/.chatrc file with every existing chat client using it and not complaining if the settings changed from another chat client. Usenet newsreaders already use the standard .newsrc file. Until a standard emerges for your category of application, an acceptable compromise is to have your new program load the settings of other similar clients when it first starts and use them as defaults.
Guessing the MIME type of a file from its name is an action performed looking at a MIME database. This can be unified too, so that every desktop and program is guaranteed to read the same data; freedesktop.org has a draft specification of how to do this.
Many people say that X is showing its age and point to ambitious, far-reaching projects like Fresco. Even without aiming so high, however, a lot can be done to make sure that any mix of clients and window managers behave properly. A starting point for approaching this kind of problem is the freedesktop.org site, especially its standards section.
Remember what we just said about applications of the same type sharing the same set of configuration parameters? We were referring to the core functionality, but the same approach can be used with all GUI-level settings, in all the components of your desktop. Things like background color or the double-click timeout are needed by all programs and still can be configured once and for all by each user, regardless of the particular mix of toolkits he or she needs. The X Resource Manager was not designed to be changed interactively and merges what is written in ~/.Xdefaults with data from other sources, so it is not the right back end for a GUI configuration tool. Go for the Xsetting specification instead, which is being designed to solve this and other problems.
Let's go one step further. What comes after the internal configuration of each class of applications and the way they appear and react to local user actions? The graphical interaction with other applications and the window manager.
Dragging and dropping text from any window to another one already is possible, theoretically at least, if all interested parties follow the XDND protocol. This currently is supported not only by GTK+ and Qt, but also by script-oriented toolkits like Tk and Perl/Tk and by relative outsiders like FOX.
Figure 1 shows how XDND works. When an application must send something to another one, it issues the proper toolkit call. The toolkit sends everything, through the XDND protocol, to the other toolkit, and the answer is passed back to the program in the same way.
The source must declare all the MIME types it supports to its toolkit. These can be text in many formats, images or filenames. The target must declare which formats it recognizes inside that list. In this way, formatted text can be passed between two word processors without losing font, color and so on. At the same time, passing a paragraph from, say, KOffice to vi will lose formatting but keep the text intact.
When drag-and-drop doesn't work, the problem is almost always inside the application's XS, which didn't declare all the MIME-types, used only custom ones or misused either the toolkit or the protocol in some similar way. Practically speaking, it means that bug reports must be filed against the applications, not the toolkits or XDND.
Other interactions between X clients, or between clients and the window manager, must follow the ICCCM (Inter-Client Communication Conventions Manual) and the Extended Window Manager Hints, EWMH for short, formerly known as the NetWM specification. It should be noted that even the good old X clipboard is fully described in the ICCCM. Detailed explanations of how to handle this are provided by Keith Packard and Jamie Zawinski.
The second window interaction standard works on top of ICCCM. It deals with all the window management features not specified in its predecessor, because they started to appear after it. EWMH originally was meant to be a replacement of what was then the GNOME window manager specification, but it is designed correctly in that it can be implemented and supported in any desktop environment. GNOME 2 and KDE 3 both support EWMH, so any application conforming to it can be expected to play nicely inside both environments or with any of their components. Developers willing to have a more precise, yet quick feel of what EWMH is supposed to do may want to look at KDE's netwm.h.
Have we covered all that is needed to build a Grand Unified Desktop? Certainly not. Consider Red Hat 8.0. Almost everybody, including those who hate the idea of a GNOME/KDE hybrid, agreed that at least the fonts look much better than before. The way to duplicate this particular bit of magic on your distribution of choice is well known.
First, the improvement is due to anti-aliasing, which makes characters smoother by adding pixels in the proper places. Figures 2 and 3 show the same text in a standard xterm and in an anti-aliased one.
Second, Red Hat 8.0 is the first mainstream distribution to have used as much UTF-8 encoding as possible (more on this later) and the new client-side font-management system implemented by the xft2 and fontconfig libraries.
Basically, fontconfig can figure out what fonts are available in the system and which is best for each document. Once this is known, xft2 can tell the X server what to draw. Both libraries need to interact with FreeType or an equivalent rasterizing engine. In other words, font detection and rendering have been cleanly separated but packaged in a way that any client can embed. This means that:
Eventually, font servers should not be needed anymore.
Installing new fonts, even without the root password, is much easier.
Any application using fontconfig reads all font configurations from the same XML file(s), which can be edited by any front end.
Font management (in any application) can now proceed at the speed of these two libraries, not at the speed of X itself.
Because fontconfig doesn't require xft2 or any other X-related element, it can be embedded in anything that deals with fonts, including print drivers and libraries. libgnomeprint22 is doing exactly that.
Another nice thing about the xft2/fontconfig system is that it is not an all-or-nothing deal. It can coexist peacefully with traditional font servers, which still may be needed to assist older applications. This is what happens with Red Hat 8.0. And, xft2 can talk to both old and new XFree86 servers. The first ones receive long sequences of low-level drawing instructions. The others, compiled with the Render extension, receive faster, more sophisticated commands.
When drawing nice symbols is only half of the work and an application also needs advanced text layout, the tool of choice is Pango, which uses fontconfig. Pango is another tool born inside a more or less monolithic desktop, GNOME, but currently is developed to be usable in any other environment.
One last word about fonts—nicer drawings are cool, but if they were only used for digits and the English alphabet, they wouldn't really be worth the effort, would they? Moving to Eastern European, African or Asian languages, ASCII can't even handle “Hello, World!”
For developers, this means any new application must be coded to deal with multilanguage encodings from day one, and all existing programs must be at least checked to guarantee that they still will work. This is not merely a suggestion. If you think you are safe because you only use ASCII and only write some scripts, think again. The next time you upgrade and your code is catapulted into an internationalized shell, terminal or window manager, it will break. My Perl one-liner for random e-mail signatures stopped working for this very reason on Red Hat 8.0, and in almost three months, no one on three different lists could suggest a solution.
The character encoding that is becoming the standard on Linux is Unicode UTF-8 [see “Unicode” by Reuven M. Lerner in the March 2003 issue of LJ]. The good news is that it can represent all characters existing on the planet, so no other encodings are necessary. Figure 4 shows Emacs, as packaged in Red Hat 8.0, dealing with all kinds of symbols and characters. The bad news is that because non-ASCII characters take more than 8 bits and sometimes much more space on screen, many deeply ingrained dogmas, starting with the equation “1 character = 1 byte”, simply disappear. The right resources to deal with this, apart from the Linux Unicode HOWTO, are the mini-guide to “UTF-8 soft and hard conversion” and the UTF8_STRING mechanism that aims to preserve the X cut-and-paste system in a UTF-8 world. At a higher level, programming for internationalization, regardless of the operating system, is the goal of Openi18n.org.
What is left to achieve a nice GUI? Menus and icons. A desktop entry standard that describes, regardless of the environment, how to build menus, how to launch each application and so on, already exists at freedesktop.org. It does have some limitations, namely the lack of a common place where .desktop files should be put and the hardwiring of menus, coming from the fact that they simply mirror how these files are located on disk. A virtual folder extension to the standard is being written to overcome these limits. Another specification with a similar scope is available to standardize icon locations and theme selection.
Yes, if you distribute the source, everyone can compile and install your program, but why make it hard for others to figure out why the program can't find which libraries are installed? Why hard code things so they will work only on one distribution? The Linux filesystem hierarchy from the LSB group is your friend here, whatever application you plan to write.
I have nothing against pure KDE or pure GNOME. I only hope that the next generation of desktop applications will make it easier for everyone to build his or her own environment from any combination of programs without sacrificing real functionality and performance. The methods and tools described here are a good way to build such applications, and I am grateful to their developers. Many thanks also to Havoc Pennington, Keith Packard, the members of the kde-devel list and everyone who answered on the linuxjournal.com web site to help me in writing this article.
email: [email protected]
Marco Fioretti is a hardware systems engineer interested in free software both as an EDA platform and (as the current leader of the RULE Project) as an efficient desktop. Marco lives with his family in Rome, Italy.