The Arrival of NX, Part 2

How X works.

This is the second in a seven-part series written by FreeNX Development
Team member Kurt Pfeifle about his involvement with NX technology. Along
the way, he gives some basic insight into the inner workings of NX and
FreeNX while outlining its future roadmap. Much of what Kurt describes
here can be reproduced and verified with one or two recent Knoppix CDs,
version 3.6 or later. A working FreeNX Server setup and the NoMachine NX
Client have been included in Knoppix for over a year now. Part 1 of the
series, "How I Came to Know NX", is

How important is roundtrip suppression for remote GUI work? To understand
its significance, we first have to grasp a few basic mechanics of the X
Window System.

X Basics

The X protocol regulates communication between an X server and an X
client. The X client typically is a program that needs a GUI
to facilitate user interaction. The X server
is a specialized program that "draws" that GUI and the
GUIs of any other running program onto the screen. Moreover, the X
server also handles keyboard and mouse events issued by the user and
sends them back to the X client program, which then acts on the
user's commands.

Figure 1. The NX login for a remote Windows

If an X client program needs to draw something on screen--such as a new
dialog window--it issues a series of requests to the X server. About 160
different types of X requests, including extensions, are specified in
the X protocol. A single request describes, for example, one primitive
graphic element, so a possibly large set of requests is required to
create any specific window element or complete window. Each request
type is identified by a numeric code, which is why requests sometimes
are called opcodes. If you are curious, they are listed as C
definitions at the end of the header file named Xproto.h. You can find
it on your own hard disk if you are
running an XFree86- or X server and have its source code
header files installed. On my Knoppix-4.0 system it is in
A few of the requests sent by the X client also solicit replies from
the X server. Each request, made by the X client program, and its
reply, from the X server, constitute a roundtrip. Roundtrips slow down
the responsiveness of a GUI program because of the time it takes for
the requests to complete the two-way trip. Often, a user does not notice
X roundtripping, because in most everyday use of Linux or UNIX
the X client and the X server reside on the same machine; that is, they
are physically proximate. This is the simplest and also the most
common use of the X Window System.

This need not be so, however. The X client program and the X server may reside on
different host machines, physically distant from each other. They even may
be many thousands of miles apart. The X protocol does not care: as we
say, it is "network transparent". A local X server can display the GUI
output of a remotely running program to the local user's screen. It
also can send the local user's mouse and keyboard commands to the remote X
client application far away.

Try it. To do so, you need to have a user account on a remote Linux or UNIX
machine. Run this command:

ssh -X your_username@remote_hostname xterm 

After some delay, this should make a new xterm command
window appear on your screen. It may look exactly like your local xterm.
The only difference may be that the shell prompt shows a different
username and host. If it doesn't work for you, this could be because
SSH on the remote machine is set up to disallow X forwarding. However,
such a detail is beyond the scope of this article.

When an application starts and its first window is displayed on the
screen, the totality of X client/server roundtrips may amount to many
thousands. So be forewarned--the remote application displayed on your
local screen may feel rather sluggish.

In the all-local case--where the X server and X client program reside
on the same host machine--these roundtrips do not take too long to
complete. The communication between the two goes through UNIX domain
sockets, close relatives of named pipes: special files in the filesystem
that serve interprocess communication on a single machine. The many
roundtrips taking place in the all-local case, therefore, are reasonably fast.

In the remote case--where the X server program is on localhost and the X client
program is on a different host--all interprocess communication is
transported through TCP/IP network sockets and the remote network
connection. This works well, but it is several orders of magnitude
slower than the all-local case.
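
To get a feeling for how cheap a roundtrip is in the all-local case, here is a minimal Python sketch. It is not X code: it only mimics the request/reply pattern over a UNIX domain socket pair on a Linux or UNIX machine. The function names, the 4-byte messages and the count of 1,000 exchanges are assumptions made for illustration.

```python
import socket
import threading
import time


def echo_server(conn, count):
    """Stand-in for the X server: answer each 4-byte request with a 4-byte reply."""
    for _ in range(count):
        data = conn.recv(4, socket.MSG_WAITALL)
        if len(data) < 4:
            break
        conn.sendall(b"done")


def measure_roundtrips(count=1000):
    """Return the mean duration, in milliseconds, of one request/reply pair."""
    # socketpair() creates two connected UNIX domain sockets on Linux/UNIX.
    client, server = socket.socketpair()
    worker = threading.Thread(target=echo_server, args=(server, count))
    worker.start()
    start = time.perf_counter()
    for _ in range(count):
        client.sendall(b"draw")              # the request
        client.recv(4, socket.MSG_WAITALL)   # block until the reply is back
    elapsed = time.perf_counter() - start
    worker.join()
    client.close()
    server.close()
    return elapsed / count * 1000.0


if __name__ == "__main__":
    print(f"mean local roundtrip: {measure_roundtrips():.4f} ms")
```

The client cannot send its next request before the previous reply has arrived, so the total run time grows in step with the number of roundtrips. The absolute figure depends on your machine.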


+----------+                                                       +----------+
|          |      --> responses                   <-- requests     |          |
|          |        --> events                                     | remote X |
|          |   X      --> errors                               X   |applicat. |
| local X  | <---------------------------------------------------> |(or compl.|
|  display |     many "round trips": request + response pairs)     |KDE/GNOME |
|(X server)|                                                       | session) |
|          |                                                       |          |
+----------+                                                       +----------+

(c) Kurt Pfeifle, Danka Deutschland GmbH <kpfeifle at danka dot de>

Although communication between the X client and X server running on the same
machine is handled by way of UNIX domain sockets, the exact same communication
between a remote X client and a local X server is done over network (TCP)
sockets. If this were the only difference, this alone would
cause a large gap in performance. But there is an additional performance
retardant: network latency.

Link Latency

Any link's quality basically is determined by two parameters, the
network's bandwidth and its latency. Bandwidth describes how many bytes
per second can be shoveled into the pipe. The rate of bytes per second
pouring out at the other end should be the same. The network's latency
describes how much time each packet of data needs to travel from one end
to the other.

Typically, a modem link has a latency of 200 to 500 milliseconds. An
ADSL link exhibits a latency of approximately 50 milliseconds. A local
Ethernet LAN link's latency, though, is less than 1 millisecond. A UNIX
domain socket link, such as the internal link within the machine of the
all-local example above, has a latency well below 0.1 milliseconds.

Figure 2. The NX client, while connecting to a Windows XP
machine, encounters the Windows login screen.

You can test the latency of any network link with the ping command.
The ping command shows roundtrip time in your terminal window. If you
are a distance of 4,000 kilometers away from your peer--say, the
distance from California to Massachusetts--your ping can't be faster
than about 44 milliseconds. The speed of light is 300,000 kilometers per
second in a vacuum, and it is approximately 40% slower when traveling
through fibre.
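
The 44-millisecond figure can be checked with a few lines of Python. This is a sketch that uses only the numbers quoted above; the function name and constants are illustrative, and it ignores routers, switches and any detours the cable takes.

```python
SPEED_OF_LIGHT_KM_S = 300_000   # in a vacuum, as quoted above
FIBRE_SLOWDOWN = 0.40           # "approximately 40% slower" in fibre


def min_ping_ms(distance_km):
    """Lower bound for a ping roundtrip over fibre, in milliseconds."""
    fibre_speed_km_s = SPEED_OF_LIGHT_KM_S * (1 - FIBRE_SLOWDOWN)
    # The signal has to travel the distance twice: there and back.
    return 2 * distance_km / fibre_speed_km_s * 1000


print(f"{min_ping_ms(4000):.1f} ms")  # prints "44.4 ms"
```

Any real ping is slower than this bound, because the packets also spend time in network equipment along the way.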

If you send one large chunk of data that takes, say, 60 seconds to
complete its one-way trip, you are likely not to care much about the
latency of the link, even if it adds as much as one second to the total
transfer time. Reducing latency by 99.9%, say from 1,000 milliseconds
to one millisecond, does not reduce your overall transfer time
significantly. Doubling your bandwidth would
help a lot more. Doing so would reduce the required time for your data to
be shoveled into the pipe to 30 seconds, and the receiving end would
acknowledge a completed transfer after 31 seconds.
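
The arithmetic behind this paragraph fits into one line: transfer time is the time needed to push the data into the pipe plus the latency. The Python sketch below uses a hypothetical payload and bandwidth, chosen only so that pushing the data takes the 60 seconds mentioned above.

```python
def bulk_transfer_s(size_bytes, bandwidth_bytes_per_s, latency_s):
    """Time until the last byte of one large chunk arrives at the far end."""
    return size_bytes / bandwidth_bytes_per_s + latency_s


SIZE = 6_000_000      # hypothetical payload in bytes
BANDWIDTH = 100_000   # hypothetical link speed: 60 s to push the payload

baseline = bulk_transfer_s(SIZE, BANDWIDTH, 1.0)         # 61 s
low_latency = bulk_transfer_s(SIZE, BANDWIDTH, 0.001)    # 60.001 s
double_bw = bulk_transfer_s(SIZE, 2 * BANDWIDTH, 1.0)    # 31 s

print(f"baseline:          {baseline:.3f} s")
print(f"latency cut 99.9%: {low_latency:.3f} s")
print(f"bandwidth doubled: {double_bw:.3f} s")
```

For this traffic profile the latency term is a small constant, which is why bandwidth dominates.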

The situation is radically different if your data flow has an opposite
profile. If you cannot send one large chunk of data but must do many
little ones, and if you have to wait for responses for most of them,
latency increases its influence on overall performance. Only a few
small data chunks, such as packets, can be sent into the pipe within any
single millisecond. But you may have to wait much longer, on the
order of 500 milliseconds, for confirmation or a
response from the remote end. If a lot of little data chunks
require confirmation--that is, if they cause roundtrips--these
physical facts really start to impact the remote GUI experience.
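
A rough model shows how lopsided this gets. The Python sketch below splits the total time of a "chatty" exchange into the time spent sending and the time spent waiting for replies. The 1,000 exchanges, the 100 bytes per exchange and the 5,000 bytes per second link are made-up values; the 500-millisecond wait is the figure used above.

```python
def chatty_transfer(n_exchanges, bytes_per_exchange,
                    bandwidth_bytes_per_s, roundtrip_s):
    """Return (seconds spent sending, seconds spent waiting for replies)."""
    sending_s = n_exchanges * bytes_per_exchange / bandwidth_bytes_per_s
    waiting_s = n_exchanges * roundtrip_s
    return sending_s, waiting_s


sending, waiting = chatty_transfer(1000, 100, 5000, 0.5)
idle_share = waiting / (sending + waiting)
print(f"sending: {sending:.0f} s, waiting: {waiting:.0f} s")
print(f"share of time the pipe sits idle: {idle_share:.0%}")
```

With these assumed numbers the link carries data for 20 seconds and sits idle for 500.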

About the Verbosity of X11 Programs

In and of itself, X is an efficient protocol, which may sound
surprising at first. However, many GUI programs making use of X are
coded inefficiently. Look at them through the eyes of a user working
remotely over a modem connection, and you can see what I mean.

There are many areas where GUI programs--KDE and GNOME alike--could
be improved to enable them to run faster over the network.
Take a simplified and contrived example as illustration.
Modern desktop eye candy uses a lot of animation. Take a
pull-down menu: often you see it rolling out in an animated fashion.
By the way, I do not share the opinion of some UNIX purists who deem
these kinds of animations "useless" or "superfluous". They can help
users understand the system. But I digress. How is this animation
expressed in X? The X application tells the X server to draw
one or more rows of pixels at a time. How efficiently is this done in
general? There are both inefficient and optimized ways to do it;
the two cases that follow are deliberately simplified.

In the "bad" version, the application says to the X server, "Draw these 10 rows
of pixels and report back when you are done." The X server draws and
reports back. The next request is, "Now draw another 10 rows of pixels and
report back again." The result is roundtrip after roundtrip, until the nice
menu animation is completed.

The "good" version is a bit different. Here the application's request to
the X server translates to this: "Draw this complete series of pixel
rows, one after the other, at a speed of 1 row per millisecond. Report
back when you are all done." This takes only one roundtrip.

The "bad" version of the code may be indistinguishable from the "good"
version as long as the user works only on localhost with his applications.
But when it runs in a remote situation across a real network, the difference
becomes obvious: the "good" version still executes
smoothly, while the "bad" one is slow and looks erratic.
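
The contrast between the two versions can be put into numbers. The Python sketch below is a back-of-the-envelope model, not X code. The menu height of 100 pixel rows and the roundtrip times for localhost and a modem link are assumptions; the drawing speed of 1 row per millisecond is taken from the example above.

```python
import math


def animation_time_s(rows, rows_per_request, roundtrip_s, draw_s_per_row=0.001):
    """Time to roll out a menu when every request waits for a report back."""
    roundtrips = math.ceil(rows / rows_per_request)
    return rows * draw_s_per_row + roundtrips * roundtrip_s


ROWS = 100  # hypothetical menu height in pixel rows

for label, rtt_s in (("localhost", 0.0001), ("modem", 0.5)):
    bad = animation_time_s(ROWS, 10, rtt_s)      # report back every 10 rows
    good = animation_time_s(ROWS, ROWS, rtt_s)   # one request, one report
    print(f"{label:9s} bad: {bad:.3f} s   good: {good:.3f} s")
```

On localhost both versions finish in about a tenth of a second. Over the assumed modem link the "bad" version needs 5.1 seconds and the "good" one 0.6 seconds.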

What's the Problem?

Keith Packard had this to say in his "LBX Postmortem" paper:

X applications have usually been developed in a high-bandwidth/low-latency
environment, either entirely within a single machine or perhaps over
a local area network. Such environments exhibit bandwidth in excess
of 1MB/sec and latencies less than 1ms. Moving applications to serial
lines decreases the bandwidth by more than a factor of 100 and increases
latency by a similar amount.

A developer working in a high-bandwidth/low-latency environment does not
notice the inefficiency his creation shows when it is run at some other
time in a different environment. A software design engineer writing the
required specification document for new software forgets to make
provisions for low-bandwidth/high-latency tests. Over the years, the
whole UNIX and Linux GUI software development environment drifted away
from a paradigm that held network transparency of X in high esteem.

Figure 3. A remote Windows XP session is seen running
within our NX client on Knoppix Linux.

This lax attitude even afflicts toolkits. In many respects, toolkits
have become one of the biggest sources of excessive roundtrips. One
can't blame individual programmers for it, though. A typical KDE or GNOME
developer probably is not aware of the impact his compiled code has
on network performance. Even if he is aware, often he cannot do much
about it. He chose a toolkit to work with, and he is depending on that
toolkit's innate X11 efficiency.

Latency of Links

"But wait", you say. "Hardware gets better and more powerful all the time. Isn't
that coming to our rescue?"

In network computing, bandwidth is not as much of a limiting factor as
is latency. If bandwidth is too low and need is too high, you can add a
cable or two or more. Doing so pushes more data through the wire(s)
within a given period of time. Of course, it costs more money to buy
additional lines, but my point is there is no hard technical limitation.
Additionally, you can hope for increased bandwidth in future networks.

In the case of bandwidth, no hard technical limit is in sight
yet, but the story is different for latency. Latency can be reduced
only down to a certain level. You can't make signals travel faster than
light, not through any medium. Many network connections already are
within 50% of the theoretical optimum--the speed of light--regarding
latency. For latency, the technical limitations are very apparent.

Herein lies the dilemma: increasing bandwidth helps to accelerate remote
X11 connections up to only a certain limit. Once you reach that limit,
adding even more bandwidth doesn't speed up your remote desktop
experience. In the remote desktop context, you don't fill the
capacious wire with little data packets, however many of them there
may be. Any added bandwidth idles. Instead, you spend most of the
time with an empty pipe, waiting for little roundtrips to complete.
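
The plateau is easy to reproduce with a simple model in which every small exchange has to wait for its reply before the next one starts. In the Python sketch below, the 1,000 exchanges, the 100 bytes per exchange, the 50-millisecond roundtrip and the bandwidth steps are all illustrative assumptions.

```python
def session_time_s(n_exchanges, bytes_per_exchange,
                   bandwidth_bytes_per_s, roundtrip_s):
    """Total time when each exchange waits for its reply before the next starts."""
    per_exchange_s = bytes_per_exchange / bandwidth_bytes_per_s + roundtrip_s
    return n_exchanges * per_exchange_s


# The same traffic at four bandwidths, each ten times the previous one.
for bandwidth in (10_000, 100_000, 1_000_000, 10_000_000):
    total = session_time_s(1000, 100, bandwidth, 0.05)
    print(f"{bandwidth:>10,d} bytes/s -> {total:6.2f} s")
```

The totals come out at 60.00, 51.00, 50.10 and 50.01 seconds. The first tenfold increase in bandwidth saves 9 seconds; the next two together save less than one. The 50 seconds spent waiting for roundtrips do not shrink at all.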

Typical modem roundtrips require 500 milliseconds to complete; typical
ISDN roundtrips need about 50 milliseconds. Under these conditions,
elimination of X11 roundtrips is a decisive means to speed up remote
user desktop interaction.

We discuss more about NX roundtrip suppression and traffic compression
in Part 3 of this article series, titled "How (Well) NX Works."

To learn more about FreeNX and witness a real-life workflow
demonstration of a case of remote document creation, printing and
publishing, visit the booth (#2043) at the LinuxWorld Conference & Expo
in San Francisco, August 8-11, 2005. I will be there,
along with other members of collaborating projects.

Kurt Pfeifle is a system specialist and the technical lead of the
Consulting and Training Network Printing group for Danka Deutschland
GmbH, in Stuttgart, Germany. Kurt is known across the Open Source and
Free Software communities of the world as a passionate CUPS evangelist;
his interest in CUPS dates back to its first beta release in June 1999.
He is the author of the KDEPrint Handbook and contributes to the
KDEPrint Web site. Kurt also handles an array of matters for, and wrote
most of the printing documentation for, the Samba Project.


Jack wrote:

I tried NX over a GPRS wireless connection here in Europe but it was too slow to use. GPRS is supposed to have the same bandwidth as an analog modem but its latency may be 1 or more seconds. I did not try any other protocol such as VNC, Citrix or RDP so I certainly won't blame NX. It's just that I wanted to try because I read very good reports about NX.

Good Stuff!

Anonymous wrote:

Thanks Kurt! I'm looking forward to the next article. The information about good and bad X programming for remote X was interesting.

Question: How can I use NX in this situation: I want to connect from my home linux machine to a linux machine at work and remotely run a graphical application, however because of security layers I actually have to make 3 ssh connections (home -> login -> dmz -> work ). All machines are running linux.

Markk wrote:

In your case, I would build some kind of a VPN (for example, using OpenVPN home <-> login <-> dmz <-> work, or work <-> home if you can reach home (public IP) from work directly on some port, e.g., 22, 25, 80, 110, 119...).

Yet another option might be to do some port forwarding.

If you can ping home from work, you can even build a "ping tunnel" :)

As you can see, there are many options.

NX and the Xserver

Jon Smirl wrote:

I don't see you in the xorg lists trying to add this to the Xserver. Is there a reason for this?

How does NX compare to XCB? XCB is probably going into Xorg 7.1. Should NX try for inclusion in 7.1?

Does NX also handle GLX or is it just core X?

Re: NX and the Xserver

Gian Filippo Pinzari wrote:

Hi Jon,

congratulations for your work at I'm downloading the latest RC0
release right now ;-).

The NX developers follow closely the development of and are very
open to any form of cooperation, anyway NX is a client-server system
built "on top" of X, not an "extension" of X so it's unclear what
advantages for both groups would arise from merging the projects. The
NX client-server protocol, for example, playing an important role
before the X connection is established, includes functions to set up
a new session, list the running sessions and the available users,
gather information about the state of the system and many other
functions that go much beyond the scope of the X protocol. The design
of NX also allows for the NX connection to be tunneled by different
protocols. This connection doesn't "speak" the X11 protocol natively.
Undoubtedly many parts of the NX development match the scope of the
X development, like the X protocol compression and the proxying of
the X connections by nxagent, but these developments may converge and
benefit from each other while remaining different projects.
Think of X and the X forwarding built into SSH, or X and the X
toolkits... These systems have a deep interest in X development but
are still developed and maintained by different groups. From a
technical standpoint, NX aims to transport different protocols and
may as well paint the remote display by using SVG or Flash, without
changing very much of the upper layers, so, while it holds true that
we see X as the best candidate to rule the world of the tomorrow's
network computing, I don't think NX should be strictly tied to the
development of the X server.


Xlib is mostly a toolkit problem. nxagent may well use XCB instead
of Xlib in future, but this would change very little for us. Xlib
has interfaces to implement asynchronous X calls already. While they are mostly unexploited by toolkits, we use them extensively to avoid
blocking while waiting for the network.


NX is not limited to the core protocol and already provides
"acceleration" for most extensions, for example RENDER. Obviously
we give priority to extensions that are most important to our users,
as for example the extensions widely used by the toolkits and by the
most important desktop environments and business applications. GLX,
for example, is fully supported in the 1.5.0, though it is not yet "accelerated". Actually there is no reason why GLX should not be accelerated in future ;-).


/Gian Filippo.

renox wrote:

> so it's unclear what advantages for both groups would arise from merging the projects.

For X the advantage is clear: the strength of X is its network transparency but as you've written, this strength is running into trouble due to modern toolkits and developers. A fix is needed.

> NX client-server protocol, for example, [cut] , includes functions to set up a new session, list the running sessions and the available users, gather information about the state of the system and many other
functions that go much beyond the scope of the X protocol.

That's funny, I just read that Jim Gettys at OLS advocated that X should get the notion of users and sessions.

X needs something like NX, and the advantages for X are clear.
Now of course it must be compatible with its license. I haven't been able to find what NX's license is on the website, but somehow I doubt that this is the X11 license, and this means that X developers will probably eventually reinvent the wheel.

The advantage for NX if it were incorporated within X is less clear, of course.

Anonymous wrote:

This is a very well written and informative series--even for the relative neophyte (me). I want to thank the author. There are certain technical terms or jargon ("network sockets," "shovel into the pipe") I would like to see changed (define "sockets;" why not "sent into the wire" rather than invoking plumbing and/or mining idioms that are somewhat foreign to the electronics medium with which you deal?), but overall it makes a very complex subject very approachable. Keep up the good work. The open source world needs more of this sort of writing.