University of Toronto WearComp Linux Project
This paper is part one of a two-part series. In this part I will describe a framework for machine intelligence that arises from the existence of human intelligence in the feedback loop of a computational process.
I will also describe the apparatus of the invention that realizes this form of intelligence, beginning with a historical perspective outlining its visual and photographic origins. The apparatus of this invention, called “WearComp”, emphasizes self-determination and personal empowerment.
I also intend to present the material within a philosophical context I call COSHER (Completely Open Source, Headers, Engineering and Research) that also emphasizes self-determination and mastery over one's own destiny.
This “personal empowerment” aspect of my work is what I believe to be a fundamental issue in operating systems such as Linux. It is this aspect that WearComp and Linux have in common, and it is for this reason that Linux is the selected operating system for WearComp.
An important goal of being COSHER is allowing anyone the option of acquiring, and thus advancing, the world's knowledge base.
I will also introduce a construct called “Humanistic Intelligence” (HI). HI is motivated by the philosophy of science, e.g., open peer review and the ability to construct one's own experimental space. HI provides a new synergy between humans and machines that seeks to involve the human rather than having computers emulate human thought or replace humans. Particular goals of HI are human involvement at the individual level and providing individuals with tools to challenge society's preconceived notions of human-computer relationships. An emphasis in this article is on computational frameworks surrounding “visual intelligence” devices, such as video cameras interfaced to computer systems.
I begin with a statement of what I believe to be a fundamental problem we face in today's society as it pertains to computers and, in particular, to computer program source code and disclosure. Later, I will suggest what I believe to be solutions to this problem. Linux is one solution, together with an outlook based on science and on self-determination and individual empowerment at the personal level.
A first, fundamental problem is that of software hegemony, seamlessness of thought and the building of computer science upon a foundation of secrecy. Advanced computer systems is an area where a single individual can make a tremendous contribution to the advancement of human knowledge, but is often prevented from doing so by various forms of software fascism. A system that excludes any individual from exploring it fully may prevent that individual from “thinking outside the box” (especially when the box is “welded shut”). Such software hegemonies can prevent some individuals from participating in the culture of computer science and the advancement of the state of the art.
A second fundamental problem pertains to some of the new directions in human-computer interaction (HCI). These new directions are characterized by computers everywhere, constantly monitoring our activities and responding intelligently. This is the ubiquitous surveillance paradigm in which keyboards and mice are replaced by cameras and microphones watching us at all times. Perpetrators of this environmental intelligence claim we are being watched for our benefit and that they are making the world a better place for us.
Computers everywhere constantly monitoring our activities and responding intelligently have the potential to make matters worse from the software hegemony perspective, because of the possibility of excluding the individual user from knowledge not only of certain aspects of the computer upon his or her desk, but also of the principle of operation and the function of everyday things. Moreover, the implications of secrecy within the context of these intelligence-gathering functions puts forth a serious threat to personal privacy, solitude and freedom.
Science provides us with ever-changing schools of thought, opinions, ideas and the like, while building upon a foundation of verifiable (and sometimes evolving) truth. The foundations, laws and theories of science, although true by assumption, may at any time be called into question as new experimental results unfold. Thus, when doing an experiment, we may begin by making certain assumptions; at any time, these assumptions may be verified.
In particular, a scientific experiment is a form of investigation that leads wherever the evidence may take us. In many cases, the evidence takes us back to questioning the very assumptions and foundations we had previously taken as truth. In some cases, instead of making a new discovery along the lines anticipated by previous scientists, we learn that another previous discovery was false or inaccurate. Sometimes these are the biggest and most important discoveries—things that are found out by accident.
Any scientific system that tries to anticipate “what 99% of the users of our result will need” may be constructing a thought prison for the other 1% of users who are the very people most likely to advance human knowledge. In many ways, the entire user base is in this thought prison, but many would never know it since their own explorations do not take them to the outermost walls of this thought prison.
Thus, a situation in which one or more of the foundation elements are held in secret is contrary to the principles of science. Although many results in science are treated as a “black box”, for operational simplicity there is always the possibility that the evidence may want to lead us inside that box.
Imagine, for example, conducting an experiment on a chemical reaction between a proprietary solution “A”, mixed with a secret powder “B”, brought to a temperature of 212 degrees T. (Top-secret temperature scale which you are not allowed to convert to other units.) It is hard to imagine where one might publish results of such an experiment, except perhaps in the Journal of Non-Reproducible Results.
Now, it is quite likely that one could make some new discoveries about the chemical reaction between A and B without knowing what A and B are. One might even be able to complete a doctoral dissertation and obtain a Ph.D. for the study of the reaction between A and B (assuming large enough quantities of A and B were available).
Results in computer science that are based, in part, on undisclosed matters inhibit the ability of the scientist to follow the evidence wherever it may lead. Even in a situation where the evidence does not lead inside one of the secret “black boxes”, science conducted in this manner is irresponsible in the sense that another scientist in the future may wish to build upon the result and may, in fact, conduct an experiment that leads backwards as well as forwards. Should the new scientist follow evidence that leads backwards, inside one of these secret black boxes, then the first scientist will have created a foundation contaminated by secrecy. In the interest of academic integrity, better science would result if all the foundations upon which it was built were subject to full examination by any scientist who might, at some time in the future, wish to build upon a given discovery.
Thus, although many computer scientists may work at a high level, there would be great merit in a computational foundation open to examination by others, even if the particular scientist using the computational foundation does not wish to examine it. For example, the designer of a high-level numerical algorithm who uses a computer with a fully disclosed operating system (such as Linux) does other scientists a great service, even if he uses it only at the API level and never intends to look at its source code or that of the Linux operating system underneath it.
Imagine a clock designed so that when the cover was lifted off, all the gears would fly out in different directions, such that a young child could not open up his or her parents' clock and determine how it works. Devices made in this manner would not be good for society, in particular for the growth and development of young engineers and scientists with a natural curiosity about the world around them.
As the boundary between software and hardware blurs, devices are becoming more and more difficult to understand. This difficulty arises in part as a result of deliberate obfuscation by product manufacturers. More and more devices contain general-purpose microprocessors, so that their function depends on software. Specificity of function is achieved through specificity of software rather than specificity of physical form. By manufacturing everyday devices in which only executable code is provided, manufacturers have provided a first level of obfuscation. Furthermore, additional obfuscation tools are often used in order to make the executable task image more difficult to understand. These tools include strippers that remove things such as object link names and even tools for building encrypted executables which contain a dynamic decryption function that generates a narrow sliding window of unencrypted executable, so that only a small fragment of the executable is decrypted at any given time. In this way, not only is the end user deprived of source code, but the executable code itself is encrypted, making it difficult or impossible to look at the code even at the machine-code level.
Moreover, complex programmable logic devices (CPLDs), such as the Alterra 7000 series, often have provisions to permanently destroy the data and address lines leading into a device, so that a single chip device can operate as a finite-state machine yet conceal even its machine-level contents from examination. (See Resources 1 for an excellent tutorial on FPGAs and CPLDs.) Devices such as Clipper chips go a step further by incorporating fluorine atoms, so that if the user attempts to put the device into a milling machine to mill it off layer by layer for examination under an electron microscope, the device will self-destruct in a quite drastic manner. Thus, the Clipper phones could contain a “Trojan horse” or some other kind of back door and we might never be able to determine whether or not this is the case—yet another example of deliberate obfuscation of the operational principles of everyday things.
We have a growing number of general-purpose devices in which the function or purpose depends on software, downloaded code or microcode. Because this code is intellectually encrypted, so is the purpose and function of the device. In this way, manufacturers may provide us with a stated function or purpose, but the actual function or purpose may differ or include extra features of which we are not aware.
A number of researchers have been proposing new computer user interfaces based on environmental sensors. Buxton, who did much of the early pioneering research into intelligent environments (smart rooms, etc.), was inspired by automatic flush urinals (as described, for example, in U.S. Pat. 4309781, 5170514, etc.) and formulated, designed and built a human-computer interaction system called the “Reactive Room” (see Resources 2 and 3). This system consisted of various sensors, including optical sensors (such as video cameras) and processing, so that the room would respond to the user's movement and activity.
Increasingly, we are witnessing the emergence of intelligent highways, smart rooms, smart floors, smart ceilings, smart toilets, smart elevators, smart light switches, etc. However, a typical attribute of these “smart spaces” is that they were designed by someone other than the occupant. Thus, the end user of the space often does not have a full disclosure of the operational characteristics of the sensory apparatus and the flow of intelligence data from the sensory apparatus.
In addition to the intellectual encryption described in the previous section, where manufacturers could make it difficult, or perhaps impossible, for the end user to disassemble such sensory units in order to determine their actual function. There is also the growth of hidden intelligence, in which the user may not even be aware of the sensory apparatus. For example, U.S. Pat. 4309781 (for a urinal flushing device) describes:
... sensor... hidden from view and thus discourage tampering with the sensor... when the body moves away from the viewing area... located such that an adult user of average height will not see it... sensing means, will be behind other components... positioned below the solenoid to allow light in and out. But the solenoid acts in the nature of a hood or canopy to shield the sensing means from the normal line of sight of most users.... Thus most users will not be aware of the sensing means. This will aid in discouraging tampering with the sensing means. A possible alternate arrangement would be to place the sensing means below and behind the inlet pipe.
U.S. Pat. 4998673 describes a viewing window concealed inside the nozzle of a shower head, where a fiber optics system is disclosed as a means of making the sensor remote. The concealment is to prevent users from being aware of its presence. U.S. Pat. 5199639 describes a more advanced system where the beam pattern of the nozzle is adapted to one or more characteristics of the user, while U.S. Pat. 3576277 discloses a similar system based on an array of sensing elements.
A method of creating viewing windows to observe the occupants of a space while at the same time making it difficult for the occupants to know if and when they are observed is proposed in U.S. Pat. 4225881 and U.S. Pat. 5726706.
In addition to concealing the sensory apparatus, a goal of many visual observation systems is to serve the needs of the system architect rather than the occupants. For example, U.S. Pat. 5202666 discloses a system for monitoring employees within a restroom environment, in order to enforce hygiene (washing of hands after using the toilet).
Other forms of intelligence, such as intelligent highways, often have additional unfortunate uses beyond those purported by the installers of the systems. For example, traffic-monitoring cameras were used to round up, detain and execute peaceful protesters in China's Tiananmen Square.
U.S. Pat. 4614968 discloses a system where a video camera is used to detect smoke by virtue of the fact that smoke reduces the contrast of a fixed pattern opposite the video camera. However, the patent notes that the camera can also be used for other functions such as visual surveillance of an area, since only one segment or line of the camera is needed for smoke detection. Again, the camera may thus be justified for one use; additional uses, not disclosed to occupants of the space, may then evolve. U.S. Pat. 5061977 and 4924416 disclose the use of video cameras to monitor crowds and automatically control lighting in response to the absorption of light by the crowds. While this form of environmental intelligence is purportedly for the benefit of the occupants (to provide them with improved lighting), there are obvious other uses.
U.S. Pat. 5387768 discloses the use of visual inspection of users in and around an automated elevator. Again, these provide simple examples of environmental intelligence in which there are other uses, such as security and surveillance. Although even those other uses (security and surveillance) are purportedly for the benefits of the occupants, and it is often even argued that concealing operational aspects of the system from the occupants is also for their benefit, it is an object of this paper to challenge these assumptions and provide an alternate form of intelligence.
When the operational characteristics, function, data flow and even the very existence of sensory apparatus is concealed from the end user, such as behind the grille of a smoke detector, environmental intelligence does not necessarily represent the best form of human-machine relationship for all concerned. Even when the sensors are visible, there must be the constant question as to whether or not the interests of the occupant are identical to those who control the intelligence-gathering infrastructure.
The need for personal space, free from monitoring, has also been recognized (see Resources 4) as essential to a healthy life. As more and more personal space is stolen from us, we may need to be the architects of alternate spaces of our own.
The first solution to these problems is a framework called Completely Open Source, Headers, Engineering, and Research (COSHER). Before investing considerable time in learning how to use new software and in developing works for that new software, which may then become locked into a particular file format, we ask ourselves a very simple question: is the software in question COSHER?
This means that there has been no deliberate attempt at obfuscation of the underlying principles of the operation of this software or in preventing us from freely distributing the intellectual foundations upon which we may invest many years of our lives. Deliberate attempts at obfuscation include such practices as eliminating source code and stripping executable task images.
By using COSHER software, we are making a statement that we prefer Computer Science to Computer Secrecy. Science supports the basic principles of peer review, a continued development and advancement of software principles and principles that we build on top of the software.
Moreover, the time we invest in learning the software as well as creating works in the software will be less likely to go to waste if we have a copy of the complete source code of the software. In this manner, should the software ever become discontinued or unsupported, we will be able to become our own software support group and migrate the software forward to new architectures as our old computers become obsolete. If it is COSHER, chances are we will be less likely to lose the many hours or years we invest in producing works within the software. Furthermore, if we make new discoveries that are built on a foundation of COSHER software, they are easier to distribute.
In science, it is important that others be able to reproduce our results. Imagine what it would be like if we had built our results on top of DOS 3.1. Others would have to either rewrite our software to exactly reproduce our results, or find an old version of DOS 3.1. Since this is proprietary software, we are not at liberty to freely distribute it with our research, but it is also no longer available for purchase. However, if we had built our work on COSHER software such as Linux 1.13, we can include a full distribution of Linux 1.13 in an archive together with our results. Many years in the future, a scientist wishing to reproduce our results could then obtain a virtual machine (emulator for our specific architecture which will no doubt be obsolete by then) and install the COSHER operating system (Linux 1.13) that came with our archive, then compile and run our programs.
The Linux operating system is a good example of a COSHER operating system. GNU software is also COSHER. Many COSHER software packages are available, including GIMP (Gnu Image Manipulation Program) and the VideoOrbits software package (described in http://wearcam.org/orbits/index.html).
I propose a computational framework for individual personal empowerment. This framework is based on my “WearComp” invention—an apparatus for (embodiment of) realization of HI.
This framework involves designing a new kind of personal space. An embodiment of the “WearComp” invention is an apparatus that is owned, operated and controlled by the occupant of that space. In one sense, the apparatus of this invention is like a building built for one occupant and collapsed down around that one occupant.
WearComp as a Basis for HI
I invented WearComp in Canada in the 1970s as a photographic tool for the visual arts (see Resources 5), in particular, something I called “mediated reality” (altered perception of visual reality). The goal of mediated reality, unlike related concepts such as virtual (or augmented) reality, was to reconfigure (augment, deliberately diminish or otherwise alter) the perception of reality in order to attain a heightened awareness of how ordinary, everyday objects respond to light.
HI is a new form of human-computer interaction comprising a computer that is subsumed into the personal space of the user (e.g., the computer may be worn, hence the term “user” and “wearer” of the computer are interchangeable), controlled by the wearer, with both operational and interactional constancy (e.g., it is always on and always ready and accessible [see Resources 6]).
The WearComp invention, described in IEEE Computer, Vol. 30, No. 2 at http://wearcomp.org/ieeecomputer.htm (a historical account was given in IEEE ISWC-97, October 1997 and is also on-line at http://wearcomp.org/historical/index.html) forms the basis for HI. The evolution of the apparatus of this invention is depicted in Figure 1.
A wearable computer is a computer that is subsumed into the personal space of the user, controlled by the user and has both operational and interactional constancy.
Most notably, it is a device that is always with the user and into which the user can always enter commands and execute a set of entered commands while walking around or doing other activities.
The most salient aspect of computers in general (whether wearable or not) is their reconfigurability and their generality, e.g., their function can be made to vary widely, depending on the instructions provided for program execution. This is true for the wearable computer (WearComp). For example, the wearable computer is more than just a wristwatch or regular eyeglasses; it has the full functionality of a computer system and, in addition, is inextricably intertwined with the wearer.
This is what sets the wearable computer apart from other wearable devices such as wristwatches, regular eyeglasses, wearable radios, etc. Unlike these other wearable devices that are not programmable (reconfigurable), the wearable computer is as reconfigurable as the familiar desktop or mainframe computer.
The formal definition of wearable computing defined in terms of its three basic modes of operation and its six fundamental attributes is provided elsewhere in the literature. (See Resources 7.)
Such a computational framework allows one to subsume all of the personal electronics devices one might normally carry, such as cellular phone, pager, wrist watch, heart monitor, camera and video camera into a single device. Obviously, since it is a fully featured computer, it is possible to respond to e-mail, plan events on a calendar, type a report, etc., while walking, standing in line at the bank or anywhere. In this way, WearComp anticipated the later arrival of the so-called “laptop computer”, but has advantages over the laptop in the sense that it can be used while walking around doing other things. However, the real power of WearComp is in its ability to serve as a basis for personal imaging and humanistic intelligence.
WearComp not only subsumes the function of the laptop computer, but goes beyond it. Another area in which WearComp provides a truly new form of user interface not found on laptops and PDAs (personal digital assistants) is in its constancy of user interface and operation. This characteristic may become most evident in its use as a personal security camera. Imagine, perhaps as you walk down some quiet street at night, an assailant appears, demanding cash from you. You would not likely have the time or opportunity to pull out a camcorder to record the experience, but since the eyeglasses are worn constantly, you would have a video record of the experience to aid investigation.
Less extreme examples of WearComp as a new user-interface include the ability to construct a personal documentary video without conscious thought or effort. For example, in a fully mediated reality, all light entering the eyes, in effect, passes through the computer and may therefore be recorded (and possibly transmitted to remote locations). Wearable Wireless Webcam (see Resources 8) is an example of a personal documentary video recorded using a reality mediator.
In the future, we may very well have the capability to capture and recall our own personal experiences and to have photo albums generated automatically for us. We will never miss baby's first steps, because we will have a retroactive record feature that lets us, for example, “begin recording from 5 minutes ago”. Photo albums, in addition to being generated automatically, may also be exhibited while they are being generated. Rather than sending postcards to friends and relatives or showing them an album after you come back from vacation, you may just put on your sunglasses and have the album sent to them automatically, as was done with the Wearable Wireless Webcam experiment in which video was transmitted and still images automatically selected from the video.
While there will no doubt be more environmental intelligence than personal intelligence, there is at least the hope that there might be an end to the drastic imbalance between the two. The individual making a purchase in a department store may have several cameras pointing at him to make sure that if he removed merchandise without payment, there would be evidence of the theft. However, in the future, he will have a means of collecting evidence that he did pay for the item, or a recorded statement from a clerk about the refund policy. More extreme examples such as the case of Latasha Harlins, a customer falsely accused of shoplifting and fatally shot in the back by a shopkeeper as she attempted to walk out of the shop, come to mind.
In this sense, the camera-based reality mediator becomes an equalizer much like the Colt 45 in the “Wild West”. In the WearCam case, it is simply a matter of mutually assured accountability.
Much work remains to be done in development of this project. Currently, I teach Electrical and Computer Engineering (ECE1766) at the University of Toronto. To the best of my knowledge, this is the world's first course on how to be a “cyborg” entity. Students learn not only by doing, but by being. I call this form of learning existential learning. Each student creates a “reconfigured self”--a new form of personal space. Thus, students learn about the concept of personal empowerment from a first-person perspective through personal involvement.
We are writing new protocols for the altered perception of reality (mediated reality) that the WearComp provides. One example is picture-transfer protocol (PTP), in which packets of variable length are transmitted. Each packet is a JPEG compressed picture. Because of image compression, the amount of data varies depending on image content, hence the packet length depends on image content.
The reason for one packet per picture is that pictures are taken 60 times per second, which is much faster than they can be sent. Thus, whenever there is a lost packet and a re-transmission is needed, a newer picture will most likely be available to be sent instead. With PTP, retransmissions are always current.
Next month I will describe a mathematical (computational) framework called “Mediated Reality”, in which we will see that picture data is of greatest value only if it is up-to-date. Old pictures are of less value when trying to construct a computer-mediated reality. Thus, packet resends should always be of the most current image; hence the design of PTP is based on variable packet lengths, in which the packet length is the length of a picture.
Further information about the WearComp Linux project may be found in http://wearcam.org/ece1766.html.
Thanks to Kodak and Digital Equipment Corporation (DEC) for assistance with the Personal Imaging and Humanistic Intelligence projects.