ViaVoice and XVoice: Providing Voice Recognition
Conversing with a computer has long been a staple of science fiction. Such conversations are still largely in the realm of fiction, but voice recognition technology has improved significantly over the last decade. A number of voice recognition and control products are available on various platforms. Many people don't realize, however, that it is possible to control the Linux desktop by voice, and it has been possible for some time.
Voice control can provide computer access for those with overuse syndromes or other arm injuries--users who in the past had to switch platforms to find voice support. Aside from the geek factor, ordinary users can benefit from reduced arm stress and improved ease-of-use and speed for some tasks. The future of the software discussed in this article is somewhat in question, and it does not give a completely hands-free environment, but it does work. All that is required is a modest investment of time and money.
Voice control on Linux is possible by using two software packages. IBM ViaVoice for Linux supplies the basic voice recognition engine. XVoice, available under the GPL, uses the ViaVoice libraries to provide control of the desktop and applications.
IBM offers ViaVoice for Linux (for US English) in the United States and Canada. It is available for around $40, plus shipping, and includes a headset. It also can be downloaded from the IBM web site for a small discount. A slightly newer version of ViaVoice also is available as part of the Mandrake 8.0 PowerPack and ProSuite editions. The Mandrake ViaVoice apparently supports British and American English, French and German. Mandrake versions later than 8., however, no longer include ViaVoice. This article focuses solely on installing and using the version available from IBM.
ViaVoice for Linux requires a 233MHz Pentium MMX or better, with at least 128MB of RAM and a 16-bit sound card. It was designed to install on Red Hat 6.2, but I am using it successfully on Red Hat 7.3. Others also have had success installing it on non-Red Hat systems. Be prepared to experience some installation problems, though.
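The CPU and memory minimums above can be checked before buying or installing anything. The following is a minimal sketch, not part of ViaVoice or XVoice: it assumes a Linux system that exposes /proc/cpuinfo and /proc/meminfo in the usual x86 format, and the function names are made up for illustration. It does not check the sound card. Note that MemTotal is usually a little lower than the installed RAM, so a machine with exactly 128MB may fall just under the threshold here.

```python
import re

# Minimums quoted in the article for ViaVoice for Linux.
MIN_CPU_MHZ = 233
MIN_RAM_MB = 128


def cpu_mhz(cpuinfo_text):
    """Return the first 'cpu MHz' value in /proc/cpuinfo text, or None."""
    match = re.search(r"^cpu MHz\s*:\s*([\d.]+)", cpuinfo_text, re.MULTILINE)
    return float(match.group(1)) if match else None


def has_mmx(cpuinfo_text):
    """Return True if the first 'flags' line lists the mmx feature."""
    match = re.search(r"^flags\s*:\s*(.*)$", cpuinfo_text, re.MULTILINE)
    return bool(match) and "mmx" in match.group(1).split()


def ram_mb(meminfo_text):
    """Return MemTotal from /proc/meminfo text, converted to MB, or None."""
    match = re.search(r"^MemTotal:\s*(\d+)\s*kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) / 1024 if match else None


def meets_requirements(cpuinfo_text, meminfo_text):
    """Check CPU speed, MMX support and RAM against the stated minimums."""
    mhz = cpu_mhz(cpuinfo_text)
    mb = ram_mb(meminfo_text)
    if mhz is None or mb is None:
        return False
    return mhz >= MIN_CPU_MHZ and mb >= MIN_RAM_MB and has_mmx(cpuinfo_text)


def check_this_machine():
    """Read the real /proc files and report whether this machine qualifies."""
    with open("/proc/cpuinfo") as cpu_file, open("/proc/meminfo") as mem_file:
        return meets_requirements(cpu_file.read(), mem_file.read())
```

Calling check_this_machine() on the target system returns True or False. The parsing functions take plain text so they can be tried on saved copies of the two files as well.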
The first step is to install a Java Runtime Environment. ViaVoice 126.96.36.199 was tested with JRE-1.2.2 revision RC4 from blackdown.org. Using this exact revision will avoid incompatibilities with a different JRE.
After the JRE is installed, mount the CD and run vvsetup in the CD root directory as root. Once ViaVoice is installed, run vvstartuserguru as yourself to set up as a ViaVoice user, set the audio levels and begin training ViaVoice for your voice. I could not get myself installed as a user until I deleted the viavoice directory in my home directory (~/viavoice, created during installation). I then had to rerun the user guru. This fixed the problem, but it's rather disappointing that the installation script is so frail. Judging by the accounts of other people trying to install ViaVoice, I had an easy installation.
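If the user guru fails in the way described above, the stale directory has to go before the guru is rerun. Here is a minimal sketch of that cleanup. It makes one deliberate change from what I did: it renames the directory to a timestamped backup rather than deleting it, so nothing is lost if the guess is wrong. The function name and the backup naming scheme are my own inventions, and the directory name viavoice is taken from the description above.

```python
import shutil
import time
from pathlib import Path


def set_aside_viavoice_dir(home, dirname="viavoice"):
    """Rename home/dirname to a timestamped backup.

    Returns the backup path, or None if there was no such directory.
    """
    target = Path(home) / dirname
    if not target.is_dir():
        return None
    # Hypothetical naming scheme: viavoice.bak-<seconds since epoch>
    backup = target.with_name(f"{dirname}.bak-{int(time.time())}")
    shutil.move(str(target), str(backup))
    return backup
```

Run it as the ordinary user, for example set_aside_viavoice_dir(Path.home()), and then start vvstartuserguru again.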
A base installation of ViaVoice, like other voice recognition software, does not provide great accuracy at first. Each user must train ViaVoice to better recognize his or her own idiosyncratic voice.
One training method is to read back text that ViaVoice displays in the user guru. This process is fairly easy to do, but it may not reflect the type of words and phrases that you tend to use a lot, making it less effective.
A better alternative is to use the ViaVoice Dictation Java application when working on actual documents. As you dictate, some words or phrases are recognized incorrectly. When this occurs, you use the correction facilities within Dictation to correct the errors. ViaVoice then tunes its voice models to better fit your voice. This method is more labor-intensive, but usually these corrections can be done with voice commands. A word of warning: save your work often, as Dictation is prone to crash.
An industry consultant told me that with 10 to 60 hours of training, current voice recognition technology should reach 98% accuracy. I have lost track of how much time I've spent on training, but my accuracy is only about 92-95% on arbitrary text. This may be because ViaVoice for Linux is much older than the Mac and Windows versions, or it could be for any number of other reasons. Fortunately, spoken commands are much more accurately recognized because there are fewer valid possibilities to match.
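A few percentage points of accuracy matter more than they might appear to. The short calculation below converts the accuracy figures quoted above into the number of words you can expect to correct in a 1,000-word document. It is only a back-of-the-envelope sketch that assumes every word is equally likely to be misrecognized; the function name is made up for illustration.

```python
def expected_errors(accuracy, words):
    """Expected number of misrecognized words at a given per-word accuracy."""
    return round(words * (1.0 - accuracy))


for accuracy in (0.92, 0.95, 0.98):
    errors = expected_errors(accuracy, 1000)
    print(f"{accuracy:.0%} accurate: about {errors} corrections per 1,000 words")
```

At 92% that works out to about 80 corrections per 1,000 words, against about 20 at 98%, so the gap between my results and the consultant's figure means roughly four times as much correction work.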
Even with only a couple of hours of training, you should notice improved accuracy. One thing I found is I needed to be more careful with my pronunciation. Bad microphones or background noise also can cause accuracy problems.