Linux-Based Voice Recognition Software

by Janine M. Lodato

on February 3, 2003

Editors' Note: The following article is from Linux Journal's sister on-line publication, Linux Gazette.

Let's look at Linux-based voice recognition software from the perspective of China. It would behoove Linux computer makers to begin manufacturing their computers in China. China offers a low-cost method of manufacturing and provides them with a large market for their hardware, which can also be exported to other important markets around the world.

Linux computers can run voice recognition systems such as IBM ViaVoice. This is especially advantageous to Chinese speakers because written Chinese, whether the speaker uses Mandarin or Cantonese, is complex. Typing Chinese on a keyboard is cumbersome because a document draws on so many characters. Documents could therefore be produced more easily using voice recognition software running on a Linux platform.

Other languages would also benefit from voice recognition software for the sake of speed. Hands-busy, eyes-busy professionals can benefit greatly from voice recognition because they don't have to use a mouse and keyboard to document their findings. In the same way, voice-activated, easy-to-use telephone systems might benefit people from all walks of life. Anyone driving a car, for example, will find voice recognition a much more effective way of communicating while operating a vehicle.

The health-care market alone may justify the Linux-based voice recognition project. Health-care services are the largest expense of the Group of Ten nations, and health care is the fastest-growing sector as well. Health-care workers would benefit from using their voices to document patients' treatments. Voice recognition would give them a hands-free environment in which to analyze, treat and write about particular cases easily and quickly.

In addition, medical devices connected electronically over a wireless LAN could benefit two groups.

Hospital administration staff could:
- Improve the usage efficiency of resources
- Achieve standardized, quality patient management
- Dramatically reduce data recording (transcription) errors
- Lower costs
- Make any room a telemetry room on demand (that is, do laboratory measurements in any room regardless of where the central equipment is located)

Medical staff could:
- Have a complete set of vital-sign data available 24/7
- Have more time for hands-on care
- See changes in patient status immediately to enable quicker responses

For life sciences fields, the simplicity, reliability and low cost of using Linux for servers, tablets, embedded devices and desktops are paramount. Currently, only about 10% of the documents in the health-care field in the USA are produced electronically, due to the cumbersome and unreliable nature of the Windows environment. Thirty percent of the cost of health care is a direct result of the manual creation of documents. Furthermore, many malpractice cases are caused by imprecise transcriptions of manually scribbled medical records and directives, as anybody who looks at a prescription can attest.

Obviously, the market for these new technologies exists. What remains is for a hungry company with aggressive salespeople to tap into that market. Once those salespeople distribute the technology, the needs of many will be met and a new mass market will open up that Microsoft is not now filling: assistive technology (AT). Although the field already exists, it needs to be expanded to include both physically disabled and functionally disabled people.

Yes, voice recognition offers great promise for the future. However, it isn't perfect and needs to be improved. One area for improvement is lip reading, which would bolster recognition accuracy. Others are multi-tonal voice input and directional microphones. In conclusion, every generation of voice recognition software will improve as the hardware that runs Linux becomes more powerful.
