Indexing Texts with SMART
I do not use SMART for bread-and-butter retrieval, but for the weights it computes and the indices it creates. At this point I usually want to do some other manipulations of the data. My thanks go to the developers of Unix in general and of Linux in particular: a whole string of ever more complicated and sophisticated shell scripts, the standard Unix tools and a few of “My Very Own” utilities suffice to process the SMART output into a file that is ready for import into SPSS.
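To give an idea of what such glue code does, here is a minimal sketch, written in Python rather than as a shell script. Everything about the input is an assumption made for illustration: it supposes the weights have already been dumped as plain text lines of the form "document term weight", which is not SMART's actual output format, and the function names are hypothetical. The sketch pivots those lines into a document-by-term table and writes it as tab-delimited text, a layout that statistics packages can generally import.

```python
import csv
import io


def weights_to_rows(lines):
    """Pivot 'doc term weight' triples into a document-by-term table.

    The three-column input layout is an assumption made for this sketch,
    not SMART's actual output format.
    """
    weights = {}  # (doc, term) -> weight
    docs = {}     # dicts serve as ordered sets (insertion order is kept)
    terms = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        doc, term, weight = parts[0], parts[1], float(parts[2])
        weights[(doc, term)] = weight
        docs[doc] = None
        terms[term] = None
    rows = [["doc"] + list(terms)]  # header row: one column per term
    for doc in docs:
        # A term that does not occur in a document gets weight 0.0.
        rows.append([doc] + [weights.get((doc, term), 0.0) for term in terms])
    return rows


def write_tab_delimited(rows, handle):
    """Write the table as tab-delimited text, one document per line."""
    csv.writer(handle, delimiter="\t", lineterminator="\n").writerows(rows)


if __name__ == "__main__":
    sample = ["d1 apple 0.5", "d1 pear 0.25", "d2 apple 1.0"]
    out = io.StringIO()
    write_tab_delimited(weights_to_rows(sample), out)
    print(out.getvalue(), end="")
```

A full document-by-term table grows quickly with the size of the vocabulary, so in practice one would probably restrict the columns to the terms of interest before writing the file.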
Now I have to quit Linux and boot MS-DOS, start MS Windows and finally enter SPSS to do the statistics and create some graphs. I am a newcomer to Unix (indeed it was the fact that Linux offered a way to use SMART that pulled me over the line two years ago). While MS Windows is not my favorite operating system, SPSS gets the job done. When the output is written to disk, I immediately escape back to Linux to write the final article, report, or whatever with LaTeX.
On this point I have two messages: one good, one bad. The good news is that SMART is obtainable by anonymous ftp from Cornell University and can be used free of charge for scientific and experimental purposes. Better yet, it compiles under Linux without much tweaking and twiddling. There is also a fairly active mailing list for people who use SMART (email@example.com).
The bad news: the manual—what manual? SMART is not for the faint of heart; after unpacking and compilation, you'll find some extremely obscure notes and examples, and that is all. Nevertheless, if you have more than just a few megabytes of text to manage and the stamina to learn SMART, it certainly is the best solution for your information retrieval needs. I do wish someone would write a comprehensive manual. In the meantime, you may be helped by my “tutorial for newbies” found at http://pi0959.kub.nl:2080/Paai/Onderw/Smart/hands.html.
This article was published previously in Issue 13 of the Linux Gazette.
Hans “Paai” Paijmans (firstname.lastname@example.org) is a University lecturer and researcher at Tilburg University and a regular contributor to several Dutch journals. Together with E. Maryniak, he wrote the first Dutch book on Linux—already two years ago. My, doesn't the time fly? His home page is at http://pi0959.kub.nl:2080/paai.html .