Reading File Metadata with extract and libextractor
Listing 8. jpegextractor.c adds the MIME type to the list after parsing the file header.
if ( (data != 0xFF) || (data != 0xD8) ) return prev; /* not a JPEG */ addKeyword(&prev, strdup("image/jpeg"), EXTRACTOR_MIMETYPE); /* ... more parsing code here ... */ return prev;
libextractor is a simple extensible C library for obtaining metadata from documents. Its plugin architecture and broad support for formats set it apart from format-specific tools. The design is limited by the fact that libextractor cannot be used to update metadata, which more specialized tools typically support.
Resources for this article: /article/8207.
Christian Grothoff graduated from the University of Wuppertal in 2000 with a degree in mathematics. He currently is a PhD student in computer science at Purdue University, studying static program analysis and secure peer-to-peer networking. A Linux user since 1995, he has contributed to various free software projects and now is the maintainer of GNUnet and a member of the core team for libextractor. His home page can be found at grothoff.org/christian.
Webinar: 8 Signs You’re Beyond Cron
11am CDT, April 29th
Join Linux Journal and Pat Cameron, Director of Automation Technology at HelpSystems, as they discuss the eight primary advantages of moving beyond cron job scheduling. In this webinar, you’ll learn about integrating cron with an enterprise scheduler.Join us!
|Play for Me, Jarvis||Apr 16, 2015|
|Drupageddon: SQL Injection, Database Abstraction and Hundreds of Thousands of Web Sites||Apr 15, 2015|
|Non-Linux FOSS: .NET?||Apr 13, 2015|
|Designing Foils with XFLR5||Apr 08, 2015|
|diff -u: What's New in Kernel Development||Apr 07, 2015|
- Drupageddon: SQL Injection, Database Abstraction and Hundreds of Thousands of Web Sites
- Play for Me, Jarvis
- Non-Linux FOSS: .NET?
- Designing Foils with XFLR5
- Not So Dynamic Updates
- Flexible Access Control with Squid Proxy
- New Products
- diff -u: What's New in Kernel Development
- Users, Permissions and Multitenant Sites