eCrash: Debugging without Core Dumps
Now that we have covered at length what a backtrace is, how to produce one, the different methods of displaying one and how to debug a crash with one, it's time to change gears. A crash file can include much more than a backtrace:
States of mutexes (who is holding the locks—useful for deadlock diagnosis).
Current error logs.
Most recent network packets.
Any of the above items could be useful for post-mortem debugging. There is one caveat, however. Because we have encountered an exception, something has gone terribly wrong. Our data structures could be corrupt. We could be low on (or out of) memory.
Also, some threads could be deadlocked waiting on mutexes that our crashed thread was holding, so the crash handler must not try to acquire those locks itself.
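One way to make mutex ownership available to a crash handler is to record the owner next to each lock. The following is a minimal sketch, not eCrash's own mechanism: the `tracked_mutex` type, its functions and the application-assigned integer thread IDs are all hypothetical names introduced for illustration. The dump routine reads the owner field atomically and never takes the lock, because the lock may be held by the thread that crashed.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical wrapper: a mutex plus the ID of the thread holding it. */
struct tracked_mutex {
    pthread_mutex_t lock;
    const char *name;   /* label printed in the crash file */
    atomic_int owner;   /* application-assigned thread ID, -1 when free */
};

int tracked_init(struct tracked_mutex *m, const char *name)
{
    m->name = name;
    atomic_init(&m->owner, -1);
    return pthread_mutex_init(&m->lock, NULL);
}

int tracked_lock(struct tracked_mutex *m, int thread_id)
{
    int rc = pthread_mutex_lock(&m->lock);
    if (rc == 0)
        atomic_store(&m->owner, thread_id);   /* record the new holder */
    return rc;
}

int tracked_unlock(struct tracked_mutex *m)
{
    atomic_store(&m->owner, -1);              /* clear before releasing */
    return pthread_mutex_unlock(&m->lock);
}

/* Called from the crash path: reports who holds each lock without
 * trying to acquire any of them. */
void tracked_dump(FILE *out, struct tracked_mutex *const *list, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int owner = atomic_load(&list[i]->owner);
        if (owner < 0)
            fprintf(out, "%s: free\n", list[i]->name);
        else
            fprintf(out, "%s: held by thread %d\n", list[i]->name, owner);
    }
    fflush(out);
}
```

With a table of pointers to every tracked mutex in the program, a single call to `tracked_dump` writes one line per lock, which is usually enough to spot which thread a deadlocked waiter is blocked on.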
Because some of the data we want to display might generate another exception (if it is corrupted), we want to display the most important information first, then progressively less safe information. Also, to prevent information loss, buffers on FILE* streams should always be flushed.
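The ordering and flushing advice above can be sketched as a small staged writer. This is an illustration under stated assumptions, not eCrash's actual code: `crash_stage`, `crash_report_write` and `stage_text` are hypothetical names. The caller lists the sections from safest to least safe, and the stream is flushed after every header and every section, so whatever was written survives if a later section faults. Note that stdio is not async-signal-safe; a stricter handler would use write(2) on a raw file descriptor instead.

```c
#include <stdio.h>

/* Hypothetical stage callback: writes one section of the crash file.
 * Returns 0 on success, nonzero on failure. */
typedef int (*crash_stage_fn)(FILE *out, void *ctx);

struct crash_stage {
    const char *title;     /* section header */
    crash_stage_fn write;  /* section writer */
    void *ctx;             /* data handed to the writer */
};

/* Example stage: prints a NUL-terminated string; fails on a NULL pointer. */
int stage_text(FILE *out, void *ctx)
{
    if (ctx == NULL)
        return -1;
    return fputs((const char *)ctx, out) == EOF ? -1 : 0;
}

/* Writes the stages in the order given (safest first) and flushes after
 * each step. Returns the number of stages that completed successfully. */
size_t crash_report_write(FILE *out, const struct crash_stage *stages, size_t n)
{
    size_t done = 0;

    for (size_t i = 0; i < n; i++) {
        fprintf(out, "=== %s ===\n", stages[i].title);
        fflush(out);   /* the header reaches the file even if the stage crashes */

        if (stages[i].write(out, stages[i].ctx) == 0)
            done++;
        else
            fprintf(out, "(section failed)\n");
        fflush(out);   /* never leave a finished section sitting in the buffer */
    }
    return done;
}
```

A caller might place the backtrace first, the mutex states second, and the error logs and recent network packets last, because the later sections walk larger data structures that are more likely to be corrupt.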
Diagnosing a problem on a deployed embedded system can be a difficult task. But choosing the right data to save or display when an exception occurs can make the task much easier.
With a relatively small amount of storage, or a remote server, you can save enough post-mortem information to be able to find a failure in your system.
Resources for this article: /article/9139.
David Frascone (firstname.lastname@example.org) works for Cisco Systems, Inc., in the Wireless Business Unit. He is currently working on Next Generation controller design.