Popcon - Are You In Or Out?
Those of you who regularly install Debian may have noticed a prompt that asks whether you would like to install Popcon, the Debian Popularity Contest. Popcon gathers statistics about package usage and periodically submits them to Debian. The anonymous statistics gathered by the script are freely available on the Debian website, and the script can be invoked manually to give a clearer idea of package usage on your own system.
I must admit that I had always declined to take part in the survey. Some people will object on privacy grounds, but personally, I trust that Debian aren't going to do anything devious with the info. I had opted out because it sounded like another possible point of failure and I didn't actually know what the project did.
If you didn't select it when installing Debian, you can install Popcon at any time via the package manager, and this doesn't hamper the quality of the data. If you're installing it manually, bear in mind that its installation script prompts for user input, so make sure that you can view the text output of your package management system. The information that it actually gathers is the installation date and most recent access date of every package on your system. By default, Popcon gathers the information and submits it once a week using a cron job.
Once installed, you can run the script manually (as root) from the command line.
You'll receive a long list of all of the packages on your system arranged in order of most recently accessed. Here is a sample of the output when I ran it on my Debian Sid box.
1290877204 1290877209 iptables /usr/sbin/ip6tables-apply OLD
1290877204 1290877339 ed /usr/bin/red OLD
1290877204 1290877401 laptop-detect /usr/sbin/laptop-detect OLD
1290877204 1290877230 libnfsidmap2 /usr/lib/libnfsidmap/static.so OLD
1290877204 1290877414 libruby1.8 /usr/lib/ruby/1.8/net/ftp.rb OLD
1290877204 1290877455 google-gadgets-gst /usr/lib/google-gadgets/modules/gst-audio-framework.so OLD
1290877204 1290877246 tcpd /usr/sbin/tcpd OLD
The first two numbers are the access time and the creation time of the most recently accessed file within the package. The times are given in Unix time format, that is, the number of seconds elapsed since midnight UTC on 1 January 1970. These are followed by the name of the package and the most recently accessed file in that package. The last piece of information is a tag that indicates whether the package is considered old (not accessed for more than a month). Other tags indicate that the package was recently installed or contains no runnable programs.
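To make that layout concrete, here is a minimal Python sketch that splits one line of the output into its fields and converts the two timestamps into readable dates. The `PopconEntry` and `parse_line` names are my own inventions for illustration, and the sketch assumes the fields are separated by whitespace exactly as in the sample above.

```python
from datetime import datetime, timezone
from typing import NamedTuple


class PopconEntry(NamedTuple):
    """One line of output (hypothetical helper type, not part of Popcon)."""
    atime: datetime  # most recent access time
    ctime: datetime  # creation time
    package: str     # package name
    path: str        # most recently accessed file in the package
    tag: str         # for example "OLD"; empty if the line carries no tag


def parse_line(line: str) -> PopconEntry:
    # Assumption: fields are whitespace-separated, as in the sample output.
    fields = line.split()
    atime, ctime, package, path = fields[:4]
    tag = fields[4] if len(fields) > 4 else ""
    return PopconEntry(
        datetime.fromtimestamp(int(atime), tz=timezone.utc),
        datetime.fromtimestamp(int(ctime), tz=timezone.utc),
        package,
        path,
        tag,
    )


entry = parse_line("1290877204 1290877209 iptables /usr/sbin/ip6tables-apply OLD")
print(entry.package, entry.atime.isoformat(), entry.tag)
# prints: iptables 2010-11-27T17:00:04+00:00 OLD
```

The timestamps are interpreted as UTC here so that the result doesn't depend on the time zone of the machine running the sketch.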
Obviously, the output for a typical system is going to be vast. For this reason, if you're invoking it from the command line, either redirecting the output to a file or piping it through grep is the best approach. For example, redirecting it to a file yielded a file that worked fine when dropped onto the Gnumeric spreadsheet application. It's worth noting that Gnumeric has a function to convert Unix time into a typical date format.
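If you would rather do the date conversion before the data reaches the spreadsheet, something along these lines should work. It is only a sketch: the `to_csv` function and the column headings are my own, and it assumes the saved output uses the whitespace-separated layout shown in the sample, skipping any line that doesn't start with two numbers.

```python
import csv
import io
from datetime import datetime, timezone


def readable(unix_seconds: str) -> str:
    """Convert a Unix timestamp string into a UTC date and time."""
    moment = datetime.fromtimestamp(int(unix_seconds), tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def to_csv(popcon_text: str) -> str:
    """Turn saved output into CSV text (hypothetical helper)."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["accessed", "created", "package", "file", "tag"])
    for line in popcon_text.splitlines():
        fields = line.split()
        # Skip anything that doesn't look like a data line.
        if len(fields) < 4 or not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        atime, ctime, package, path = fields[:4]
        tag = fields[4] if len(fields) > 4 else ""
        writer.writerow([readable(atime), readable(ctime), package, path, tag])
    return out.getvalue()


sample = "1290877204 1290877209 iptables /usr/sbin/ip6tables-apply OLD\n"
print(to_csv(sample), end="")
```

In practice you would read the text from wherever you saved the output and write the result to a `.csv` file, which a spreadsheet application can open directly.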
You can obtain the statistics that have been collated from all participating systems via the Debian website. Obviously, these results are tainted by the classic voluntary survey weakness of self-selection. Who knows, perhaps people who choose to participate in Popcon have different usage patterns to people who don't?
Personally, in future, I'm going to enable Popcon on my main system as I'm sure the data is useful to the Debian project. In addition, I've often wondered what stuff is installed on my system yet never actually used.
The Debian Popularity Contest website
The readme file, which gives detailed instructions on how to use Popcon.
The FAQ file, which addresses potential concerns that users might have in terms of privacy issues etc.
UK based freelance writer Michael Reed writes about technology, retro computing, geek culture and gender politics.