Peter van der Linden's Guide to Linux: A Lesson in Encryption, Part 2
Editor's Note: The following is an excerpt from chapter 11, "Keeping Your Data Private", of Peter van der Linden's Guide to Linux, published August 2005 by Prentice Hall, ISBN 0-13-187284-2.
In Part 1 of this series, we reviewed how public key encryption (PKE) uses a pair of keys, one to encode and one to decode. As well as being the strongest known form of encryption, PKE is more secure because you no longer have to keep the encoding key secret--you can publish it openly. So it's easy for field agents to replace any headquarter keys that are compromised.
Now you're ready to encrypt a file. Use your favorite editor to create a text file to work on called myinfo.txt, containing some highly secret text. You equally well can create the file with the following command:
echo "Parmesan cheese smells funny" > myinfo.txt
It's quite helpful when learning about GPG to be able to look at the contents of any file. Normally you can't type out the contents of a binary file, such as an executable or an encrypted file. The bytes that don't contain printable ASCII codes are interpreted as terminal control characters. If the file happens to contain the wrong values, it can cause the terminal to freeze or behave strangely.
You can use the od utility to output any file in printable form without the terminal going wild. To see a file as a printable representation of its bytes, use od -c filename. Here is the od (octal dump) output for our unencrypted file of secret information.
$ od -c myinfo.txt
0000000   P   a   r   m   e   s   a   n       c   h   e   e   s   e
0000020   s   m   e   l   l   s       f   u   n   n   y  \n
0000035
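As a quick check on that dump, the following sketch recreates the file in a scratch directory and compares its byte count with the final offset that od prints. The scratch directory and the variable names are my own choices for illustration; only the od -c command comes from the text above.

```shell
# Work in a throwaway directory so nothing in your home folder is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same content as the echo command above (printf makes the newline explicit).
printf 'Parmesan cheese smells funny\n' > myinfo.txt

# Count the bytes: 28 characters plus the trailing newline.
size=$(wc -c myinfo.txt | awk '{print $1}')
echo "size in bytes: $size"

# The last line of od output is the end-of-file offset, written in octal.
last_offset=$(od -c myinfo.txt | tail -n 1 | tr -d ' ')
echo "final od offset (octal): $last_offset"

# A leading zero makes shell arithmetic read the number as octal.
offset_decimal=$(( $last_offset ))
echo "final od offset (decimal): $offset_decimal"
```

The two decimal numbers should agree, which is a handy reminder that the left-hand column of od output counts in octal, not decimal.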
You can create an encrypted version of this file that no one can read without your secret key, with this command:
$ gpg --recipient email@example.com --encrypt myinfo.txt
You provide the name of the file you want encrypted. The output goes in a file of the same name but with ".gpg" appended. The encrypted file in this example is called myinfo.txt.gpg. Although the input file contains readable text, the output file contains binary data.
You can encrypt any kind of file this way, not only files of ASCII text. You can encrypt applications, images, spreadsheets, documents and so on. Use the e-mail address that identifies the public key of the person you want to be able to decrypt the file. If you're encrypting your own files, use your own identifying e-mail address where I have put "email@example.com". This should be the e-mail address you gave when generating your keys.
The unencrypted file myinfo.txt is not changed by this operation, so if you truly want the information to be secret, you next must delete that file and any other files that contain the information in the clear. And don't use an ordinary old delete, but a secure delete that repeatedly overwrites the bits in the deleted file on disk. If you intend a really secure system, do a Web search for "secure delete Linux" to locate some free tools to help with this.
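One widely installed candidate is shred, which ships with GNU coreutils. The text above does not name a tool, so treat this choice as my assumption, not a recommendation from the book. Here is a minimal sketch run on a scratch copy. Note that shred's own documentation warns that overwriting in place is not dependable on journaling or copy-on-write filesystems, or on flash storage.

```shell
# Work on a scratch copy, never on a file you still need.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Parmesan cheese smells funny\n' > myinfo.txt

# -n 3  overwrite the file contents three times
# -z    add a final pass of zeros
# -u    remove the file once the overwriting is done
shred -n 3 -z -u myinfo.txt

if [ -e myinfo.txt ]; then
    echo "myinfo.txt still exists"
else
    echo "myinfo.txt has been removed"
fi
```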
The encrypted file is longer than the plain-text file because GPG puts some extra housekeeping data, such as the program version number, in there to assist when decrypting. Here are the first few lines of results from running od on the encrypted file.
$ od -c myinfo.txt.gpg
0000000 205 001 016 003 332 027   o   _ 202 331 252   7 020 003 375 033
0000020   w   @ 244 333 245 024   P 271   ! 337  \n 333   t 205 200  \0
0000040   1 202 331 306 266 024 034 204  \0   ^ 375   "   (   u 032 255
0000060 327 263 263 225   M 216   , 314 207 340 023 222   ? 207 203 337
0000100   i 205 006 200   Q 266   m   4 177   ~ 257   ;  \a   5   W 205
As you can see, the encrypted file is a binary file. If you prefer an encrypted file expressed in the form of printable characters (typically because you want to email it), you can use the -a option for GPG. That ensures that the encrypted output is expressed in short lines of printable ASCII characters. The command is as follows:
$ gpg -a --recipient email@example.com --encrypt myinfo.txt
The new output file will be the input file name plus the extension ".asc". A printable ASCII output file might be 50% bigger than the corresponding binary output file. Dump the encrypted ASCII file so that you can compare the contents with the binary version. You will see the first few lines are like the following:
$ od -c myinfo.txt.asc
0000000   -   -   -   -   -   B   E   G   I   N       P   G   P       M
0000020   E   S   S   A   G   E   -   -   -   -   -  \n   V   e   r   s
0000040   i   o   n   :       G   n   u   P   G       v   1   .   2   .
0000060   6       (   G   N   U   /   L   i   n   u   x   )  \n  \n   h
0000100   Q   E   O   A   7   X   F   U   y   1   w   s   Q   8   y   E
0000120   A   P   +   K   /   i   8   H   p   X   U   x   K   e   O   W
0000140   7   5   o   1   3   Q   U   q   Z   L   Q   g   R   e   N   f
That file represents my secret information that Parmesan cheese smells funny, but you'll never know that unless you have my private key. Or unless you've been in the same room with Parmesan cheese.
Remember that you encrypt a message using a specific public key. Only the one person with the private key that corresponds to that public key can decrypt that message.
The encrypted message is all in ASCII with short line lengths, so it can be sent in e-mail without loss. Never edit an encrypted file or change it in any way. If you do, the private key will not be able to transform it back into the original, readable file.
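The size cost of that e-mail-safe form can be estimated without a keyring. GPG's armored body is base64-style text (the OpenPGP format calls it Radix-64) in 64-character lines, so every 3 bytes of binary become 4 printable characters. The sketch below uses the base64 tool from GNU coreutils as a stand-in for GPG's armoring. That substitution is an approximation of mine: real armored output also carries header, footer and checksum lines, which weigh more heavily on small files.

```shell
# 3000 bytes of sample data stand in for a binary .gpg file.
binary_size=3000

# -w 64 wraps the output at 64 characters, the line length GPG uses.
armored_size=$(head -c "$binary_size" /dev/zero | base64 -w 64 | wc -c | tr -d ' ')

echo "binary:  $binary_size bytes"
echo "encoded: $armored_size bytes"

# Integer percentage growth of the encoded form over the binary form.
growth=$(( (armored_size - binary_size) * 100 / binary_size ))
echo "growth:  ${growth}%"
```

For a file of this size the encoding alone adds roughly a third. The fixed header and footer lines push the overhead higher for very short messages.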
Sometimes you want to encrypt many files at once, such as an entire directory. To encrypt many files, first bundle up all the files into one tar archive file, compressed with gzip or bzip2.
$ tar -cvzf mydocs.tar.gz DocumentsFolder
That command creates a single compressed archive file, containing the files in the folder called DocumentsFolder. Then, encrypt that single archive file in the usual way:
$ gpg --recipient firstname.lastname@example.org --encrypt mydocs.tar.gz
The encrypted output file will be "mydocs.tar.gz.gpg".
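Before you encrypt the archive and remove the originals, it is worth listing the archive to confirm that it holds what you expect. This is a minimal sketch; the folder contents are made up for the demonstration, and the gpg step is left as a comment because it needs your own key.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a small sample folder (hypothetical contents).
mkdir DocumentsFolder
printf 'first\n'  > DocumentsFolder/a.txt
printf 'second\n' > DocumentsFolder/b.txt

# Same tar invocation as above, minus -v to keep the output short.
tar -czf mydocs.tar.gz DocumentsFolder

# -t lists the archive members without extracting them.
listing=$(tar -tzf mydocs.tar.gz | sort)
echo "$listing"

# Next step, as described above (not run here):
#   gpg --recipient firstname.lastname@example.org --encrypt mydocs.tar.gz
```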
To decrypt a file, you have to know the secret key that goes with the public key that was used to encrypt it. You might have several key pairs for your different types of correspondence. You tell GPG which secret key to use by specifying the public key owner. Since this often occurs in the context of messages, the key owner is called a recipient.
GPG will use the secret key for the recipient that you name. But first GPG will challenge you to provide the corresponding passphrase. Successfully meeting that challenge tells GPG that you are entitled to use that secret key. You give the options first, including the -o somefilename argument to specify the name of the output file. The action to take, namely --decrypt somefilename, has to come last on the command line. Here is an example of the command line and its output:
$ gpg --recipient email@example.com -o plaintext.txt --decrypt myinfo.txt.gpg
You need a passphrase to unlock the secret key for
user: "Peter van der Linden (working on Linux)"
2048-bit ELG-E key, ID 68F3472B, created 2005-04-03 (main key ID 6C7C81B2)
type your passphrase here
gpg: encrypted with 2048-bit ELG-E key, ID 68F3472B, created 2005-04-03
      "Peter van der Linden (working on Linux)"
That command recovers the file into plaintext.txt. You should confirm this by running the commands and examining the files.
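One way to do that check is a byte-for-byte comparison with cmp. In this sketch the function name same_contents is my own invention, and a plain copy stands in for the file that gpg --decrypt would write, so that the example runs without a keyring.

```shell
# same_contents FILE1 FILE2: succeed only if the two files are byte-identical.
same_contents() {
    cmp -s "$1" "$2"
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'Parmesan cheese smells funny\n' > myinfo.txt

# Stand-in for: gpg -o plaintext.txt --decrypt myinfo.txt.gpg
cp myinfo.txt plaintext.txt

if same_contents myinfo.txt plaintext.txt; then
    result="identical"
else
    result="different"
fi
echo "myinfo.txt vs plaintext.txt: $result"
```

Of course, this only works while you still have the original file. Once you have securely deleted it, a successful decryption with no error from GPG is your confirmation.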
You can see how you decrypt a file you get from someone else. They need to have encoded it with your public key before sending it to you. Then you decode it with your private key, as though it were a file that you encrypted in the first place. Your correspondents have to get your public key from a place that you can both trust. If it's from a Web page, it needs to be a Web page that cannot be changed by someone who is trying to learn your secret information or trying to sabotage your communications.
There's no security-related reason not to publish your public key widely. The more widely published the better, because it will allow more people to send you confidential files.
The GPG framework includes a number of searchable databases of public keys maintained by public-spirited organizations. You can load your key into one of these databases without charge, making it widely available.
The vital thing you need to do with your private key is to keep it out of unauthorized hands. That's not quite as easy as it may seem. To be really secure, the system that holds your private data, such as your private key file and the passphrase used to access it, should not be connected to any network.
Make one backup copy of the ~/.gnupg folder onto a CD and store it somewhere completely trustworthy. You should not include anything confidential like your private key in your regular system backups.
You will want to pull a copy of your public key out of the gnupg folder, so that you can publish it to other people. You can do that with this command. The -o argument directs output into the file name that appears after it.
$ gpg --export --armor -o ~/Desktop/my-public-key.asc
That creates what is called an armored public key file. The term armor doesn't mean that the file has any magical protective properties. It just means that the file holds your public key in short lines of ASCII, not binary data. ASCII data is often more convenient for passing around between systems.
Here are the contents of the my-public-key.asc file we just extracted. (I've removed a dozen lines from the middle to avoid wasting space--the point is to show you what a public key looks like in ASCII.)
$ cat my-public-key.asc
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.2.6 (GNU/Linux)

mQGiBEGGx/8RBACri2RSuN0NIzYjlF7yqXXqBIQOJWtXfPnoKjV/GDW6lOAlcx+F
pXqkJZlUrJ14qeai5hyICs8tAnBpGbOlPUiuEBBwiEkEGBECAAkFAkGGyAICGwwA
...
CgkQJiBKzudtFdV75QCgi0LtN4P34WdP0S1bDglmccgKE2YAn2Glcvp8OLM/aNxh
OfiYt5AMnd+4
=tucQ
-----END PGP PUBLIC KEY BLOCK-----
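A quick sanity check before you hand such a file to anyone is to look at its first line, which names the kind of armored block inside. The helper name armor_kind below is hypothetical, and the sample file is a dummy that holds only the two marker lines, with no real key data.

```shell
# armor_kind FILE: print the block type named on the first line of an armored file.
armor_kind() {
    head -n 1 "$1" | sed -n 's/^-----BEGIN PGP \(.*\)-----$/\1/p'
}

workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Dummy file with only the marker lines; a real export has key data between them.
printf '%s\n' '-----BEGIN PGP PUBLIC KEY BLOCK-----' \
              '-----END PGP PUBLIC KEY BLOCK-----' > my-public-key.asc

kind=$(armor_kind my-public-key.asc)
echo "my-public-key.asc holds: $kind"
```

If the first line ever says PRIVATE KEY BLOCK instead, you have exported your secret key, and that file must not be published.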
When you use the --armor option to force the output to be in ASCII with short lines, the convention is to give the output file the extension ".asc". You can equally ask for the encryption to be put into an ASCII file, as in this example.
$ gpg --armor --recipient firstname.lastname@example.org --encrypt mysecret.txt
This command will put the encrypted output into ASCII file mysecret.txt.asc.
Give people a copy of your my-public-key.asc file (perhaps on a CD), and they can use your public key to send you encrypted mail. You can also place your public key on one of the keyservers that a number of public-spirited organizations run, and retrieve other people's keys from them.
If you want to send Alice an encrypted message, you first need to make her public key known to GPG on your machine. Perhaps you copied Alice's public key from her Web site. It's not secure for Alice to e-mail you her armored key. How can you be sure that mail really came from her or was not changed by someone else on the way?
The best way is to get Alice's public key on a CD from her in person. Put Alice's key in a file called something like alice-public-key.asc then type this command:
$ gpg --import alice-public-key.asc
Alice's key will now be available to GPG on your PC, and you can send her encrypted files. GPG stores the keys in a series of files that it collectively calls a keyring.
Another way to bring someone's public key onto your machine is to search the keyservers, stating the key ID you are interested in. The host pgp.mit.edu is a keyserver located at the Massachusetts Institute of Technology. There are other keyservers and most of them regularly exchange data with one another so that recipients often can obtain a public key by asking a different server than the one the key was originally sent to.
$ gpg --keyserver pgp.mit.edu --recv-key 0F3BB819
That will import a public key onto your PC, allowing you to encrypt files for that keyholder. The key ID identifies the key to import.
After you have added a few keys, you will find it useful to be able to list them. You will see a series of lines, three lines per key (pub, uid and sub), showing the public keys that GPG has stored locally and the e-mail address associated with each. Here's an example of output from the command:
$ gpg --list-keys
/home/peter/.gnupg/pubring.gpg
------------------------
pub   1024D/6C7C81B2 2005-04-03 [expires: never]
uid                  Peter van der Linden (working on Linux)
sub   2048g/68F3472B 2005-04-03 [expires: never]

pub   1024D/09AC0A6A 1998-07-14
uid                  Alice Smith <firstname.lastname@example.org>
sub   2048g/81451634 1998-07-14

pub   1024D/C94AEC02 2000-02-22
uid                  Harry Jones <email@example.com>
sub   2048g/DAB1F6A4 2000-02-22
As you can see, the first line of each entry reads something like the following:
pub   1024D/6C7C81B2 2005-04-03
This means a public key that is 1024 bits long; the letter D after the size marks it as a DSA key. The characters after pub 1024D/ are the last eight characters of the fingerprint, also called the key ID. You can see what e-mail address is associated with each key ID.
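If you ever need to pick those fields apart in a script, plain shell parameter expansion is enough. This sketch assumes the GnuPG 1.x listing layout shown above, in which the letter after the key size names the algorithm (D for DSA, g for ElGamal). Newer GnuPG releases print a different layout, so do not treat this as a general parser. The function and variable names are my own.

```shell
# parse_pub_line LINE: split a "pub" line into size, algorithm letter, key ID and date.
parse_pub_line() {
    set -- $1                   # split on whitespace: pub, size+alg/keyid, date
    keyspec=$2                  # e.g. 1024D/6C7C81B2
    created=$3                  # e.g. 2005-04-03
    key_id=${keyspec#*/}        # text after the slash
    size_alg=${keyspec%/*}      # text before the slash
    bits=${size_alg%?}          # drop the trailing algorithm letter
    alg=${size_alg#"$bits"}     # the algorithm letter on its own
}

# First line of my own entry, copied from the listing above.
parse_pub_line 'pub   1024D/6C7C81B2 2005-04-03'

echo "key ID: $key_id"
echo "size: $bits bits, algorithm letter: $alg"
echo "created: $created"
```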
Although public key cryptography is one of the most secure code systems known, some factors make it less than perfect in practice. First, GPG relies on a passphrase, and passphrases can be stolen or overheard. Accounts can be broken into. Passphrases do not protect against physical access to the data. If an adversary can get access to your PC, they can often get to the data. You should never use GnuPG on a remote system because it is too easy to snoop on what you type as it travels over the network.
The key length that you select determines how break-resistant your encrypted data is. A key length of 1,024 bits is good enough for most purposes now, but in some years' time, it may easily be broken by supercomputers. A few years after that, it may be broken by desktop PCs.
Although the core GPG system is secure, everything going into it and coming out of it needs to be very carefully considered. When you get a public key from Alice, how sure can you be that it really came from her and that it was not really from Bob who administers Alice's mail server? Bob could then intercept all your secret mail to Alice, read it, and re-encode it with her true key before sending it on to her. This is known as a "man in the middle" attack.
To cope with the uncertainties, or at least express them, the GPG program has the concept of levels of trust in keys. A key that someone leaves on a CD on your desk may have a low level of trust. Perhaps someone switched or copied the CD. A key that you yourself generated a moment ago can be trusted absolutely. You might notice that the output when we generated a key included the text "key marked as ultimately trusted."
Convey or change the level of assurance with the --edit-key option to GPG. Make sure you use your e-mail address or fingerprint, not mine. You can change the level of trust for your own key and the public keys of others that you keep on your PC.
$ gpg --edit-key firstname.lastname@example.org
a few lines of output, ignored
Command> trust
Your decision? 4
a couple of lines of output, ignored...
Command> save
The levels of trust run from 1 to 5 with these meanings:
1 = I don't know or don't want to specify what trust I place in this key.
2 = I do not trust this key.
3 = I trust this key marginally.
4 = I trust this key fully.
5 = I place ultimate trust in this key.
By confirming ultimate trust in a key, you avoid reminders from GPG saying that it is not certain that the key belongs to the person named in the user ID. But remember that cryptography is a serious business and always involves a tradeoff between security and convenience.
You might have noticed that during the key creation process, you just had to assert who you were. But anyone could create a public/private key pair and say that it belonged to Peter van der Linden, and there's nothing anyone can do about it. The imposter could even publish that key on the MIT key server and use coded messages and pretend to be me.
You can guard against this with certificates and signatures. A certificate is a guarantee for a public key, ideally from a trusted authority. You can go to a company like Thawte or Verisign and persuade them of your identity (with a passport or driver's license) and give them some money. They will give you a certificate in digital form that is bundled with your public key.
The meaning of the certificate is "Verisign believes this key belongs to someone who has a lot of identity documents belonging to Peter van der Linden." If Verisign is doing their job, imposters cannot get such a certificate on a fraudulent key generated using someone else's name and address.
Most people who use encryption for personal files don't bother with this. To protect against impersonation, you talk to some of your friends and ask them to sign your key after they verify that it belongs to you. Those signatures move around with your public key. So after I've verified that the key I got from MIT really does belong to you (with a phone call, say, just in case someone has hijacked your e-mail and is faking the whole conversation), I sign it. If a third friend picks up your key, she can see that I've vouched for the fact that it's really your key, and maybe to her, my word is better than Verisign's. A chain of trust is thus built.
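What you usually compare during that phone call is the key's fingerprint, which gpg --fingerprint prints for any key on your keyring. The sketch below compares two fingerprints while ignoring spacing and letter case. Both fingerprint values are invented for illustration, and the function name normalize_fpr is hypothetical.

```shell
# normalize_fpr STRING: strip spaces and upper-case the hex digits.
normalize_fpr() {
    printf '%s' "$1" | tr -d ' ' | tr 'a-f' 'A-F'
}

# Invented values, for illustration only.
# local_fpr:  what gpg --fingerprint shows for the key you imported.
# spoken_fpr: what the key's owner reads out to you over the phone.
local_fpr='0123 4567 89AB CDEF 0123  4567 89AB CDEF 0123 4567'
spoken_fpr='0123456789abcdef0123456789abcdef01234567'

if [ "$(normalize_fpr "$local_fpr")" = "$(normalize_fpr "$spoken_fpr")" ]; then
    verdict="match"
else
    verdict="MISMATCH"
fi
echo "fingerprints: $verdict"

# Only after a match would you sign the key (not run here):
#   gpg --sign-key KEYID
```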
Encryption is a fascinating topic, and it sometimes raises passions. Up to the late 1990s, encryption software (like GPG) was restricted by the US government under the International Traffic in Arms Regulations. It was illegal for people in the United States to send such software overseas, just the way we cannot mail machine guns or nuclear submarines to our nephews.
The GPG software was implemented outside the United States precisely to avoid breaking this US restriction. Even today, some countries such as France have made it illegal for their citizens to use cryptography. Recently in Britain a regulation was written making it a criminal offense to refuse to give up encryption keys or plain-text versions of encrypted data. I don't know how they expect to enforce that for people who keep their collections of random numbers, wink wink, in disk files. Unless they plan to beat it out of people with "rubber hose" cryptography.
That completes the description of file encryption and key management. In the next article, I'll describe how to use the GnuPG program to send and receive encrypted e-mail.
Peter van der Linden currently works in Silicon Valley as a software consultant who specializes in Linux and open-source software. A graduate of Yale, van der Linden also is author of The Official Handbook of Practical Jokes, Expert C Programming and Just Java.