Part III: AFS—A Secure Distributed Filesystem
To make all these features work, AFS comes in several distinct parts. The AFS client software has to run on each computer that wants access to the AFS file space. The AFS server software is separated into four basic parts: it uses Kerberos for authentication, PTS for authorization, a volume location server for location independence and two servers for data serving (file server and volume server). All of these different processes are managed on each AFS server by the basic overseer (BOS) server. In addition to these necessary components, more service dæmons are available for AFS server maintenance and backup. How to install an AFS server is beyond the scope of this article.
Because of all these different server components, the learning curve for AFS is steep at the beginning. However, the payoff is worth it, and many sites can no longer do without it. Once a cell is installed, the day-to-day maintenance cost for AFS is in the range of 25% of a full-time equivalent (FTE), even for large installations.
For more information on how AFS is used at various sites, including Morgan Stanley and Intel, have a look at the presentations given at the recent AFS Best Practices Workshop (see the on-line Resources).
You do not need your own AFS servers to try AFS yourself. Simply installing the OpenAFS client software and starting the AFS client dæmon afsd with a special option allows users to access the publicly accessible AFS space of foreign AFS cells.
The most difficult part of installing an AFS client is obtaining the necessary kernel module. If you are using Red Hat or Fedora, you can download RPMs (see Resources). In addition to the kernel module, the AFS client needs a user-space dæmon (afsd) and the AFS command suite. These come in two additional RPMs.
Once you have these packages, the next step is to configure the AFS client for your needs. First, you need to define the cell your computer should be a member of. The AFS cell name is defined in the file /usr/vice/etc/ThisCell. If you do not have your own AFS servers, this name can be set to anything. Otherwise, it should be set to the name of the cell your AFS servers are serving. The next parameter to look at is the local AFS cache. Ideally, each AFS client has a separate disk partition for the cache, but the cache can be put wherever you want. The location and size of the cache are defined in the file /usr/vice/etc/cacheinfo. The default location for the AFS cache is /usr/vice/cache, and a size of 100MB is plenty for a single-user desktop or laptop computer. This is the setting as it comes with the openafs-client RPM. The cacheinfo file for this setting should look like this:
/afs:/usr/vice/cache:100000
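The cacheinfo line has three colon-separated fields: the mount point, the cache directory and the cache size in 1KB blocks. As a minimal sketch, the two client files could be generated as shown below. The function name make_cacheinfo, the scratch directory and the cell name example.org are assumptions made for illustration; nothing here writes to /usr/vice/etc directly.

```shell
# Build a cacheinfo line: <mount point>:<cache directory>:<size in 1KB blocks>.
# make_cacheinfo is a hypothetical helper, not part of OpenAFS.
make_cacheinfo() {
    mount_point=$1
    cache_dir=$2
    size_mb=$3
    # 100 MB is written as 100000 blocks of 1KB, matching the stock setting.
    printf '%s:%s:%s\n' "$mount_point" "$cache_dir" "$((size_mb * 1000))"
}

# Write into a scratch directory first; copy the files to /usr/vice/etc
# as root once they look right.
conf_dir=$(mktemp -d)
make_cacheinfo /afs /usr/vice/cache 100 > "$conf_dir/cacheinfo"
echo "example.org" > "$conf_dir/ThisCell"   # placeholder cell name (assumption)

cat "$conf_dir/cacheinfo"
```

Generating the files in a scratch directory keeps the sketch safe to run on a machine that already has a working AFS client.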
Next, configure the parameters for afsd, the AFS client dæmon. They are defined in /etc/sysconfig/afs. Add the -dynroot parameter to the OPTIONS definition. This allows you to start the AFS client without your own AFS servers.
Another option to add is -fakestat. This parameter tells afsd to fake the stat(2) information of all entries in the /afs/ directory. Without this parameter, the AFS client would go out and contact every AFS cell known to it. That currently is 133 cells, as seen if you do a long listing (/bin/ls -l) in the /afs/ directory.
Because AFS uses Kerberos for authentication, time needs to be synchronized on your machine(s). AFS used to have its own mechanism for synchronization, but it is outdated and should not be used anymore. To switch it off, add the option -nosettime to the OPTIONS definition in /etc/sysconfig/afs. If you don't have a time sync method, use Network Time Protocol (see Resources).
After all the changes have been made, the new OPTIONS definition in /etc/sysconfig/afs should look like this:
OPTIONS="$MEDIUM -dynroot -fakestat -nosettime"
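If you prefer to script this change rather than edit the file by hand, something along these lines may work. It is only a sketch: the helper add_option is hypothetical, and the starting content OPTIONS="$MEDIUM" is an assumption about the stock file, so the example operates on a scratch copy instead of /etc/sysconfig/afs itself.

```shell
# Scratch copy standing in for /etc/sysconfig/afs (starting content is assumed).
cfg=$(mktemp)
printf '%s\n' 'OPTIONS="$MEDIUM"' > "$cfg"   # single quotes keep $MEDIUM literal

# add_option FILE FLAG: append FLAG inside the quoted OPTIONS value
# unless the flag is already listed there. Hypothetical helper.
add_option() {
    if ! grep -q -- "^OPTIONS=.* $2[ \"]" "$1"; then
        sed "s/^OPTIONS=\"\(.*\)\"/OPTIONS=\"\1 $2\"/" "$1" > "$1.new" &&
            mv "$1.new" "$1"
    fi
}

add_option "$cfg" -dynroot
add_option "$cfg" -fakestat
add_option "$cfg" -nosettime
add_option "$cfg" -dynroot      # second call is a no-op

cat "$cfg"
```

Because each flag is added only when missing, the script can be rerun without piling up duplicate options.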
The last step is to create the mount point for the AFS filesystem:
% sudo mkdir /afs
Now, you can start the AFS client:
% sudo /etc/init.d/afs start
This part takes a few seconds, because afsd needs to populate the local cache directory before it can start. Because the cache persists across reboots, subsequent starts will be faster.
Without your own AFS servers but with an AFS client configured as described above, you can familiarize yourself with some AFS commands and explore the global AFS space yourself. A quick test shows that you are not authenticated in any AFS cell:
% tokens
Tokens held by the Cache Manager:
   --End of list--
No credentials are listed. See above for an example where credentials are present.
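In a script, the same check can be automated by counting the lines that appear between the header and the end marker of the tokens output. The sketch below relies only on the two marker lines shown above; the helper name count_tokens is an assumption, and the call to tokens is guarded so the snippet also runs on a machine without the AFS command suite.

```shell
# count_tokens: read `tokens` output on stdin and print how many
# credential lines it contains. Hypothetical helper.
count_tokens() {
    # Drop the header, the end marker and blank lines; count what is left.
    grep -v -e '^Tokens held by the Cache Manager:' \
            -e '^ *--End of list--' \
            -e '^[[:space:]]*$' | wc -l | tr -d ' '
}

# Only query the Cache Manager if the tokens command is installed.
if command -v tokens >/dev/null 2>&1; then
    echo "tokens held: $(tokens | count_tokens)"
fi
```

A result of 0 corresponds to the unauthenticated state shown above.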
The first thing you should do is retrieve a long listing of the /afs/ directory. It shows all AFS cells known to your AFS client. Now, change into the directory /afs/openafs.org/software/openafs and do a directory listing. You should see this:
% ls -l
total 10
drwxrwxrwx   3 root root  2048 Jan  7  2003 delta
drwxr-xr-x   8 100  wheel 2048 Jun 23  2001 v1.0
drwxr-xr-x   4 100  wheel 2048 Jul 19  2001 v1.1
drwxrwxr-x  17 100  101   2048 Oct 24 12:36 v1.2
drwxrwxr-x   4 100  101   2048 Nov 26 21:49 v1.3
Go deeper into one of these directories. For example:
% cd v1.2/1.2.10/binary/fedora-1.0
Have a look at the ACLs in this directory with:
% fs listacl .
Access list for . is
Normal rights:
  openafs:gatekeepers rlidwka
  system:administrators rlidwka
  system:anyuser rl
This shows that two groups have all seven possible privileges: read (r), lookup (l), insert (i), delete (d), write (w), full file advisory lock (k) and ACL change right (a). The special group system:anyuser that comes with AFS has read (r) and lookup (l) rights, which allow read access to literally anybody.
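To make the rights strings easier to read, a small shell function can expand each letter into its name. This is an illustrative sketch only; decode_rights is a hypothetical helper and not part of the AFS command suite.

```shell
# decode_rights: expand an AFS rights string such as "rlidwka"
# into space-separated names. Hypothetical helper.
decode_rights() {
    rights=$1
    out=""
    while [ -n "$rights" ]; do
        rest=${rights#?}              # everything after the first letter
        letter=${rights%"$rest"}      # the first letter itself
        case $letter in
            r) name=read ;;
            l) name=lookup ;;
            i) name=insert ;;
            d) name=delete ;;
            w) name=write ;;
            k) name=lock ;;
            a) name=administer ;;
            *) name="unknown($letter)" ;;
        esac
        out="${out:+$out }$name"
        rights=$rest
    done
    printf '%s\n' "$out"
}

decode_rights rlidwka
decode_rights rl
```

The first call prints all seven rights held by the two administrative groups; the second prints the two rights granted to system:anyuser.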
To list the members of a group, use the pts (protection server) command:
% pts member openafs:gatekeepers -cell openafs.org -noauth
Members of openafs:gatekeepers (id: -207) are:
  shadow
  rees
  zacheiss.admin
  jaltman