Economical Fault-Tolerant Networks

We present a software solution which achieves fault tolerance by capitalizing on redundant replication of data and elimination of any single point of failure and with transparent switchover.
Security

If unchecked, any alien machine can generate an election call and the servers will move into an unwanted election state. For this reason, security and integrity of the signals exchanged between servers is highly stressed. We have implemented three levels of security for safe operation.

  • Level 1: a list of all servers taking part in the election is maintained at all servers. Only signals from these machines are accepted; all other messages related to the election process are discarded.

  • Level 2: the servers are state-oriented. That is, they acquire certain states, e.g., election state, master state, etc. In every state, some signals are anticipated and only these are accepted. Any other signal received, even originating from listed servers, is discarded.

  • Level 3: an encryption scheme is used to encrypt all the signals. A random key is created for encryption and is valid for only one message. Therefore, even if a signal is intercepted and cracked, the encryption key will not be valid at any other time.

Results

For our implementation, we empirically found that weekly synchronization of large data suffices, while the password databases are replicated on every election. Other scenarios may require a more frequent synchronization.

The level of reliability and fault tolerance of the cluster increases in proportion to the size of the cluster. Increasing the number of servers increases the maintenance time for synchronization and unduly lengthens the election process. In our situation, we determined from experimenting that three to four servers are enough to guarantee a practical working solution.

On average, the slave servers are negligibly loaded, while the master servers are not loaded for more than 0.1% of the CPU usage. Network load is also very low, except during heavy synchronization, which is therefore run as a scheduled process.

Our test bed for this implementation is the Digital Computer Laboratory, UET, Lahore, Pakistan. The lab consists of 10 Pentium-based servers and 60 diskless workstations connected by 10MBps Ethernet.

Future Directions

There is a need for development and implementation of techniques that provide for an immediate synchronization. One method could be that whenever a file is updated, all the servers update their versions of the file. In this way, all data on all servers at all times is perfectly synchronized, thus eliminating heavy network/server loads during scheduled synchronization.

Conclusions

This technique is a practical and feasible implementation of fault tolerance for low-budget LANs running open-source operating systems, such as in developing countries and resource-scarce academic institutions where expensive commercial solutions are just that—expensive.

Acknowledgements

Resources

Ali Raza Butt (drewrobb@mediaone.net), is a final-year student of EE-Communication Systems at UET, Lahore, Pakistan. Ali and the other authors are project group fellows for implementing this work and system administrators of the computer lab. Ali loves to program for network management and build PC-controlled gizmos.

Jahangir Hasan is a final-year student of EE-Communication Systems at UET, Lahore, Pakistan.

Kamran Khalid is a final-year student of EE-Communication Systems at UET, Lahore, Pakistan.

Farhan-ud-din Mirza is a final-year student of EE-Communication Systems at UET, Lahore, Pakistan.

______________________

Comments

Comment viewing options

Select your preferred way to display the comments and click "Save settings" to activate your changes.

Re: Economical Fault-Tolerant Networks

Anonymous's picture

Raza Bhai please murad bhai ko bhi kuch samjha deen....He is such a nela:::so please guide him and lead him:::we will be very thanks full to u:::thanks::From Lahore Office ...<|_|_| | Hafiz

White Paper
Linux Management with Red Hat Satellite: Measuring Business Impact and ROI

Linux has become a key foundation for supporting today's rapidly growing IT environments. Linux is being used to deploy business applications and databases, trading on its reputation as a low-cost operating environment. For many IT organizations, Linux is a mainstay for deploying Web servers and has evolved from handling basic file, print, and utility workloads to running mission-critical applications and databases, physically, virtually, and in the cloud. As Linux grows in importance in terms of value to the business, managing Linux environments to high standards of service quality — availability, security, and performance — becomes an essential requirement for business success.

Learn More

Sponsored by Red Hat

White Paper
Private PaaS for the Agile Enterprise

If you already use virtualized infrastructure, you are well on your way to leveraging the power of the cloud. Virtualization offers the promise of limitless resources, but how do you manage that scalability when your DevOps team doesn’t scale? In today’s hypercompetitive markets, fast results can make a difference between leading the pack vs. obsolescence. Organizations need more benefits from cloud computing than just raw resources. They need agility, flexibility, convenience, ROI, and control.

Stackato private Platform-as-a-Service technology from ActiveState extends your private cloud infrastructure by creating a private PaaS to provide on-demand availability, flexibility, control, and ultimately, faster time-to-market for your enterprise.

Learn More

Sponsored by ActiveState