Interview with the MAASK Team
Several barriers exist in the world of clustering, and they need clever solutions. One of them concerns extending shared memory across the nodes of a cluster, also called distributed shared memory (DSM). Using this method, any process that uses shared memory for interprocess communications (IPC) no longer is tied to a single node and is free to roam (read: migrate). Such a solution, MigShm, now exists for openMosix.
MigShm was created by five female college students from India. MAASK stands for Maya, Anu, Asmita, Snehal and Krushna, the five authors. They created it as their final project at Cummins College of Engineering, University of Pune. MigShm quickly drew responses from people in the openMosix community and showed a bright future. Although Dr Moshe Bar still considers MigShm to be a separate add-on, he would like MigShm to develop to a more mature level. Sometime in the future, MigShm most likely will be fully merged with openMosix to extend openMosix's features. Now, let's hear MAASK speak about its projects and ideas.
Linux Journal: What is the educational background of the MAASK team?
MAASK: Maya did her schooling in Aurangabad and the rest of us studied here in Pune. After finishing with junior college, we all received Bachelor of Engineering degrees in Information Technology, from the Cummins College of Engineering at the University of Pune.
LJ: What made you become interested in computer science in general and Linux specifically? Many female students seem to be more interested in economics or management--no offense meant.
MAASK: With computer technology advancing rapidly, it is obvious that more and more people will get involved in the software industry. In our early days, the concept of free software attracted us. We struggled a bit with Linux initially, but when we got into the core of it, we were simply thrilled.
The best part is that you have the source with you and you can play with it. That is the best way to learn any system. The global user/developer community, the number of projects on top of it (such as openMosix), the kernel versioning system, coding style--everything about it is just so cool. It lacked a few things, such as a true threading model, a preemptive kernel and so on, but with the 2.6 release these issues no longer exist. Also maybe some work needs to be done on the GUI front to attract more users. With so many people working on various distributions, this front also should improve. And who said women can't enter and succeed in this field?
LJ: In India, the IT field is growing fast. Can you tell me some of the latest news about the IT field in India? Does the Indian government have specific programs to support IT development?
MAASK: The Indian government actively is involved in the promotion of the IT sector, because IT now is one of our largest sources of foreign revenue generation. There are various software technology parks (STPs) being set up in various cities--Pune, Bangalore, Hyderabad and Chennai being at the forefront--equipped with the latest infrastructure.
Even more exciting for Linux users, the government recognizes that in a developing country such as ours, the introduction of a free OS will galvanize penetration of PCs. Our President, Dr. A. P. J. Abdul Kalam, recently gave a speech in which he encouraged government agencies as well as individual users to actively use open-source software.
The Indian government has a separate department for the information technology sector that comes under the Ministry of Communications & Information Technology. Further details about this department and the Indian government's role in IT can be found at www.mit.gov.in.
LJ: Many Indian citizens go abroad (US, Europe and other Asian countries) to develop careers in IT. Is this because few jobs are available for them inside India? Or are there other reasons?
MAASK: No. There are good IT jobs here in India. In fact the trend is changing now. Many software MNCs are starting their operations here in India, and with the advent of global Indian players, such as Infosys, Wipro and TCS, good, challenging jobs now can be found here as well.
LJ: Does MAASK use Linux every day? What is your general comment about GNU/Linux and open source?
MAASK: Of course! We dig Linux and the Open Source community! GNU/Linux and other free software movements are doing a great job. The number of Linux users and developers is increasing exponentially in India, and they also are contributing to open source.
LJ: What is your primary programming language; specifically, what did you use for your college projects? Why did you choose it?
MAASK: We generally use C. It's a powerful language and ideal for Linux projects. We have been using it for kernel programming and driver development in Linux.
LJ: Can you tell me the reason why clustering became your project? Much clustering middleware is available, so what made you join the openMosix community?
MAASK: We wanted to work on the latest technology, and clustering was a concept that appealed to all of us. We believe that clustering and/or other related upcoming initiatives, such as grid computing, represent the future of computing. We came across other clustering technologies, but we found openMosix to be the most impressive. Its most striking features are its transparency and dynamic load balancing. And soon after joining the mailing lists, we realized how active and helpful the openMosix community is.
LJ: In terms of openMosix, there are many things to improve, including DSM, socket migration, checkpointing and so on. Why did you choose to work on DSM? Do you have any previous experience with DSM, such as TreadMarks?
MAASK: Our main aims in taking up this project were to learn about Linux kernel internals and openMosix internals and to solve a real problem. We saw that many users were asking for a DSM on openMosix and thus we started with MigShm. We did study a lot of DSM systems and TreadMarks was one of them. But, unfortunately, none of us has worked with TreadMarks or any other DSMs.
LJ: Kernel programming is still considered one of the most difficult aspects of programming, at least by me. How does MAASK conquer it? Any special tips and tricks to share with others?
MAASK: Yes, kernel programming is a bit difficult initially, but once you get into the groove, it is thrilling. An ideal thing to do is to read the code and experiment with it by modifying it to try new things. When it comes to kernel coding, the best way is to get it right the first time. Otherwise printk is your friend. There also are good debuggers, such as kgdb.
LJ: You've heard about User Mode Linux, right? Do you use User Mode Linux to ease the debugging phase?
MAASK: We know about User Mode Linux. It's an amazing concept and very useful too, but we haven't had a chance to use it for debugging.
LJ: In my opinion, there is a possibility for User Mode Linux to take advantage of MigShm. By doing so, anyone could create extensive virtual clusters on top of real clusters, for example, experimenting with several types of topology for parallel communication. What do you think about such an idea?
MAASK: Did you mean having something like User Mode openMosix? About experimenting with weird topologies on a virtual cluster...sounds interesting.
LJ: Many commercial products use shared memory, including database servers. Do you have any plans to make MigShm strongly support database servers or the like? Does it require some special tuning so database servers can work together perfectly with MigShm?
MAASK: Typically database servers use shared memory for keeping metadata or temporary data in memory. Unfortunately, no existing database servers can benefit from MigShm, because most of them use threads, something MigShm currently does not support. MySQL uses POSIX threads. PostgreSQL doesn't use threads, but it doesn't follow some other MigShm constraints. Any application that satisfies all the MigShm constraints will work successfully with MigShm.
LJ: You said MigShm is your first project in the Linux kernel area. Can you tell us some of your experiences, starting the MigShm project? How did you split the tasks among the MAASK team?
MAASK: The whole thing started with a lot of zeal and enthusiasm. We had great guidance from Mr Amit Shah and the extremely talented team from Codito Technologies, the place where we worked on MigShm. We split the work among us and started on several things simultaneously: studying the Linux and openMosix source code, reading the documentation for Linux and openMosix, studying several DSM implementations and their issues and so on. The initial phase was one of learning, reading and discussing. The brainstorming sessions that we had with the people at Codito were of immense help. Later on came the coding and testing, and then we released the first version of MigShm. It was a thrilling experience.
LJ: Your mentor is Mr Amit Shah, right? I read his CV and he really is a great person. Can you tell me about working with him to create MigShm?
MAASK: Amit is a very nice person. It was a great experience working with him. You can call him our friend, philosopher and guide.
LJ: You mention that Codito gave some support to the MigShm project. Exactly what is the core business of Codito? Did you send a sponsorship proposal to Codito for funding this project? Any plan to join Codito?
MAASK: Yes. Every year Codito supports two or three project groups. The Codito team gave us immense support and extremely valuable guidance throughout this project. We did our project on the Codito premises. The company's core business, though, does not relate to our project. You can get more details about its work on the company Web site.
LJ: I looked at your notes on the project and noticed that there are difficulties in migrating POSIX threads. Can you give some deeper explanations about this? Does this mean that, in the meantime, there will be no perfect DSM solution for shared memory applications?
MAASK: Well, MigShm and the concept of DSM basically are meant for shared memory applications. Migration of clone() threads simply was an addition, and the next step was POSIX threads. The basic problem is that programs using pthreads do not use Linux semaphores for locking, which the thread migration module of MigShm requires in order to handle consistency. Because this requirement is not satisfied, migration of pthreads is not allowed. But these issues are not really related to the DSM part of MigShm.
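To make the locking requirement concrete, here is a minimal sketch in C of the style of program the answer describes: cooperating processes that share a System V memory segment and guard it with a System V semaphore. It assumes that "Linux semaphores" refers to the semget()/semop() family, which the interview does not spell out. The function and type names (shared_counter_demo, sem_change, semun_arg) are made up for illustration and are not taken from MigShm.

```c
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/shm.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* On Linux the caller has to define the semctl() argument union itself. */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Apply delta to semaphore 0 of the set: -1 acquires, +1 releases. */
static int sem_change(int semid, short delta)
{
    struct sembuf op = { .sem_num = 0, .sem_op = delta, .sem_flg = 0 };
    return semop(semid, &op, 1);
}

/*
 * Fork nprocs children; each adds 1 to a counter in shared memory
 * iters times, holding the semaphore around every update.
 * Returns the final counter value, or -1 if the IPC setup fails.
 */
long shared_counter_demo(int nprocs, long iters)
{
    int shmid = shmget(IPC_PRIVATE, sizeof(long), IPC_CREAT | 0600);
    if (shmid < 0)
        return -1;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    long *counter = (semid < 0) ? (void *)-1 : shmat(shmid, NULL, 0);
    union semun_arg arg = { .val = 1 };   /* 1 means "unlocked" */

    if (counter == (void *)-1 || semctl(semid, 0, SETVAL, arg) < 0) {
        if (counter != (void *)-1)
            shmdt(counter);
        if (semid >= 0)
            semctl(semid, 0, IPC_RMID);
        shmctl(shmid, IPC_RMID, NULL);
        return -1;
    }
    *counter = 0;

    int started = 0;
    for (int i = 0; i < nprocs; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                        /* stop forking on failure */
        if (pid == 0) {
            for (long k = 0; k < iters; k++) {
                sem_change(semid, -1);    /* enter critical section */
                (*counter)++;
                sem_change(semid, +1);    /* leave critical section */
            }
            _exit(0);
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        wait(NULL);

    long result = *counter;
    shmdt(counter);
    shmctl(shmid, IPC_RMID, NULL);        /* remove the segment */
    semctl(semid, 0, IPC_RMID);           /* remove the semaphore set */
    return result;
}
```

Calling shared_counter_demo(4, 1000) should return 4000 when every fork succeeds, because each increment happens while the semaphore is held. A pthreads program typically would protect the same counter with a pthread_mutex_t instead, which is the difference the answer points to.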
LJ: You've heard about Red Hat's NPTL (Native POSIX Thread Library), right? Or maybe you've tried it. What is your opinion of it? Is there any possibility that the structure of NPTL makes the migration of shared memory applications easier? Or, on the contrary, does it make the process harder?
MAASK: Yes, we did some investigation of NPTL while working on the thread migration module. The structure of NPTL should not affect migration of shared memory applications.
LJ: In DSM, some memory regions are accessed by processes/threads almost all the time, while others are accessed rarely. In other words, some areas are strongly coupled with processes. Migrating the processes in this scenario increases the work of keeping them synchronized. What is your method for handling this kind of scenario--let's assume those regions receive 95% of the overall accesses?
MAASK: MigShm nicely takes care of such a scenario. First, only dirty pages are synchronized to the owner copy of the DSM. As an optimization, MigShm also logs the access patterns of all processes and identifies strongly linked processes and weakly linked processes. This information is used when a process is migrated, so as to minimize remote accesses to the segment and thus reduce network traffic. More details on this can be found in the project report on the MigShm Web site.
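The first optimization mentioned, synchronizing only dirty pages to the owner copy, can be illustrated with a small user-space sketch in C. This is not MigShm's kernel code: the replica structure, the function names and the fixed 4,096-byte page size are assumptions chosen only to show the idea.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { SKETCH_PAGE_SIZE = 4096 };   /* assumed page size for this sketch */

/* Hypothetical local replica of a shared segment with per-page dirty flags. */
struct replica {
    unsigned char *data;   /* npages * SKETCH_PAGE_SIZE bytes */
    bool *dirty;           /* one flag per page */
    size_t npages;
};

/* Write into the replica and mark every page the write touches as dirty. */
void replica_write(struct replica *r, size_t off, const void *src, size_t len)
{
    assert(len > 0 && off + len <= r->npages * SKETCH_PAGE_SIZE);
    memcpy(r->data + off, src, len);
    size_t first = off / SKETCH_PAGE_SIZE;
    size_t last = (off + len - 1) / SKETCH_PAGE_SIZE;
    for (size_t p = first; p <= last; p++)
        r->dirty[p] = true;
}

/* Copy only the dirty pages to the owner copy; return how many were sent. */
size_t sync_dirty_pages(struct replica *r, unsigned char *owner)
{
    size_t sent = 0;
    for (size_t p = 0; p < r->npages; p++) {
        if (!r->dirty[p])
            continue;                 /* clean page: nothing to transfer */
        memcpy(owner + p * SKETCH_PAGE_SIZE,
               r->data + p * SKETCH_PAGE_SIZE, SKETCH_PAGE_SIZE);
        r->dirty[p] = false;          /* page is clean again after the sync */
        sent++;
    }
    return sent;
}
```

In a real DSM the memcpy() in sync_dirty_pages would be a network transfer, so skipping clean pages directly reduces traffic: a 10-byte write that straddles a page boundary costs two page transfers rather than a copy of the whole segment.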
LJ: Ordinary programs or processes see memory across the cluster as separate blocks. You have made a good start by implementing DSM as a kernel patch on top of openMosix. But the current implementation still lacks several things, such as support for POSIX threads. Do you think this is a sign that we need a more comprehensive solution for taking care of complete DSM, especially if we want to leave the programs unmodified? What is your proposal for overcoming these limitations?
MAASK: As we said before, threads have no relation to DSM. And MigShm is a patch for shared memory; thread migration was an additional feature. It doesn't migrate POSIX threads due to the synchronization primitives they use. Currently, there is no thought of implementing this.
LJ: In my opinion, MigShm would be more useful if it were loosely coupled to openMosix. In other words, MigShm should be shaped as an independent kernel patch, apart from openMosix. In this way, people who do not need the entire capability of current openMosix can use only MigShm. Any opinion on this?
MAASK: Well, MigShm is a patch to the openMosix kernel and not directly to the Linux kernel. It's not a clustering technology in itself but an additional feature to the existing openMosix clustering technology. So openMosix has to be there to have MigShm.
LJ: Related to the previous question, are there any plans to port MigShm to other platforms, say *BSD, MacOS or others?
MAASK: Yes, we might port the thread migration part of MigShm to OpenSSI clusters. OpenSSI supports migration of shared memory processes but not of threads. We will be starting the work shortly.
LJ: When we talk about clustering, we always closely look at communication latency between nodes. In terms of DSM, I bet that this will be the primary thing, for example, when you need to synchronize a block of memory that is accessed by several remote processes. How can we reduce it? How much would a caching algorithm help? Do we need an advanced medium, such as InfiniBand, to accommodate busy DSM?
MAASK: Well, this truly is an important issue. MigShm handles this with some optimizations, including migration of the shared memory ownership itself, depending on the access frequency of the several processes using it. As for an advanced medium such as InfiniBand, it definitely would help, but it isn't necessary.
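One way to picture ownership migration driven by access frequency is the following sketch in C. It assumes per-node access counters for a single segment and moves ownership to the busiest node. The structure, the function name and the tie-breaking rule are illustrative assumptions, because the interview does not describe MigShm's actual policy.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-node access counters for one shared memory segment. */
struct node_stats {
    int node_id;
    unsigned long accesses;
};

/*
 * Return the node that should own the segment: the node with the most
 * recorded accesses. The current owner keeps the segment on a tie, so
 * ownership does not move without a clear gain.
 */
int pick_owner(const struct node_stats *nodes, size_t n, int current_owner)
{
    assert(nodes != NULL && n > 0);

    unsigned long best = 0;
    int owner = current_owner;

    /* Start from the current owner's count so that ties do not move it. */
    for (size_t i = 0; i < n; i++)
        if (nodes[i].node_id == current_owner)
            best = nodes[i].accesses;

    for (size_t i = 0; i < n; i++) {
        if (nodes[i].accesses > best) {
            best = nodes[i].accesses;
            owner = nodes[i].node_id;
        }
    }
    return owner;
}
```

With counters of 50, 950 and 0 for nodes 1, 2 and 3 and node 1 as the current owner, pick_owner returns 2: the segment moves to the node responsible for 95% of the accesses, and most of the remaining accesses become local.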
LJ: After you have finished your studies, what are your future plans?
MAASK: Currently, Maya is pursuing an MS in Technology Management at UMIST, Manchester. Anu also is pursuing her MS at SUNY Stony Brook in the US. Asmita is working with VERITAS Software in Pune, India. Snehal is working with Cognizant Technology Solutions in Pune, India. Krushna is working with Persistent Systems Private Ltd. in Pune, India.
Yes, we do dream of setting up MAASK Technologies some day, once we have done some other work, completed further studies and gained enough experience.
LJ: I'm sure that besides clustering and kernel work, each member of MAASK has her own interests.
Maya: I love eating, sleeping and partying. I also read a lot...anything and everything.
Anu: I love painting, reading books and poetry, swimming and making friends. I also write poems.
Asmita: I make portraits in pencil. You can see some of them at absolutearts.com/portfolios/n/neetasmile. The link has not been updated for quite some time, though. Apart from that I love dancing, trekking, playing outdoor sports and traveling.
Snehal: I like playing sitar, an Indian musical instrument. Apart from that, my interests include reading books and traveling.
Krushna: I like swimming, dancing, listening to Hindi music and reading non-fiction books. Also, I am a food freak and like hanging out with friends.
LJ: Who are your idols?
Maya: My grandfather.
Anu: Indira Gandhi.
Asmita: Dr APJ Abdul Kalam, President of India.
Snehal: Kiran Bedi.
Krushna: Myself. Seriously speaking, Bill Gates. He managed to make so much money in spite of the number of bugs in Microsoft products. A great businessman.
LJ: Open source is creating teamwork all over the world without boundaries. In this way, the only thing that matters is how and what you can contribute to an open-source project. Do you think this still is feasible in a world filled with war, political intrigue and the like?
MAASK: Why not? I mean there is some sort of war going on always, somewhere in the world. If someone wishes to contribute, he/she will contribute anyway.
LJ: Indonesia and India share some common characteristics, such as agricultural economies and social culture. They also share some IT similarities. Maybe this is true of most Asian countries. Do you have any advice or suggestions about how Asian people can play a major part in IT growth worldwide?
MAASK: For the growth of IT, it is very important to start with research projects or technical work at an early stage. Apart from hard work and technical skills, proper guidance is a must. Also, a good response from people in the field, say the Open Source community in the case of open source projects, acts as a further motivating factor.
LJ: If you were given the choice of making MigShm closed source and getting a lot of money or keeping it open source, what would you choose? Why?
MAASK: We already had the choice. And as you can see, MigShm is released as open source under the GNU General Public License. We chose this option because we all believe in the free software movement.
Mulyadi Santosa is a freelance writer and IT consultant. His specialties are high performance computing, clustering and networking.