Interview with Google's Sergey Brin
Google, http://www.google.com/, is the hottest search engine being used on the Internet today (see Doc Searls' “Google Gains...” article, page 10). It's fast and consistently returns relevant links. The company was founded in 1998 by Sergey Brin and Larry Page. The two collaborated on a new search engine technology called PageRank™. Since then, Google has gathered quite a following (Yahoo! recently hired Google to power its search engine) and now boasts the ability to link to over 1 billion URLs. I talked briefly with Sergey Brin, Google's co-founder and President.
Jason: What led to Google's decision to use Linux? When did that start?
Sergey: Well, Larry Page and I were in the Stanford PhD program in Computer Science. And we developed Google there. The way the computer science program worked, there was a hodgepodge of computer equipment lying around, and we would grab whatever scraps we could. We had all kinds of computers: HPs, Suns, Alphas and Intels running Linux. So, we gained a lot of experience with all of those platforms.
When we started Google, we had to make the decision of what we wanted to use. Of course we chose Linux, because it is the most cost-effective solution.
PCs are not only much cheaper these days, but we can also get them very quickly, because they're such a commodity item. That's an incredible benefit. We just installed another 1,000 computers and we got that done in a few weeks. That's really hard to do with any other kind of workstation. I think that's an advantage that people don't entirely realize.
Jason: Did you view it as being better, or was cost the main reason?
Sergey: It was better in some ways. Certainly for our purposes, we felt the support was better. For example, the actual kernel authors will respond to problems pretty quickly. They are especially responsive to Google nowadays, since we're so widely used. We can have a 15-minute turnaround. You can't really beat that for support.
That was an important factor, but frankly, the cost was a bigger issue. PCs are so cheap, which is very important. Sun's Solaris is probably more stable than Linux on PCs. It's hard to determine the blame, whether it's the hardware or the operating system. But, it's a minor difference.
Jason: Then, does all of your support come from newsgroups or do you actually pay for it through Red Hat?
Sergey: We have an operations team of about ten people, which helps a lot. And other than that we check newsgroups and e-mail the authors of the code. Usually, if it's a problem we can't figure out, we go straight to the authors.
Jason: Is Linux used on desktops at Google?
Sergey: It depends. Engineering mostly runs Linux. Business development/marketing runs Windows. Actually, I use Linux with VMware running Windows. Some people have two computers, particularly some people in engineering who do UI development and need to test things out on Windows platforms. I find it better to just use VMware and have one computer.
Jason: In a technical sense, what does Linux lack? What does it not provide?
Sergey: The 64-bit file system, which I know they are working on. It's slowly coming around. I think there are still occasionally some stability issues. I'm not saying Linux is unique in that respect, but you definitely want to have reliability. There are some issues dealing with higher memory systems. If you get to 2GB and try to push it past that, you encounter various problems. I know we've had some trouble with the network stack when we really push it hard, in terms of having lost most connections from lots of different machines.
Jason: Well, you're getting quite a few hits per day, aren't you?
Sergey: Yes, we are. We do about ten million searches per day at Google.com. And another six million or so from OEM customers. So, we get a lot of hits. And when we crawl the Web, we crawl it pretty quickly, which can really stress the system.
Jason: Has your system been down entirely?
Sergey: No, but we certainly have individual computers go down. Our system has a lot of redundancy built into it, so the users don't see it from the outside.
Jason: I've read that you have developed your own network installation tools ...
Sergey: Yeah. We've re-used various components of things that people have built, but we've had to re-do them quite a bit ourselves. We have 5,000 computers now, and that's actually a fair amount of work to install. So we have our own network install system, where we can bring up 80 computers at a time. And we have our own testing software and monitoring tools to keep track of what the computers are doing, what state they're in. So, we've had to do a fair amount of development.
Jason: Of the 5,000 computers used by Google, can you roughly break down what they are used for, i.e., 3,000 perform searches, 1,000 do OEMs, 500 do web crawling, etc.?
Sergey: Without giving specific numbers, we can say approximately 80% of the machines are used for performing searches (google.com and partners); about 10% of the machines are used for research and development, and another 10% of the machines are used for preproduction (crawling and indexing the Web).
Jason: Are the tools worth releasing to the Open Source community?
Sergey: That's an interesting question. I mean, I don't know of too many installations that are of comparable size to ours, but it certainly is, now that you mention it, something we would consider. I don't think that any of them are robust enough or clean enough at this point in time. But, I think we can get them to that state if other people would take over the maintenance and contribute. I just don't think that there are too many people who would end up using them.
Jason: Could you briefly tell us something about yourself and how you came to work at Google?
Sergey: I was born in Moscow and came to the United States at the age of six. I grew up in Maryland, then went to the computer science program at Stanford. I started there in 1993, where I worked on data mining, which basically involves analyzing vast amounts of data to find interesting correlations and patterns. Then Larry joined in 1995. He started downloading the Web and we analyzed its link structure. We've worked together from then on.
Jason: Well, thanks so much for your time. Take care.
Sergey: Thank you.