Friday, April 28, 2006

Microsoft is building a Google cluster

Benjamin Romano at the Seattle Times reports that Microsoft "plans to plow perhaps $2 billion more than expected -- a meaningful sum even for the world's largest software company -- into new technologies, marketing for its most significant wave of product launches in a decade, and the fight for online supremacy against Yahoo! and Google."

This spending is an explicit part of Microsoft's strategy in the search war. In a Fortune article, Microsoft CTO Ray Ozzie said that the cost of building these massive online clusters is a huge barrier to entry and that "the people who could build a viable [Web] services infrastructure of scale are companies that have both the will and the capacity to invest staggering amounts of money."

Microsoft is trying to build a Google-sized cluster, belatedly recognizing that massive computing resources are "major force multipliers" for those who have them and an insurmountable barrier for those that do not.

As Ozzie said, few others have the resources to build this massive online computing infrastructure. Who else can build, maintain, and exploit a cluster of millions of servers? Who else can spend the billions required? Not Amazon. Not Ask. Not any venture-funded startup. Probably not Yahoo.

The search war is now an arms race. The buildup in computing power for the battles ahead will be remarkable to watch.

16 comments:

mb said...

Microsoft has enough cash to win the arms race, but does it have the know-how?

Google's been building a massive, globally distributed network of datacenters for years, using commodity hardware. They developed the Google File System and Bigtable to manage petabyte-sized data sets.

If Microsoft is just now starting to invest the billions required to build this out, how much of an engineering head start do you think Google has?

Greg Linden said...

Good point, mb.

Dr. Chadblog said...

What about the whole P2P indexing approach? Couldn't that at some point provide another viable competitor?

Greg Linden said...

Hi, Chad. That would be cool, wouldn't it? If all the excess CPU, disk, and bandwidth on idle machines everywhere could be repurposed as a massive general purpose cluster?

The progress so far toward this goal has been impressive, with successful large-scale systems like BitTorrent for file sharing and SETI@home for idle CPU.

But, it is still a long way from a Google-like cluster, and many nasty challenges stand in the way of building a more general purpose, P2P, Google-like cluster.

Sure is a fun idea though.

Tom said...

Will it be running Windows?

Anonymous said...

Two points:

1) Why bother building this thing if
it's not attracting any users? Most
of the scale that's required scales
linearly with user requests. You
don't need a billion to run the Google
File System (or whatever MS's version
will be).

Without user requests, Microsoft is
just building a research machine a la
Deep Blue. Very cool, but no reason
for Goog to worry.

2) Tom's got a good point. I wonder
what MSN's uptime/admin costs are like,
assuming they run Windows.

Anonymous said...

excellent point by anonymous. Microsoft does not get the traffic that can justify the amount of infrastructure needed...

So, it is either one of the following:

1) they expect search traffic to grow a lot in the next year???

2) they will essentially provide "unlimited" free online storage along with windows vista

1) is improbable because investments are not done that way...

Any comments on these? What are some of the other R&D items that $2B can buy? What's the shock-n-awe going to be? How they are going to get a return is a whole other question.

Anonymous said...

(Anonymous #1 here)

I'd guess that Anonymous#2 is on the
right track. MS wants to offer an
online product that will be somehow
integrated with the huge
Windows deployed base.

Or perhaps they want to offer an
advertising product coupled with
some kind of major promotion deal
with AOL, Dell, etc.

At any rate, neither R&D nor organic
user growth can justify that kind
of spending. They're using either
Windows or their cash to obtain
an immediate (and huge) installed
base of users for their online services.

It's the ability to obtain that
installed base, not the cluster
itself, that is a huge barrier to
entry for other firms.

Scott said...

Anonymous said: excellent point by anonymous. Microsoft does not get the traffic that can justify the amount of infrastructure needed...

I tend to disagree. Isn't Hotmail the #1 online email provider? (Maybe #2 behind Yahoo Mail, but it does have tremendous load requirements, I'm certain.) I agree MSN gets much less search traffic than Google, but I think it's a mistake to assume that MS doesn't attract a lot of users.

And, yes, I'm certain all of this would run Windows. There are some good videos on Channel 9 that talk about the infrastructure for Hotmail, Windows Update, etc.

Anonymous said...

(this is anonymous #2)

to scott:

well, microsoft is not spending in response to user-growth.

With email, Gmail changed the game and forced Yahoo and MS to spend more to match the storage. So, if that were the only increase, the capex increase for Yahoo (after subtracting Y's search growth) should be the same as the capex growth for MS. Clearly, it is not.

So, MS is building new stuff for all the *live services that they are going to provide...

here are some numbers for email:
(500G disks/1000MB quota) = 500 users per machine...excluding replication

so in order to support 100M MS users they need 100M/500 = 200K machines

add to that the compute power needed for all that ajax stuff.

200K*2000 per machine = 400M + datacenter costs + network-costs = 1B...the other billion may be for the other 200K machines :-)..what is that for??

just throwing out a bunch of numbers ...email is just one thing...online storage looks to be the next stop...
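
For anyone who wants to rerun the arithmetic above with different inputs, here is a minimal Python sketch. It uses only the round numbers from this comment (500 GB of disk per machine, a 1000 MB quota, 100M users, $2000 per machine). The function name `estimate` and the `replication` parameter are made up for illustration, and datacenter and network costs are left out.

```python
# Back-of-envelope restatement of the email estimate above.
# Every input is one of the comment's round numbers, not a measured figure.
DISK_PER_MACHINE_MB = 500 * 1000   # one 500 GB disk per machine (decimal units)
QUOTA_PER_USER_MB = 1000           # 1000 MB mailbox quota per user
USERS = 100_000_000                # 100M users
COST_PER_MACHINE_USD = 2000        # assumed hardware cost per machine


def estimate(users=USERS, replication=1):
    """Return (users_per_machine, machines, hardware_cost_usd).

    `replication` is a hypothetical knob: the number of copies of each
    mailbox kept. The original estimate excludes replication, i.e. uses 1.
    """
    users_per_machine = DISK_PER_MACHINE_MB // (QUOTA_PER_USER_MB * replication)
    machines = -(-users // users_per_machine)  # ceiling division
    hardware_cost = machines * COST_PER_MACHINE_USD
    return users_per_machine, machines, hardware_cost


if __name__ == "__main__":
    for copies in (1, 3):
        per_machine, machines, cost = estimate(replication=copies)
        print(f"replication={copies}: {per_machine} users/machine, "
              f"{machines:,} machines, ${cost:,} in hardware")
```

With `replication=1` this reproduces the 500 users per machine, 200K machines, and $400M figures; raising `replication` shows how quickly the machine count grows once copies are included.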

they are basically saying this - we have much more money...that is the one thing that we can do better than google/yahoo. Of course, they need all the software infrastructure too... I am assuming they are furiously working on that.

the problem with microsoft is that they can spend all this dough, but where is the return going to come from? display ads + search ads...

they can't stuff in a lot of display ads...people will be annoyed otherwise

they can't get many search ads...they don't have much traffic.

so in essence, what i am saying is that they are betting strongly on integrating vista with online storage...i.e. (buy vista and get unlimited *live access...however sucky it is). How does that sound?

Gen Kanai said...

Greg, the whole Rick Sherlund-Chris Liddell discussion is available on the Seeking Alpha transcript of the Microsoft earnings call.

http://seekingalpha.com/article/9705

Anonymous said...

Don't they have some experience in this with their TerraServer project? There's a good video on their system using SQL and cheap hardware.

http://channel9.msdn.com/ShowPost.aspx?PostID=50998#50998

Martindale said...

Microsoft wouldn't kill themselves.

Seriously, what web apps would Microsoft create, provide for free, and put on their servers to compete with Google?

Surely Web Outlook is great. But Gmail's better. So is Google Calendar.

What about Writely? Are you telling me that Microsoft is going to release a free web version of Microsoft Word?! That's their best-selling product! (At least Office is.) Microsoft's true power is in the Office suite. That's where they make their money. Would they really kill their biggest money maker?

Or is Microsoft completely rebuilding their business model?

Anonymous said...

>> massive computing resources are "major force multipliers".

No, they're not. Throwing hardware at the problem is usually a mistake, and today's $2 billion in HP servers will be in tomorrow's landfill.

Guys, we need better direction than "buy a butt-load of servers" to crush Google and Yahoo.

Anonymous said...

not joking - i wonder when higher energy costs are going to translate back into higher ad charges for all these players. "green" server tech is still relatively new, i am sure 99% of the hardware G/Y/M are running are the plain old power hogs, and the entire architecture of a colo is incredibly wasteful...only now are they looking at DC. the amount of power wasted on empty cycles is mind boggling...so far energy costs haven't made this an issue. how long can that last?

RobotsThink said...

Microsoft always gets unrealistic, but now they think they can fool users and analysts with their overambitious plans. The truth is as transparent as it looks: they will be killing themselves if they try to follow something like Google. I think they should concentrate only on their OS and other menial applications like office suites; otherwise Google will soon kick them out of there too.