Comments on Geeking with Greg: Insights into the performance of Microsoft's big clusters
Greg Linden (http://www.blogger.com/profile/09216403000599463072), updated 2024-01-15T13:17:33.771-08:00

2010-09-18T10:41:28.762-07:00
Bing does seem to use some pretty intensive learned ranking metrics (see the recent <a href="http://learningtorankchallenge.yahoo.com/workshop.php" rel="nofollow">ICML workshop</a> and the <a href="http://learningtorankchallenge.yahoo.com/presentations/yahooChallenge10.pptx" rel="nofollow">slides</a> from the Microsoft team). Those kinds of models tend to be memory-hungry, but I would guess they probably aren't the main cause of CPU load.
Mark (http://www.educatingsilicon.com/)

2010-09-17T23:15:37.686-07:00
Regarding the comment about being CPU bound due to "neural networks": this would only be the case if they were performing training each time, which wouldn't make any sense. A neural net is a supervised learning technique, and classification is fast and low cost.<br /><br />More than likely they are CPU bound because of the compression Greg commented on.
Anonymous

2010-09-07T07:22:28.741-07:00
Hi, Joe. 
I'd expect them to have their servers maxed out, but it is the way they are maxed out that I find surprising.<br /><br />The Cosmos servers should be streaming huge amounts of data as fast as they can off disk, but they aren't able to do that because the disk and network data have to wait for CPU to be free.<br /><br />The Bing servers should be quick in-memory caches, doing nothing but responding immediately to a request for index data, but instead are waiting on CPU.<br /><br />The Hotmail servers should be fast random access lookups for data, but instead spend all their time waiting on disk seeks.<br /><br />Yes, I would expect the servers to be maxed out, but it is the way they are maxed out that I think is odd and requires more explanation.
Greg Linden (https://www.blogger.com/profile/09216403000599463072)

2010-09-07T07:09:19.979-07:00
Maybe you should turn your questions around. Why wouldn't Microsoft want to have all of their servers maxed out?<br /><br />1) Given that CPU use is the de facto measurement of utilisation, they are actually running a very efficient operation. AFAIK, anything above 50% utilisation is pretty much gold standard.<br /><br />2) Windows. Seriously, we know they dogfood Windows as much as possible and we know that Windows carries a performance burden. Presumably they're running the fabled "MinWin", but it's still going to cost them.
joeharris76 (https://www.blogger.com/profile/09242541409318280541)

2010-09-04T07:39:21.137-07:00
Thanks, Eas and Checoivan, it's a good point that it may just be too expensive to use anything other than disk for Hotmail. 
Nevertheless, I'd like to see more discussion of that in the article, including some thought about alternatives. I'd particularly like to see more justification of why massive arrays of 40 disks make sense when the system appears to have an I/O bottleneck.<br /><br />By the way, Google apparently had problems with their mail system, Gmail, too, for exactly the same reason: they layered a workload requiring random access on top of a data storage layer built for sequential access. More details on that here:<br /><br /><a href="http://glinden.blogspot.com/2010/03/gfs-and-its-evolution.html" rel="nofollow">http://glinden.blogspot.com/2010/03/gfs-and-its-evolution.html</a>
Greg Linden (https://www.blogger.com/profile/09216403000599463072)

2010-09-04T02:11:39.599-07:00
For mail storage there is the email metadata and the blob data. You can't keep blob data in memory, since you're talking about multiple terabytes per cluster vs. a handful of gigabytes of RAM. Your suggestion does make sense for frequently accessed data, but email is unique to every single user. Even a chain mail that multiple people receive has very different header data per user, and people usually read a mail once and then archive or delete it. An email that a user keeps reading many times a day, so that it would be worth keeping in some sort of memcached storage, is not a common case (and is hard to predict).
checoivan (https://www.blogger.com/profile/07686370934380125691)

2010-09-04T02:07:14.340-07:00
My guess on the Hotmail servers: most of the data is ice cold, and the rest never gets accessed enough to be worth caching. I/O ops per GB of data is probably quite low. 
If they are hooking 40 disks up per server, it seems pretty clear that they are optimizing for storage cost. At the same time, all those spindles should allow for fairly high total I/Os per second, though I'd guess that the controllers would be the bottleneck.<br /><br />Not sure what to think about the rest of it.
Eas (http://Geekfun.com)

2010-09-03T23:43:22.057-07:00
Regarding Bing being CPU-bound: I attended a presentation by a researcher who did a project at Microsoft studying Bing's performance on a couple of different types of CPUs: low-power, low-performance and high-power, high-performance. I can't remember his name, unfortunately, but I think he was from Harvard. He said the reason Bing's index servers are so CPU-hungry is that they use neural networks to do ranking once the candidate results have been fetched. From the presentation it sounded like each query has a deadline and the CPU time devoted to each query is somewhat adaptive: using more cycles when they are available yields better ranking.
Anonymous

2010-09-03T18:59:08.144-07:00
Ah, right, I was worried about that. The description was "dual CPU socket, about 2-3 Gbytes per core of DRAM".<br /><br />I may have misinterpreted that. You're saying these are quad-core CPUs, for a total of 8 cores and 16-24 GB of memory. 
That makes more sense.<br /><br />Thanks, I'll add a note to the post.
Greg Linden (https://www.blogger.com/profile/09216403000599463072)

2010-09-03T17:33:33.314-07:00
2 CPUs and 2-3 GB/core does not mean 4-6 GB/server.
Anonymous
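
The correction in the last two comments comes down to multiplying through sockets and cores before applying the per-core figure. A minimal sketch of that arithmetic follows; the function name `memory_per_server_gb` is hypothetical, and the quad-core figure is the assumption made in Greg's reply, not something the quoted description states.

```python
def memory_per_server_gb(sockets, cores_per_socket, gb_per_core_range):
    """Return (low, high) total DRAM in GB for one server."""
    total_cores = sockets * cores_per_socket
    low, high = gb_per_core_range
    return total_cores * low, total_cores * high


# Assumption: quad-core CPUs, as inferred in the reply above. The quoted
# description only says "dual CPU socket, about 2-3 Gbytes per core of DRAM".
low, high = memory_per_server_gb(sockets=2, cores_per_socket=4, gb_per_core_range=(2, 3))
print(f"{low}-{high} GB per server")  # prints "16-24 GB per server"
```

Treating each socket as a single core (`cores_per_socket=1`) reproduces the 4-6 GB misreading that the first comment objected to.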