The sky’s the limit

... for our new open academic research partnership

Today we’ve made a major announcement that gives us a lot to be proud of and a lot to be excited about. As the Yahoo! Academic Relations team has traveled to the many great universities we work with, one of the requests we’ve heard most often is for faculty and students to have access to the kind of Internet-scale computing environment that is our bread and butter, but which is almost impossible to find on a university campus.

Well, after several months of hard work and a huge “OneYahoo!” effort by multiple teams, we’ve been able to deliver exactly that. M45, as it's called, is a 4,000-processor cluster supercomputer that runs Hadoop and other open-source distributed computing software. To put it in context, it's one of the fifty most powerful computers in the world. Today we begin our journey to make it available to the academic research community.

To get the ball rolling, we’ve engaged in a significant partnership with Carnegie Mellon University, which will be the first to benefit from our cluster and our large-scale distributed systems expertise, complementing their own incredible expertise in the theory and practice of distributed computing. This is a major first step for us on the road to facilitating a worldwide open-source software research program in real-world large-scale supercomputing environments. This is a first-of-its-kind effort in the industry. Instead of merely giving academics computers to run software applications for coursework, we will enable researchers to change the systems software that sits between the application and the hardware. By making the system open for experimentation and research at all levels, we will be helping the worldwide research community get to the next level in its understanding of large-scale computing systems.

As you may know, Yahoo! has been a leader in the open-source community with our contributions to Hadoop and now the incubation of the Pig parallel programming environment within the Apache Software Foundation. Given our interest in open collaboration, we can all engage in research on a common software base.

Given the growing popularity of Hadoop, Yahoo! and Carnegie Mellon also plan to co-host a Hadoop Summit in the first half of 2008, inviting major Hadoop users to participate in this open, collaborative community. Major companies such as Facebook and leading research universities such as the University of California, Berkeley, are heavy users of Hadoop. We would certainly like to invite them and others to participate in this open community.

We called our cluster “M45” after one of the best-known open star clusters (the Pleiades). It’s up and running Hadoop jobs, and with its 3 terabytes of memory and 1.5 petabytes of disk, we hope it will provide a major boost to the worldwide university research community. We love being out there in front and supporting the open-source community, and we are eager to reach for the stars with Carnegie Mellon, and soon, the entire academic computing research community.

I want to offer my personal thanks and congratulations to the many Yahoos from our Engineering, Site Operations, Research, Legal, PR, and Academic Relations teams for incredible work in getting this going.

Ron Brachman
VP, Worldwide Research Operations, Yahoo! Research
Head, Yahoo! Academic Relations

The Yahoo! M45 team