By Jeremy Geelan
October 3, 2009 07:00 PM EDT
"Ultimately, we believe that advancement in cloud computing technology will be driven by open source initiatives where large communities of engineers can collaborate and develop new code for the new applications and demands posed by the cloud model," says Shelton Shugar, SVP Cloud Computing at Yahoo! - and upcoming Keynote Speaker at SYS-CON's 4th International Cloud Computing Expo, in this Exclusive Q&A with Cloud Computing Expo Conference Chair Jeremy Geelan.
Jeremy Geelan: What are the chief advantages of Cloud Computing from the point of view of Yahoo!’s customers worldwide?
Shelton Shugar: Yahoo! has more than 500 million unique users per month across the world. Yahoo! Cloud services enable us to provide superior user experiences and deliver targeted content to our enormous audience. Examples include faster content access around the globe, real-time sports updates, a personalized homepage experience, targeted news feeds, geo-specific ads and many more.
In addition, Yahoo! Cloud technologies enable us to innovate faster based on common, global and scalable platforms, thus enabling consumers to gain access to innovative features and products faster than ever before.
As one of the largest providers of consumer Internet services in the world, Yahoo’s cloud operates at virtually unprecedented scale, making it a unique environment and testing ground for cloud computing technologies.
Geelan: You have been quoted as saying that “Cloud is pushing up the Operational Excellence curve” – what exactly do you mean by that?
Shugar: Almost no other company can boast of having to tune its infrastructure to deal with the technical requirements and high standards of performance that are involved in serving more than 500 million unique users per month across the world. To meet this challenge, Yahoo!’s Cloud includes a collection of infrastructure and functional services targeted at dramatically improving the company’s efficiency throughout the entire product development cycle, from gathering user feedback and insight, to feature testing and iteration to ongoing product operations.
Cloud technologies allow us to achieve higher agility and quality while maintaining scale to meet the needs of our users. When I say “Cloud is pushing up the Operational Excellence (OE) Curve,” I mean simply that cloud enables our developers to focus more on creating great products for our users and less on the “heavy-lifting” of building complex infrastructure.
Geelan: As we approach the zettabyte (and perhaps even yottabyte) age, can value really be extracted from the voluminous data that is now in existence? How is that possible?
Shugar: Yes, it’s possible using technologies such as Hadoop. Hadoop is an open source distributed file system and parallel execution environment. Yahoo! runs Hadoop on tens of thousands of servers, enabling us to process and extract value from massive amounts of data. Apache Hadoop is an open source project of the Apache Software Foundation. Yahoo! is the largest contributor to Hadoop technology as well as its largest user.
Yahoo! recognized that the next-generation Web-scale services demand large distributed systems, and a growing number of other companies and organizations are likely to need similar capabilities. In addition to contributing most of the Hadoop code base, Yahoo! provides open source resources to the academic research community enabling them to access Internet-scale supercomputers for conducting systems and applications research.
Geelan: So exploring data-intensive computing in industry is clearly gaining momentum; what specific initiatives has Yahoo! been taking to encourage expertise in Hadoop and progress toward faster supercomputers? (Just a few, maybe – I know there are many!!)
Shugar: Earlier this year, Yahoo! announced the Yahoo! Distribution of Hadoop, in response to frequent requests from the community. Yahoo! is opening up its investment in Hadoop quality engineering via the Yahoo! Distribution of Hadoop, which has been tested and deployed at Yahoo! on the largest Hadoop clusters in the world and is based entirely on code available from Apache Hadoop. By making the Yahoo! Distribution of Hadoop generally available, Yahoo! is contributing back to the Apache Hadoop community so that the ecosystem can benefit from Yahoo!’s quality and scale investments.
In addition to Hadoop, Yahoo! is investing heavily in other cloud-related technologies such as storage, distributed caching, and serving to solve the massive data-intensive computing challenges of serving our consumers.
Yahoo! also has very strong expertise in next generation cloud computing and data management technologies, and is leveraging its leadership in open source software, including Hadoop and Pig, to contribute to global, collaborative efforts around Internet-scale computing.
Over the past few years, we have made significant partnerships and have contributed technology to some of the leading research and development entities worldwide:
- November 2007: Deployment of a supercomputing-class data center, called M45, for cloud computing research (first deployed at Carnegie Mellon University).
- July 2008: Open Cirrus™ testbed formed with Hewlett Packard, Intel, the University of Illinois at Urbana-Champaign, the Infocomm Development Authority (IDA) in Singapore, and the Karlsruhe Institute of Technology (KIT) in Germany.
- April 2009: University of California at Berkeley, Cornell University and the University of Massachusetts at Amherst join Carnegie Mellon University to take advantage of Yahoo!’s cloud computing resources.
- June 2009: Participation at Open Cirrus Summit, three new sites to join Open Cirrus: the Russian Academy of Sciences, Korean Electronics and Telecommunications Research Institute (ETRI), and Malaysian Institute of Microelectronic Systems (MIMOS).
Geelan: When Yahoo! contributes to open source software such as Pig and Hadoop, what does the company get back?
Shugar: Yahoo! is a long-time supporter and contributor to open source software. Yahoo! is a platinum sponsor of The Apache Software Foundation, and the leading contributor to Hadoop to date, for example. Several members of Yahoo!'s development teams are active, long-term code contributors to Apache Hadoop, and we are committed to advancing the state-of-the-art in distributed computing through the incubation of new Apache projects.
We actively collaborate, and will continue to collaborate, with industry, academia and the open source community, including through our Open Cirrus consortium, our involvement with Hadoop, Pig and gradually other cloud-related technologies, and our support of Apache. We are contributing back to the community so that the ecosystem can benefit from Yahoo!’s quality and scale investments, renowned technologists and innovation in next-generation Web technologies.
Yahoo! benefits from the contributions of others to open source projects and is able to collaborate with and sometimes hire the most talented researchers and engineers in the world, based on their interest in large scale application of open source technologies.
Geelan: And how does helping developers help Yahoo! in bottom line terms?
Shugar: We believe that the developer community is a key component in making Yahoo! a success. The challenges the industry is facing today in terms of large-scale, global cloud solutions are bigger than any one company (big or small) is able to solve on its own. As we contribute to the community, we also learn from the community, and third party developers are a valuable resource helping to speed innovation.
Yahoo! is gradually open sourcing its cloud technology for this exact reason. We want to provide developers out there with an open source framework of scalable cloud technology that will enable them to build tools and solutions that in return will help Yahoo! (and the entire industry) to address these complex challenges.
In addition, we believe this ecosystem of solutions “powered by” the core cloud technology will directly benefit Yahoo! consumers. This will provide consumers with customized web-applications (targeted at solving specific user needs) faster and better than ever before.
Geelan: What is it technically that makes Hadoop so powerful for large scale data processing? Are there alternatives?
Shugar: Hadoop provides the software infrastructure used across Yahoo! for large scale distributed computing and back-end data analytics, including fighting spam in Yahoo! Mail, content optimization for the Yahoo! homepage, and better ad targeting based on data analysis.
Hadoop provides a framework that distributes data and processing across thousands of computers which, working in parallel, allow us to process and analyze enormous amounts of data very efficiently.
We find Hadoop the ideal solution for Y! based on its scalability and flexible programming environment.
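As a rough illustration of the distribute-then-combine pattern described in this answer, the sketch below splits a small data set into chunks, counts words in each chunk independently, and then merges the partial results. It is not Hadoop's API: the function names (`split_into_chunks`, `map_chunk`, `reduce_counts`, `word_count`) are hypothetical, and threads on a single machine stand in for the cluster nodes that a real deployment would use.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(records, num_chunks):
    """Partition the input so each worker gets its own slice of the data."""
    return [records[i::num_chunks] for i in range(num_chunks)]


def map_chunk(chunk):
    """Map step: count words within one chunk, independently of the others."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def reduce_counts(partials):
    """Reduce step: merge the per-chunk counts into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def word_count(records, num_workers=4):
    """Run the map step on every chunk in parallel, then reduce."""
    chunks = split_into_chunks(records, num_workers)
    # Threads are a stand-in for cluster nodes; this is an illustration only.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(map_chunk, chunks))
    return reduce_counts(partials)


if __name__ == "__main__":
    lines = ["spam offer spam", "news feed", "spam news"]
    print(word_count(lines, num_workers=2))
```

Because each chunk is processed without reference to the others, adding workers (or, in a real cluster, machines) lets the map step scale out, while the reduce step only has to merge compact partial results.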
Geelan: How about enterprises, are they going to be adopters of Hadoop too?
Shugar: Yahoo! was the first major technology company to adopt Hadoop. Since then, many other companies, large and small, have begun using Hadoop. You can find more information about how others are using Hadoop here: http://wiki.apache.org/hadoop/PoweredBy.
Geelan: What happens to Yahoo!’s Distribution of Hadoop and growth of the Apache Hadoop project overall in light of the deal with Microsoft?
Shugar: Hadoop is our de facto standard scalable data processing platform at Yahoo!. In addition to Search, Hadoop is used to support display advertising, content platforms, personalization, machine learning for filtering email spam, research, and all other large scale data analysis and mining.
We remain as committed as ever to developing and using Hadoop, and in contributing our code to the open source community. The Yahoo! Hadoop development team will be busy delivering on short and long term roadmaps.
We look forward to continuing to work with the wider Hadoop community to build an increasingly better Hadoop that will support Yahoo! and the industry’s needs in this area. We have by far the largest team of developers and testers working on the project and hundreds of internal customers who use tens of thousands of computers in large Hadoop clusters. We plan to continue to work on improving the Hadoop core to make it faster, more scalable, reliable and secure.
Geelan: What exactly is the Yahoo! Cloud Serving Platform?
Shugar: For the time being, Yahoo! is focused on developing a “private cloud” aimed at making the Yahoo! experience as extraordinary, effective and productive as possible for consumers and advertisers across the world. We see this as a multi-year effort that will provide significant advantages for Yahoo! now and in the future.
The Yahoo! Cloud Serving Platform provides Yahoo! developers, engineering and operations with a technology to build, test, deploy, and manage applications in an elastic cloud infrastructure that can grow and shrink based on changing workloads.
The cloud serving architecture is based on the following key concepts:
• Abstracts concerns of the underlying infrastructure and network communication from developers and operations
• Uses virtualization and commodity hardware
• Employs a declarative programming language for defining services and applications
• Automates the deployment process
• Standardizes software stacks and packaging
The Yahoo! cloud serving architecture represents a shift in the programming paradigm and, as such, can require a fundamental shift in application-design thought processes.
Geelan: Are interoperability and openness important in Cloud Computing, as far as Yahoo! is concerned?
Shugar: Ultimately, we believe that advancement in cloud computing technology will be driven by open source initiatives where large communities of engineers can collaborate and develop new code for the new applications and demands posed by the cloud model. Yahoo! is a leader in supporting open and collaborative research and development as a competitive advantage, enabling the open source community to drive forward the pace of innovation. We believe open source and collaborative innovation is the way to address the complex challenges that large-scale cloud solutions present. Yahoo! will continue to participate in and contribute to such efforts wherever appropriate.
Geelan: What other plans does Yahoo! have, moving forward, for helping deliver a cloud at the scale of the Internet?
Shugar: We are investing internally in further building out and deploying cloud computing technologies and services across the global Yahoo! operation so as to help our product teams innovate faster and deliver high-quality experiences to our customers across the globe.
Over time, we may consider exposing our cloud services in a more comprehensive manner through the Yahoo! Developer Network, which serves as Yahoo!’s front door for third parties seeking to engage with our developer tools and web services. However, we have nothing specific to share at this time.
Geelan: Lastly, 2009 has so far been a year of obvious challenges, from both a CapEx and an OpEx perspective, for anyone involved with Enterprise IT. So, what’s your top tip, as a seasoned software executive, to those other IT execs out there right now – especially CTOs of embattled start-ups who may be looking for some magic bullet to ensure they’re alive (and well) as a company in 2010?
Shugar: For startups and small companies, there are public cloud offerings, coupled with other commercial cloud vendors, that can provide a pay-as-you-go infrastructure, minimizing CapEx investments and reducing the risk in deploying new products. For larger or enterprise companies, a hybrid model that maintains existing compute infrastructure while leveraging public cloud offerings for less critical or experimental projects may provide more flexibility at reduced CapEx.