By David Smith
July 13, 2012 02:18 PM EDT
At a talk I saw at the useR!2012 conference last month, Googler Karl Millar estimated that there are at least 200 active R users at Google, plus another 300+ occasional users participating in Google's internal R support list. But what are all these Google employees doing with R? A post from the Google Research team published on Google+ yesterday sheds some light:
At Google we use Statistics every day to improve products, optimize infrastructure, and understand users. We’ve built a number of engineering systems to process and store massive amounts of data. These systems often use thousands of computers in parallel to process and manipulate the data. For many of our statisticians and data analysts, however, such systems provide only the first step of an interactive data analysis workflow that also involves filtering, classifying, modeling, visualizing, and forecasting quantitative data across all aspects of our business.
R is the main Statistics language at Google, according to Karl Millar. Here are some of the specific applications of R at Google mentioned in the post:
Large-scale parallel statistical forecasting in R is used to improve the effectiveness of online display advertising for Google's customers.
The same framework is used to study the effectiveness of search advertising at Google, to reveal that search ads drive an additional 89% of web traffic (compared to organic search results alone).
Google uses R for large-scale, computationally intensive forecasting (as presented in a talk at the R/Finance 2012 conference).
Google uses an integration of R and FlumeJava to do very large-scale structured data analysis. (At his presentation at useR!2012 Karl Millar said such analyses are at the terabyte-scale today, and will be at the petabyte scale within two years.) This allows Googlers to do large-scale statistical analysis with code that "reads like R, and scales like Map-Reduce", and runs at 90% of the speed of hand-coding in JavaMR directly. (Karl will be talking about Scaling R to Internet Scale Data at the JSM 2012 conference.)
Google participates in many R-related user conferences, user groups, and coding projects.
To read the full Google+ post from the Google Research team, follow the link below.
Google+: Research at Google
Copyright © 2012 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By David Smith
David Smith is Vice President of Marketing and Community at Revolution Analytics. He has a long history with the R and statistics communities. After graduating with a degree in Statistics from the University of Adelaide, South Australia, he spent four years researching statistical methodology at Lancaster University in the United Kingdom, where he also developed a number of packages for the S-PLUS statistical modeling environment. He continued his association with S-PLUS at Insightful (now TIBCO Spotfire), overseeing the product management of S-PLUS and other statistical and data mining products. David Smith is the co-author (with Bill Venables) of the popular tutorial manual, An Introduction to R, and one of the originating developers of the ESS: Emacs Speaks Statistics project. Today, he leads marketing for REvolution R, supports R communities worldwide, and is responsible for the Revolutions blog. Prior to joining Revolution Analytics, he served as vice president of product management at Zynchros, Inc. Follow him on Twitter at @RevoDavid.