| By Jerry Held | Article Rating: |
|
| August 5, 2008 01:15 PM EDT | Reads: |
25,352 |
There will soon be a myriad of announcements of DBMS offerings in the cloud. Many of these will NOT be marriages made in heaven. However, the most innovative new DBMS software combined with new cloud computing services are here today and truly take advantage of the cloud architecture in order to change the economics and the responsiveness of business analytics.My belief is that cloud computing will change the economics of business intelligence (BI) and enable a variety of new analytic data management projects and business possibilities. It does so by making the hardware, networking, security, and software needed to create data marts and data warehouses available on demand with a pay-as-you-go approach to usage and licensing.
A computing cloud, such as the Amazon Elastic Compute Cloud, is composed of thousands of commodity servers running multiple virtual machine instances (VMs) of the applications hosted in the cloud. As customer demand for those applications changes, new servers are added to the cloud or idled and new VMs are instantiated or terminated.
CIO, CTO & Developer Resources
Cloud computing infrastructure differs dramatically from the infrastructure underlying most in-house data warehouses and data marts. There are no high-end servers with dozens of CPU cores, SANs, replicated systems, or proprietary data warehousing appliances available in the cloud. Therefore, a new DBMS software architecture is required to enable large volumes of data to be analyzed quickly and reliably on the cloud's commodity hardware. Recent DBMS innovations make this a reality today, and the best cloud DBMS architectures will include:
- Shared-nothing, massively parallel processing (MPP) architecture. In order to drive down the cost of creating a utility computing environment, the best cloud service providers use huge grids of identical (or similar) computing elements. Each node in the grid is typically a compute engine with its own attached storage. For a cloud database to successfully "scale out" in such an environment, it is essential that the database have a shared-nothing architecture utilizing the resources (CPU, memory, and disk) found in server nodes added to the cluster. Most databases popularly used in BI today have shared-everything or shared-storage architectures, which will limit their ability to scale in the cloud.
- Automatic high availability. Within a cloud-based analytic database cluster, node failures, node changes, and connection disruptions can occur. Given the vast number of processing elements within a cloud, these failures can be made transparent to the end user if the database has the proper built-in failover capabilities. The best cloud databases will replicate data automatically across the nodes in the cloud cluster, be able to continue running in the event of 1 or more node failures ("k-safety"), and be capable of restoring data on recovered nodes automatically -- without DBA assistance. Ideally, the replicated data will be made "active" in different sort orders for querying to increase performance.
- Ultra-high performance. One of the game-changing advantages of the cloud is the ability to get an analytic application up quickly (without waiting for hardware procurement). However, there can be some performance penalty due to Internet connectivity speeds and the virtualized cloud environment. If the analytic performance is disappointing, the advantage is lost. Fortunately, the latest shared-nothing columnar databases are designed specifically for analytic workloads, and they have demonstrated dramatic performance improvements over traditional, row-oriented databases (as verified by industry experts, such as Gartner and Forrester, and by customer benchmarks). This software performance improvement, coupled with the hardware economies of scale provided by the cloud environment, results in a new economic model and competitive advantage for cloud analytics.
- Aggressive compression. Since cloud costs are typically driven by charges for processor and disk storage utilization, aggressive data compression will result in very large cost savings. Row-oriented databases can achieve compression factors of about 30% to 50%; however, the addition of necessary indexes and materialized views often swells databases to 2 to 5 times the size of the source data. But since the data in a column tends to be more similar and repetitive than attributes within rows, column databases often achieve much higher levels of compression. They also don't require indexes. The result is normally a 4x to 20x reduction in the amount of storage needed by columnar databases and a commensurate reduction in storage costs.
- Standards-based connectivity. While there are a number of special-purpose file systems that have been developed for the cloud environment that can provide high performance, they lack the standard connectivity needed to support general-purpose business analytics. The broad base of analytic users will use existing commercial ETL and reporting software that depend on SQL, JDBC, ODBC, and other DBMS connectivity standards to load and query cloud databases. Therefore, it's imperative for cloud databases to support these connection standards to enable widespread use of analytic applications.
- "Scaling out," as the cloud itself does
- Running fast without high-end or custom hardware
- Providing high availability in a fluid computing environment
- Minimizing data storage, transfer, and CPU utilization (to keep cloud computing fees low)
Published August 5, 2008 Reads 25,352
Copyright © 2008 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
Related Stories
- Opinion: Cloud Computing Makes Me Nervous
- Cloud Computing Expo: How to Port an Application to EC2
- Merrill Lynch Estimates "Cloud Computing" To Be $100 Billion Market
- Creating a Common Cloud Computing Reference API - Part One
- Five Key Challenges of Enterprise Cloud Computing
- Is Cloud Computing the Wave of the Future?
- Google Chrome and Business Intelligence in the Cloud
- The Vocabulary of Cloud Computing
- Citrix CEO "The Industry Needs Time"
- Meta-data in the Cloud
- Future of Block Storage in the Cloud
- Challenges of Running Apps in “The Cloud”
- IBM Strengthens BI Strategy
- Databases in the Cloud
- Cloud BI & Amazon VPC – Low Hanging Fruit for the Enterprise
- Hardware Scaling in the Cloud - Part 3 of 5
- Cloud Analytics
- Cloud Analytics Checklist
- Thirty-One Days of Servers (VMs) in the Cloud
More Stories By Jerry Held
Jerry Held is Executive Chairman of Vertica and CEO of the Held Consulting Group, a firm that provides strategic consulting to CEOs and senior executives of technology firms ranging from startups to very large organizations and private equity firms. Prior to his current position, Held was a senior executive at both Oracle Corp. and Tandem Computers.
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York Speaker Profile: Jill T. Singer – Federal CIO Emeritus
- New Relic Q1 2013 Blazes Past Growth Targets and Reaches 40,000 Active Customer Accounts
- CollabNet and UC4 Announce General Availability of Joint Enterprise DevOps Platform
- How Can Green Web Hosting Benefit Your Business?
- Big Data Isn’t About the Database, It’s About the Application
- Session Topics: 12th Cloud Expo / Cloud Expo New York
- BEA Updates WebLogic SOA Portal for Web 2.0 Era
- UNIT4 Business Software: Three Retail Accounting Tips to Help Retailers Leverage the Cloud and Back Office Systems
- Cloud Expo NY: Best Practices for Architecting Your Cloud Infrastructure
- The Rise of the Thin Client
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York Speaker Profile: Jill T. Singer – Federal CIO Emeritus
- Enterasys Spotlights SDN's Impact on Traditional Networking in Upcoming Webinar
- New Relic Q1 2013 Blazes Past Growth Targets and Reaches 40,000 Active Customer Accounts
- CollabNet and UC4 Announce General Availability of Joint Enterprise DevOps Platform
- How Can Green Web Hosting Benefit Your Business?
- Big Data Isn’t About the Database, It’s About the Application
- Upcoming Bloomberg BNA Webinar Focuses on COPPA Compliance
- NASA's Twitter Account Wins Back-To-Back Shorty Awards
- Cloud Expo New York: Basics of SSD Technology and Its Use in Cloud
- Session Topics: 12th Cloud Expo / Cloud Expo New York
- The Top 150 Players in Cloud Computing
- Who Are The All-Time Heroes of i-Technology?
- Where Are RIA Technologies Headed in 2008?
- Success, Arrogance, Rise and Fall
- AJAX World RIA Conference & Expo Kicks Off in New York City
- Personal Branding Checklist
- The Top 250 Players in the Cloud Computing Ecosystem
- i-Technology Viewpoint: Attack of the Blogs
- Exclusive Q&A with Jeff Haynie, Co-Founder & CEO, Appcelerator
- Web 2.0 News and Wrapping Up "Real-World AJAX" Seminar
- Passing Parameters to Flex That Works
- i-Technology Viewpoint: It's Time to Take the Quotation Marks Off "Web 2.0"























