Welcome!

Agile Computing Authors: Yeshim Deniz, Elizabeth White, Pat Romanski, Liz McMillan, Carmen Gonzalez

Related Topics: SYS-CON ITALIA, Containers Expo Blog

SYS-CON ITALIA: Article

Cloud Computing: Making Analytics in the Cloud a Reality

There will soon be a myriad of announcements of DBMS offerings in the cloud

There will soon be a myriad of announcements of DBMS offerings in the cloud. Many of these will NOT be marriages made in heaven. However, the most innovative new DBMS software combined with new cloud computing services are here today and truly take advantage of the cloud architecture in order to change the economics and the responsiveness of business analytics.

My belief is that cloud computing will change the economics of business intelligence (BI) and enable a variety of new analytic data management projects and business possibilities. It does so by making the hardware, networking, security, and software needed to create data marts and data warehouses available on demand with a pay-as-you-go approach to usage and licensing.

A computing cloud, such as the Amazon Elastic Compute Cloud, is composed of thousands of commodity servers running multiple virtual machine instances (VMs) of the applications hosted in the cloud. As customer demand for those applications changes, new servers are added to the cloud or idled and new VMs are instantiated or terminated.

Cloud computing infrastructure differs dramatically from the infrastructure underlying most in-house data warehouses and data marts. There are no high-end servers with dozens of CPU cores, SANs, replicated systems, or proprietary data warehousing appliances available in the cloud. Therefore, a new DBMS software architecture is required to enable large volumes of data to be analyzed quickly and reliably on the cloud's commodity hardware. Recent DBMS innovations make this a reality today, and the best cloud DBMS architectures will include:

  1. Shared-nothing, massively parallel processing (MPP) architecture. In order to drive down the cost of creating a utility computing environment, the best cloud service providers use huge grids of identical (or similar) computing elements. Each node in the grid is typically a compute engine with its own attached storage. For a cloud database to successfully "scale out" in such an environment, it is essential that the database have a shared-nothing architecture utilizing the resources (CPU, memory, and disk) found in server nodes added to the cluster. Most databases popularly used in BI today have shared-everything or shared-storage architectures, which will limit their ability to scale in the cloud.

  2. Automatic high availability. Within a cloud-based analytic database cluster, node failures, node changes, and connection disruptions can occur. Given the vast number of processing elements within a cloud, these failures can be made transparent to the end user if the database has the proper built-in failover capabilities. The best cloud databases will replicate data automatically across the nodes in the cloud cluster, be able to continue running in the event of 1 or more node failures ("k-safety"), and be capable of restoring data on recovered nodes automatically -- without DBA assistance. Ideally, the replicated data will be made "active" in different sort orders for querying to increase performance.

  3. Ultra-high performance. One of the game-changing advantages of the cloud is the ability to get an analytic application up quickly (without waiting for hardware procurement). However, there can be some performance penalty due to Internet connectivity speeds and the virtualized cloud environment. If the analytic performance is disappointing, the advantage is lost. Fortunately, the latest shared-nothing columnar databases are designed specifically for analytic workloads, and they have demonstrated dramatic performance improvements over traditional, row-oriented databases (as verified by industry experts, such as Gartner and Forrester, and by customer benchmarks). This software performance improvement, coupled with the hardware economies of scale provided by the cloud environment, results in a new economic model and competitive advantage for cloud analytics.

  4. Aggressive compression. Since cloud costs are typically driven by charges for processor and disk storage utilization, aggressive data compression will result in very large cost savings. Row-oriented databases can achieve compression factors of about 30% to 50%; however, the addition of necessary indexes and materialized views often swells databases to 2 to 5 times the size of the source data. But since the data in a column tends to be more similar and repetitive than attributes within rows, column databases often achieve much higher levels of compression. They also don't require indexes. The result is normally a 4x to 20x reduction in the amount of storage needed by columnar databases and a commensurate reduction in storage costs.

  5. Standards-based connectivity. While there are a number of special-purpose file systems that have been developed for the cloud environment that can provide high performance, they lack the standard connectivity needed to support general-purpose business analytics. The broad base of analytic users will use existing commercial ETL and reporting software that depend on SQL, JDBC, ODBC, and other DBMS connectivity standards to load and query cloud databases. Therefore, it's imperative for cloud databases to support these connection standards to enable widespread use of analytic applications.
In summary, cloud databases with the architectural characteristics described above will be able to not just run in the cloud, but thrive there by:

  • "Scaling out," as the cloud itself does
  • Running fast without high-end or custom hardware
  • Providing high availability in a fluid computing environment
  • Minimizing data storage, transfer, and CPU utilization (to keep cloud computing fees low)

More Stories By Jerry Held

Jerry Held is Executive Chairman of Vertica and CEO of the Held Consulting Group, a firm that provides strategic consulting to CEOs and senior executives of technology firms ranging from startups to very large organizations and private equity firms. Prior to his current position, Held was a senior executive at both Oracle Corp. and Tandem Computers.

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
SYS-CON Events announced today that Juniper Networks (NYSE: JNPR), an industry leader in automated, scalable and secure networks, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Juniper Networks challenges the status quo with products, solutions and services that transform the economics of networking. The company co-innovates with customers and partners to deliver automated, scalable and secure network...
Grape Up is a software company, specialized in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the USA and Europe, we work with a variety of customers from emerging startups to Fortune 1000 companies.
Financial Technology has become a topic of intense interest throughout the cloud developer and enterprise IT communities. Accordingly, attendees at the upcoming 20th Cloud Expo at the Javits Center in New York, June 6-8, 2017, will find fresh new content in a new track called FinTech.
SYS-CON Events announced today that Interoute, owner-operator of one of Europe's largest networks and a global cloud services platform, has been named “Bronze Sponsor” of SYS-CON's 20th Cloud Expo, which will take place on June 6-8, 2017 at the Javits Center in New York, New York. Interoute is the owner-operator of one of Europe's largest networks and a global cloud services platform which encompasses 12 data centers, 14 virtual data centers and 31 colocation centers, with connections to 195 add...
In his keynote at @ThingsExpo, Chris Matthieu, Director of IoT Engineering at Citrix and co-founder and CTO of Octoblu, focused on building an IoT platform and company. He provided a behind-the-scenes look at Octoblu’s platform, business, and pivots along the way (including the Citrix acquisition of Octoblu).
SYS-CON Events announced today that CollabNet, a global leader in enterprise software development, release automation and DevOps solutions, will be a Bronze Sponsor of SYS-CON's 20th International Cloud Expo®, taking place from June 6-8, 2017, at the Javits Center in New York City, NY. CollabNet offers a broad range of solutions with the mission of helping modern organizations deliver quality software at speed. The company’s latest innovation, the DevOps Lifecycle Manager (DLM), supports Value S...
NHK, Japan Broadcasting, will feature the upcoming @ThingsExpo Silicon Valley in a special 'Internet of Things' and smart technology documentary that will be filmed on the expo floor between November 3 to 5, 2015, in Santa Clara. NHK is the sole public TV network in Japan equivalent to the BBC in the UK and the largest in Asia with many award-winning science and technology programs. Japanese TV is producing a documentary about IoT and Smart technology and will be covering @ThingsExpo Silicon Val...
@GonzalezCarmen has been ranked the Number One Influencer and @ThingsExpo has been named the Number One Brand in the “M2M 2016: Top 100 Influencers and Brands” by Analytic. Onalytica analyzed tweets over the last 6 months mentioning the keywords M2M OR “Machine to Machine.” They then identified the top 100 most influential brands and individuals leading the discussion on Twitter.
The 20th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held June 6-8, 2017, at the Javits Center in New York City, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal ...
The age of Digital Disruption is evolving into the next era – Digital Cohesion, an age in which applications securely self-assemble and deliver predictive services that continuously adapt to user behavior. Information from devices, sensors and applications around us will drive services seamlessly across mobile and fixed devices/infrastructure. This evolution is happening now in software defined services and secure networking. Four key drivers – Performance, Economics, Interoperability and Trust ...
NHK, Japan Broadcasting, will feature the upcoming @ThingsExpo Silicon Valley in a special 'Internet of Things' and smart technology documentary that will be filmed on the expo floor between November 3 to 5, 2015, in Santa Clara. NHK is the sole public TV network in Japan equivalent to the BBC in the UK and the largest in Asia with many award-winning science and technology programs. Japanese TV is producing a documentary about IoT and Smart technology and will be covering @ThingsExpo Silicon Val...
The Internet of Things is clearly many things: data collection and analytics, wearables, Smart Grids and Smart Cities, the Industrial Internet, and more. Cool platforms like Arduino, Raspberry Pi, Intel's Galileo and Edison, and a diverse world of sensors are making the IoT a great toy box for developers in all these areas. In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists discussed what things are the most important, which will have the most profound e...
Multiple data types are pouring into IoT deployments. Data is coming in small packages as well as enormous files and data streams of many sizes. Widespread use of mobile devices adds to the total. In this power panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists will look at the tools and environments that are being put to use in IoT deployments, as well as the team skills a modern enterprise IT shop needs to keep things running, get a handle on all this data, and deli...
With billions of sensors deployed worldwide, the amount of machine-generated data will soon exceed what our networks can handle. But consumers and businesses will expect seamless experiences and real-time responsiveness. What does this mean for IoT devices and the infrastructure that supports them? More of the data will need to be handled at - or closer to - the devices themselves.
SYS-CON Events announced today that Grape Up will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company specializing in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the U.S. and Europe, Grape Up works with a variety of customers from emergi...
@ThingsExpo has been named the Most Influential ‘Smart Cities - IIoT' Account and @BigDataExpo has been named fourteenth by Right Relevance (RR), which provides curated information and intelligence on approximately 50,000 topics. In addition, Right Relevance provides an Insights offering that combines the above Topics and Influencers information with real time conversations to provide actionable intelligence with visualizations to enable decision making. The Insights service is applicable to eve...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
Bert Loomis was a visionary. This general session will highlight how Bert Loomis and people like him inspire us to build great things with small inventions. In their general session at 19th Cloud Expo, Harold Hannon, Architect at IBM Bluemix, and Michael O'Neill, Strategic Business Development at Nvidia, discussed the accelerating pace of AI development and how IBM Cloud and NVIDIA are partnering to bring AI capabilities to "every day," on-demand. They also reviewed two "free infrastructure" pr...
New competitors, disruptive technologies, and growing expectations are pushing every business to both adopt and deliver new digital services. This ‘Digital Transformation’ demands rapid delivery and continuous iteration of new competitive services via multiple channels, which in turn demands new service delivery techniques – including DevOps. In this power panel at @DevOpsSummit 20th Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, panelists will examine how DevOps helps to meet th...
Five years ago development was seen as a dead-end career, now it’s anything but – with an explosion in mobile and IoT initiatives increasing the demand for skilled engineers. But apart from having a ready supply of great coders, what constitutes true ‘DevOps Royalty’? It’ll be the ability to craft resilient architectures, supportability, security everywhere across the software lifecycle. In his keynote at @DevOpsSummit at 20th Cloud Expo, Jeffrey Scheaffer, GM and SVP, Continuous Delivery Busine...