By Bob Gourley
January 5, 2013 09:40 AM EST
By Doug Laney
In the late 1990s, while I was an analyst at META Group (note: META is now part of Gartner), it was becoming evident that our clients were increasingly encumbered by their data assets. While many pundits were talking about, many clients were lamenting, and many vendors were seizing the opportunity of these fast-growing data stores, I also realized that something else was going on. Sea changes in the speed at which data was flowing, mainly due to electronic commerce, along with the increasing breadth of data sources, structures and formats due to the post-Y2K ERP application boom, were as challenging to data management teams as the increasing quantity of data, or more so.
In an attempt to help our clients get a handle on how to recognize and, more importantly, deal with these challenges, I first began speaking at industry conferences on this 3-dimensional data challenge of increasing data volume, velocity and variety. Then in late 2000 I drafted a research note, published in February 2001, entitled 3-D Data Management: Controlling Data Volume, Velocity and Variety.
Fast forward to today: The “3Vs” framework for understanding and dealing with Big Data has now become ubiquitous. In fact, other research firms, major vendors and consulting firms have even posited the 3Vs (or an unmistakable variant) as their own concept. Since the original piece is no longer available in Gartner archives but is in increasing demand, I wanted to make it available here for anyone to reference and cite:
Original Research Note PDF: 3-D Data Management: Controlling Data Volume, Velocity and Variety
Date: 6 February 2001 Author: Doug Laney
3-D Data Management: Controlling Data Volume, Velocity and Variety. Current business conditions and mediums are pushing traditional data management principles to their limits, giving rise to novel and more formalized approaches.
META Trend: During 2001/02, leading enterprises will increasingly use a centralized data warehouse to define a common business vocabulary that improves internal and external collaboration. Through 2003/04, data quality and integration woes will be tempered by data profiling technologies (for generating metadata, consolidated schemas, and integration logic) and information logistics agents. By 2005/06, data, document, and knowledge management will coalesce, driven by schema-agnostic indexing strategies and portal maturity.
The effect of the e-commerce surge, a rise in merger & acquisition activity, increased collaboration, and the drive for harnessing information as a competitive catalyst is driving enterprises to higher levels of consciousness about how data is managed at its most basic level. In 2001-02, historical, integrated databases (e.g. data warehouses, operational data stores, data marts), will be leveraged not only for intended analytical purposes, but increasingly for intra-enterprise consistency and coordination. By 2003-04, these structures (including their associated metadata) will be on par with application portfolios, organization charts and procedure manuals for defining a business to its employees and affiliates.
Data records, data structures, and definitions commonly accepted throughout an enterprise reduce fiefdoms pulling against each other due to differences in the way each perceives where the enterprise has been, is presently, and is headed. Readily accessible current and historical records of transactions, affiliates (partners, employees, customers, suppliers), and business processes (or rules), along with definitional and navigational metadata (see ADS Delta 896, 21st Century Metadata: Mapping the Enterprise Genome, 7 Aug 2000), enable employees to paddle in the same direction. Conversely, application-specific data stores (e.g. accounts receivable versus order status) and geographic-specific data stores (e.g. North American sales vs. International sales) offer conflicting or insular views of the enterprise that, while important for feeding transactional systems, provide no “single version of the truth,” giving rise to inconsistency in the way enterprise factions function.
While enterprises struggle to consolidate systems and collapse redundant databases to enable greater operational, analytical, and collaborative consistencies, changing economic conditions have made this job more difficult. E-commerce, in particular, has exploded data management challenges along three dimensions: volume, velocity and variety. In 2001/02, IT organizations must compile a variety of approaches to have at their disposal for dealing with each.
E-commerce channels increase the depth and breadth of data available about a transaction (or any point of interaction). The lower cost of e-channels enables an enterprise to offer its goods or services to more individuals or trading partners, and up to 10x the quantity of data about an individual transaction may be collected, thereby increasing the overall volume of data to be managed. Furthermore, as enterprises come to see information as a tangible asset, they become reluctant to discard it.
Typically, increases in data volume are handled by purchasing additional online storage. However, as data volume increases, the relative value of each data point decreases proportionately, resulting in a poor financial justification for merely incrementing online storage. Viable alternatives and supplements to hanging new disk include:
- Implementing tiered storage systems (see SIS Delta 860, 19 Apr 2000) that cost effectively balance levels of data utility with data availability using a variety of media.
- Limiting data collected to that which will be leveraged by current or imminent business processes
- Limiting certain analytic structures to a percentage of statistically valid sample data.
- Profiling data sources to identify and subsequently eliminate redundancies
- Monitoring data usage to determine “cold spots” of unused data that can be eliminated or offloaded to tape (e.g. Ambeo, BEZ Systems, Teleran)
- Outsourcing data management altogether (e.g. EDS, IBM)
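As a rough illustration of the tiering and "cold spot" ideas in the list above, the following Python sketch assigns data sets to storage tiers by how recently they were read. The thresholds, tier names, and data set names are hypothetical and are not taken from the research note; real usage-monitoring products would draw on query logs and richer rules.

```python
from datetime import date

# Hypothetical thresholds (days since last access); not from the research note.
HOT_DAYS = 30
WARM_DAYS = 180

def assign_tier(last_accessed, today):
    """Map a data set to a storage tier by how recently it was read."""
    idle_days = (today - last_accessed).days
    if idle_days <= HOT_DAYS:
        return "online"
    if idle_days <= WARM_DAYS:
        return "nearline"
    return "tape"

def find_cold_spots(last_access, today):
    """Return the names of data sets idle long enough to offload, sorted."""
    return sorted(
        name for name, seen in last_access.items()
        if assign_tier(seen, today) == "tape"
    )

# Illustrative usage log: data set name -> date it was last queried.
usage = {
    "orders_2000": date(2000, 3, 1),
    "orders_current": date(2001, 1, 30),
    "clickstream_q3": date(2000, 9, 15),
}
cold = find_cold_spots(usage, today=date(2001, 2, 6))
print(cold)
```

Keeping the tier decision in one small function makes the thresholds easy to revisit as the relative value of older data changes.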
E-commerce has also increased point-of-interaction (POI) speed, and consequently the pace of data used to support interactions and generated by interactions. As POI performance is increasingly perceived as a competitive differentiator (e.g. Web site response, inventory availability analysis, transaction execution, order tracking update, product/service delivery), so too is an organization’s ability to manage data velocity. Recognizing that data velocity management is much more than a physical bandwidth and protocol issue, enterprises are implementing architectural solutions such as:
- Operational data stores (ODSs) that periodically extract, integrate and re-organize production data for operational inquiry or tactical analysis
- Caches that provide instant access to transaction data while buffering back-end systems from additional load and performance degradation. (Unlike ODSs, caches are updated according to adaptive business rules and have schemas that mimic the back-end source.)
- Point-to-point (P2P) data routing between databases and applications (e.g. D2K, DataMirror) that circumvents high-latency hub-and-spoke models that are more appropriate for strategic analysis
- Designing architectures that balance data latency with application data requirements and decision cycles, without assuming the entire information supply chain must be near real-time.
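The caching approach in the list above can be sketched as a small read-through cache. This is a minimal sketch under stated assumptions: a fixed time-to-live stands in for the "adaptive business rules" the note mentions, and `fetch_order_status` is a hypothetical placeholder for a call to a back-end system.

```python
import time

class ReadThroughCache:
    """Serve repeated reads from memory; hit the back end only on a miss or expiry."""

    def __init__(self, fetch, ttl_seconds, clock=time.monotonic):
        self._fetch = fetch        # callable that reads from the back-end system
        self._ttl = ttl_seconds    # how long an entry counts as fresh (assumed rule)
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (value, time stored)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[1] < self._ttl:
            return entry[0]        # fresh hit: the back end is not touched
        value = self._fetch(key)   # miss or stale entry: go to the back end
        self._entries[key] = (value, now)
        return value

backend_calls = []

def fetch_order_status(order_id):
    """Hypothetical back-end lookup; records each call so load is visible."""
    backend_calls.append(order_id)
    return f"status-of-{order_id}"

cache = ReadThroughCache(fetch_order_status, ttl_seconds=5.0)
first = cache.get("A100")
second = cache.get("A100")   # served from the cache, no second back-end call
```

The time-to-live is the latency trade-off in miniature: a longer value shields the back end more but lets readers see older data.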
Through 2003/04, no greater barrier to effective data management will exist than the variety of incompatible data formats, non-aligned data structures, and inconsistent data semantics. By this time, interchange and translation mechanisms will be built into most DBMSs. But until then, application portfolio sprawl (particularly when based on a “strategy” of autonomous software implementations due to e-commerce solution immaturity), increased partnerships, and M&A activity intensify data variety challenges. Attempts to resolve data variety issues must be approached as an ongoing endeavor encompassing the following techniques:
- Data profiling (e.g. Data Mentors, Metagenix) to discover hidden relationships and resolve inconsistencies across multiple data sources (see ADS898)
- XML-based data format “universal translators” that import data into standard XML documents for export into another data format (e.g. infoShark, XML Solutions)
- Enterprise application integration (EAI) predefined adapters (e.g. NEON, Tibco, Mercator) for acquiring and delivering data between known applications via message queues, or EAI development kits for building custom adapters.
- Data access middleware (e.g. Information Builders’ EDA/SQL, SAS Access, OLE DB, ODBC) for direct connectivity between applications and databases
- Distributed query management (DQM) software (e.g. Enth, InfoRay, Metagon) that adds a data routing and integration intelligence layer above “dumb” data access middleware
- Metadata management solutions (i.e. repositories and schema standards) to capture and make available definitional metadata that can help provide contextual consistency to enterprise data
- Advanced indexing techniques for relating (if not physically integrating) data of various incompatible types (e.g. multimedia, documents, structured data, business rules).
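To make the XML "universal translator" item in the list above concrete, here is a minimal Python sketch that imports CSV data into a neutral XML document and then exports it as JSON. It uses only the standard library; the function names, the `records`/`record` element names, and the sample fields are assumptions for illustration, and it presumes column names are valid XML tag names.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

def csv_to_xml(csv_text, record_tag="record"):
    """Import CSV rows into a neutral XML tree, one element per field."""
    root = ET.Element("records")
    for row in csv.DictReader(io.StringIO(csv_text)):
        rec = ET.SubElement(root, record_tag)
        for field, value in row.items():
            # Assumes each column name is usable as an XML tag name.
            ET.SubElement(rec, field).text = value
    return root

def xml_to_json(root):
    """Export the neutral XML tree into a second format (JSON here)."""
    rows = [{child.tag: child.text for child in rec} for rec in root]
    return json.dumps(rows)

source = "customer_id,region\n17,EMEA\n42,APAC\n"
xml_doc = csv_to_xml(source)
xml_text = ET.tostring(xml_doc, encoding="unicode")
exported = xml_to_json(xml_doc)
print(xml_text)
print(exported)
```

Routing every format through one intermediate document means each new source or target needs only one importer or exporter, rather than a translator for every pair of formats.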
As with any sufficiently fashionable technology, users should expect the ebb and flow of the data management marketplace to yield solutions that consolidate multiple techniques and solutions that are increasingly application/environment specific. (See Figure 1 – Data Management Solutions.) In selecting a technique or technology, enterprises should first perform an information audit assessing the status of their information supply chain to identify and prioritize particular data management issues.
Business Impact: Attention to data management, particularly in a climate of e-commerce and greater need for collaboration, can enable enterprises to achieve greater returns on their information assets.
Bottom Line: In 2001/02, IT organizations must look beyond traditional direct brute force physical approaches to data management. Through 2003/04, practices for resolving e-commerce accelerated data volume, velocity and variety issues will become more formalized and diverse. Increasingly, these techniques involve trade-offs and architectural solutions that involve and impact application portfolios and business strategy decisions.
Over the past decade, Gartner analysts including Regina Casonato, Anne Lapkin, Mark A. Beyer, Yvonne Genovese and Ted Friedman have continued to expand our research on this topic, identifying and refining other “big data” concepts. In September 2011 they published the tremendous research note Information Management in the 21st Century. And in 2012, Mark Beyer and I developed and published Gartner’s updated definition of Big Data to reflect its value proposition and requirements for “new innovative forms of processing.” (See The Importance of ‘Big Data’: A Definition)
Doug Laney is a research vice president for Gartner Research, where he covers business analytics solutions and projects, information management, and data-governance-related issues. He is considered a pioneer in the field of data warehousing and created the first commercial project methodology for business intelligence/data warehouse projects. Mr. Laney also originated the discipline of information economics (infonomics).
Follow Doug on Twitter: @Doug_Laney