By Robert Eve
December 6, 2011 09:30 AM EST
Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility is the first book published on the topic of data virtualization. Along with an overview of data virtualization and its advantages, it presents ten case studies of organizations that have adopted data virtualization to significantly improve business decision making, decrease time-to-solution and reduce costs. This article describes data virtualization adoption at one of the enterprises profiled, Pfizer Inc.
Pfizer Inc. is a biopharmaceutical company that develops, manufactures and markets medicines for both humans and animals. As the world's largest drug manufacturer, Pfizer operates globally with 111,500 employees and a presence in over 100 countries.
Worldwide Pharmaceutical Sciences (PharmSci) is a group of scientists responsible for enabling the drugs that Pfizer will bring to market. This group designs, synthesizes and manufactures all drugs that are part of clinical trials and toxicology testing within Pfizer.
For this case study, we interviewed Michael C. Linhares, Ph.D., a Research Fellow. Linhares heads up the Business Information Systems (BIS) team within PharmSci.
BIS is responsible for portfolio and resource management across all of PharmSci's projects. This involves designing, building and supporting systems that deliver data to executive teams and staff to help them make decisions regarding how to allocate available resources - both people and dollars - across the overall portfolio of over 100 projects annually.
The Business Problem
A major challenge for PharmSci is the fact that it has a complex portfolio of projects that is constantly changing.
According to Linhares, "Every week, something new comes up and we need to ensure that the right information is communicated to the right people. The people making decisions about resource allocation need easy and simple methods for obtaining that information. One aspect of this is that some people learn the information first and they need to communicate it to others who are responsible for making decisions based on the information. This creates an information-sharing challenge."
Linhares estimates that there are 80 to 100 information producers within PharmSci and over 1,000 information consumers, including the executives who seek a full picture of the project portfolio - financial data, project data, people data and data about the pharmaceutical compounds themselves.
The Technical Problem
The required data is created and managed by different applications, each developed by a different team, and stored in multiple sources built on different technologies. The applications don't talk to each other.
This makes it very difficult to access summary information across all projects. Examples include identifying how much money is being spent on all projects in the project management system, what the next milestones are and when each will be met, and who is working on each project. "We needed a solution that would allow us to pull all this information together in an agile way."
When Linhares joined PharmSci, there was very little in the way of effective information integration. Most integration was done manually by exporting data from various systems into Excel spreadsheets and then either combining spreadsheets or taking the spreadsheet data and moving it into Access or SQL Server databases. This approach had no real security controls, lacked scalability and opportunities for reuse, and generated multiple copies of the spreadsheets (with various changes). It often took weeks to build a spreadsheet, with only a 50% chance that it would include all of the data required.
To be successful, the solution to these data integration and reporting problems had to provide the following:
- A single, integrated view of all data sources with a common set of naming conventions
- A flexible middle layer that would be independent of both the data sources on the back end and the reporting tools on the front end to facilitate easy change management
- Shared metadata and business rule functionality so there would be a single point for managing and monitoring the solution
- A development platform that supported fast, iterative development and, therefore, continuous process improvement
Three Options Considered
BIS considered three solution architectures to meet its business and technical challenges.
- Traditional Information Factory: The first option was a traditional approach of an integrated, scalable information factory. Pfizer had already implemented information factories in the division using a combination of Informatica ETL tools, Oracle databases and custom-built reporting applications. However, according to Linhares, an information factory "seemed like overkill. We didn't have high volumes of data, nor did we need the inherent complexity of using ETL tools to transform and move data while making sure we included all the detailed data we might possibly ever need over time." Furthermore, because of the way the information factories were managed within Pfizer, change management entailed significant overhead. However, the architectural concepts of an information factory were not going to be ignored in the final solution.
- Single Vendor Stack: A second possible approach was to implement the solution in a single integrated technology (SQL Server with integration services). Major disadvantages were the lack of access to multiple data source types, the need to move data multiple times and the lack of an integrated metadata repository for understanding and organizing the data model.
- Data Virtualization: The third option was to create a federated data virtualization layer that integrated and accessed the underlying data sources through virtual views of the data. By leaving the source data in place, this approach would eliminate the issues inherent in copying and moving all the data (which Linhares described as unnecessary, "non-value added" activities). With the right technology and mix of products, data virtualization would enable PharmSci to migrate from inefficient, off-line spreadmarts to online access to integrated information that could be rapidly tailored and reused to dramatically increase its value to the organization.
The Data Virtualization Solution - Architecture
Pfizer's solution is the PharmSci Portfolio Database (PSPD), a federated data delivery framework implemented with the Composite Data Virtualization Platform.
Data virtualization enables the integration of all PharmSci data sources into a single reporting schema of information that can be accessed by all front-end tools and users. The solution architecture includes the following components:
Trusted Data Sources: There are many sources of data for PSPD. They are geographically dispersed and store data in a variety of formats across a multivendor, heterogeneous data environment. Here are some examples:
- Enterprise Project Management (EPM) is a SQL Server database of WRD's drug portfolio project plans. It includes detailed project schedules and milestones.
- The Global Information Factory (GIF) is an Oracle-based data warehouse of monthly finance data.
- OneSource, a database of corporate-level drug portfolio information, is itself a unified set of Composite views across several different sources built by another group within Pfizer.
- Flat files are provided by the Finance Department on actual resource use.
- SharePoint lists are small SharePoint databases accessed using a web service.
- There are other data sources as well, including custom-built systems. As Linhares pointed out, "It doesn't matter what data sources we have. With a virtual approach, we are not limited by the types of data we need to access."
Data Virtualization Layer: The Composite Data Virtualization Platform forms the data virtualization layer that enables the solution to be independent of the data sources and front-end tools. It provides abstracted access to all of the data sources and delivers the data through virtual views. These views effectively present the PharmSci Portfolio Database as subject-specific data marts. The Composite metadata repository manages data lineage and business rules.
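To make the idea of a virtual view concrete, here is a minimal Python sketch. It is not Pfizer's implementation and does not use the Composite platform: an in-memory SQLite database stands in for a relational source such as a project-management system, a small CSV string stands in for a finance flat file, and every table, column and function name is hypothetical. The point it illustrates is that the view is computed at query time from the sources in place, rather than by copying the data into a new store.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a relational source (e.g. a project-management database).
def make_project_source():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE projects (proj_id TEXT, proj_name TEXT, next_milestone TEXT)"
    )
    conn.executemany(
        "INSERT INTO projects VALUES (?, ?, ?)",
        [("P1", "Compound A", "2012-03-01"), ("P2", "Compound B", "2012-06-15")],
    )
    return conn

# Hypothetical stand-in for a flat file of actual spend, one row per posting.
FINANCE_CSV = "project,spend_usd\nP1,120000\nP2,45000\nP1,30000\n"

def read_spend(text):
    """Total the spend per project directly from the flat-file source."""
    totals = {}
    for row in csv.DictReader(io.StringIO(text)):
        totals[row["project"]] = totals.get(row["project"], 0) + int(row["spend_usd"])
    return totals

def portfolio_view(conn, finance_text):
    """Virtual view: joins both sources on request and exposes common column names."""
    spend = read_spend(finance_text)
    rows = conn.execute(
        "SELECT proj_id, proj_name, next_milestone FROM projects ORDER BY proj_id"
    )
    return [
        {
            "project_id": pid,
            "project_name": name,
            "next_milestone": milestone,
            "spend_usd": spend.get(pid, 0),
        }
        for pid, name, milestone in rows
    ]

if __name__ == "__main__":
    for record in portfolio_view(make_project_source(), FINANCE_CSV):
        print(record)
```

Any reporting tool that calls `portfolio_view` sees one integrated record per project, regardless of where each field is physically stored.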
Consuming Applications: The flexibility of the platform is demonstrated by the varied reporting applications that use the information in PSPD. Examples include:
- SAP Business Objects for ad hoc queries, standard reports and dashboards.
- TIBCO Spotfire for analytics and access to data through standard presentation reports.
- Web services for parameterized queries.
- Data services to provide data for downstream applications.
- QuickViews (web pages built using DevExpress, a .NET toolkit) for access to live data.
SharePoint Portal: Branded as "InfoSource," this team collaboration web portal is the front-end interface that provides integrated access to PSPD data for all PharmSci customers through the consuming applications described above.
The Data Virtualization Solution - Best Practices
Linhares and team applied a number of data virtualization best practices when implementing the architecture described above.
Two Layers of Abstraction: Linhares stressed the importance of building two clear levels of abstraction into the data virtualization architecture. The first level abstracts the sources (the information abstraction layer); the second abstracts the consumers (the reporting abstraction layer).
"We built a representation of the data in Composite. If a source is ever changed by the owner, which often happens, we can update the representation in the information abstraction layer quickly. This allows control of all downstream data in one location."
The second level of abstraction is the one between the reporting schema and the front-end reporting tools. A consolidated and integrated set of information is exposed as a single schema. This allows BIS to be system agnostic and support the use of whatever tool is best for the customer. All of the reporting tools use the same reporting abstraction layer; they always get the same answer to the same question because there is only a single source of data.
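The two layers can be sketched with ordinary SQL views. The example below uses Python's built-in `sqlite3` module purely as a stand-in for a data virtualization platform; the table names (`epm_v1`, `epm_v2`) and view names (`info_project`, `rpt_project`) are hypothetical. It simulates a source owner changing a schema and shows that only the information abstraction view has to be redefined, while the reporting query stays exactly the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Source as originally delivered by its owner (hypothetical names).
CREATE TABLE epm_v1 (PROJ_CD TEXT, PROJ_NM TEXT);
INSERT INTO epm_v1 VALUES ('P1', 'Compound A');

-- Layer 1, information abstraction: maps source columns to common names.
CREATE VIEW info_project AS
SELECT PROJ_CD AS project_id, PROJ_NM AS project_name FROM epm_v1;

-- Layer 2, reporting abstraction: the single schema every tool queries.
CREATE VIEW rpt_project AS
SELECT project_id, project_name FROM info_project;
""")

# The query a reporting tool would issue; it never changes.
REPORT_QUERY = "SELECT project_id, project_name FROM rpt_project"
before = conn.execute(REPORT_QUERY).fetchall()

# The source owner ships a new structure with different column names.
# Only the layer-1 view is redefined to point at it.
conn.executescript("""
CREATE TABLE epm_v2 (project_code TEXT, title TEXT);
INSERT INTO epm_v2 VALUES ('P1', 'Compound A');
DROP VIEW info_project;
CREATE VIEW info_project AS
SELECT project_code AS project_id, title AS project_name FROM epm_v2;
""")
after = conn.execute(REPORT_QUERY).fetchall()

if __name__ == "__main__":
    print(before == after)
```

Because the reporting view refers to the information view by name, the repair is made in one place and every downstream consumer picks it up.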
Consolidated Business Rules: Another key piece of the solution is the ability to include the business rules about how PharmSci manages its data within these abstraction layers. The business rules are embedded in the view definitions and are applied consistently at the same point.
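As a small illustration of a rule living in a view definition, the sketch below again uses `sqlite3` as a stand-in. The rule itself (a project is "over budget" when spend exceeds budget), the table `project_finance` and the view `rpt_project_status` are all invented for the example and are not taken from PharmSci. Two different consumers read the same view, so neither re-implements the rule and both get the same answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE project_finance (project_id TEXT, budget_usd INTEGER, spend_usd INTEGER);
INSERT INTO project_finance VALUES ('P1', 100000, 120000), ('P2', 80000, 45000);

-- Hypothetical business rule, defined once inside the view.
CREATE VIEW rpt_project_status AS
SELECT project_id,
       CASE WHEN spend_usd > budget_usd THEN 'over budget'
            ELSE 'on budget' END AS budget_status
FROM project_finance;
""")

def dashboard_rows(conn):
    """Consumer 1: a dashboard listing every project with its status."""
    return conn.execute(
        "SELECT project_id, budget_status FROM rpt_project_status ORDER BY project_id"
    ).fetchall()

def over_budget_ids(conn):
    """Consumer 2: an alert that only needs the over-budget projects."""
    rows = conn.execute(
        "SELECT project_id FROM rpt_project_status "
        "WHERE budget_status = 'over budget' ORDER BY project_id"
    )
    return [row[0] for row in rows]

if __name__ == "__main__":
    print(dashboard_rows(conn))
    print(over_budget_ids(conn))
```

If the rule changes, only the view definition is edited; both consumers stay untouched.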
Rapid Application Development Process: Prior to data virtualization, data integration was the slowest step for BIS in fulfilling a customer request for information. Now it's typically the fastest. As an example, Linhares cited a request that came in on a Friday morning and was completed by that afternoon. The customer's response was an amazed, "What do you mean you already have it done?"
BIS uses a simple development process. The first step is what Linhares calls "triage" - looking at what the customer wants, estimating how long it will take and communicating that to the customer.
BIS does not spend a lot of time documenting the requirements of the solution. Instead, the group first creates a prototype on paper in the form of a simple data flow, then creates the necessary virtual views, gives the customer web access to the views and asks: "Is this what you wanted?"
The customer can then play with the result and respond with any changes or additions needed. BIS arrives at the final solution working with the customer in an iterative process.
Summary of Benefits
Linhares described several major benefits of the data virtualization solution.
The ability to provide integrated data in context: Data virtualization has enabled BIS to replace isolated silos of data with a data delivery platform that integrates different types and sources of data into a comprehensive package of value-added information. Instead of only the team leader and a core group of eight to ten people knowing about a project, the entire organization has access to relevant project information.
The independence of the data virtualization layer: "This is one of the huge benefits of data virtualization. It allows me to manage and monitor everything in one place and it makes change management easy for BIS and transparent to users."
Fast, iterative development environment: The data delivery infrastructure already exists in the data virtualization layer (defined data sources, standard naming conventions, access methods, etc.) so when a request for information comes in, BIS can quickly put it together for the customer.
Elimination of manual effort throughout PharmSci: According to Linhares, people initially resisted moving away from their spreadsheets. But once there was a single source for the data and it was all available through InfoSource, there was a dramatic reduction in the need to have meetings to reconcile spreadsheet data among teams.
• • •
Editor's Note: Robert Eve is the co-author, along with Judith R. Davis, of Data Virtualization: Going Beyond Traditional Data Integration to Achieve Business Agility, the first book published on the topic of data virtualization. The complete Pfizer case study, along with nine other enterprise case studies, is available in the book.