How Many Data Scientists are out there?

Editor’s note: This post by Gregory Piatetsky first appeared at KDnuggets.com. In it he dives into a key question regarding the possible shortage of data scientists. -bg

Many people have read the McKinsey report on Big Data (May 2011), which predicted:

The United States alone faces a shortage of 140,000 to 190,000 people with analytical expertise and 1.5 million managers and analysts with the skills to understand and make decisions based on the analysis of big data.


However, so far the shortage appears to be much smaller.

The job title “Data Scientist” has grown tremendously in popularity, according to job site indeed.com:

Job trend for Data Scientist positions, 2006-2014

However, notice that the demand stopped increasing sometime in 2013. 

As of March 13, 2014, a search for “Data Scientist” jobs (US-based) on indeed.com gives only 1,000 positions. We find about 10,000 jobs when searching for Data Scientist without quotes, but many of these jobs have the title “Scientist” or have something to do with data, and do not necessarily represent “Data Scientist” positions.

Of course, many people may do similar work without having the title of “data scientist”. 

Several estimates may be relevant. 

Kaggle is the leading platform for data science competitions and claims to be the world’s largest community of data scientists. Kaggle reached 100,000 members in July 2013, reported 110,000 in Sep 2013 and 120,000 on Oct 23, 2013, and was reported to have 140,000 on Feb 24, 2014.
The latest numbers, from Kaggle CEO Anthony Goldbloom, are: 157,142 Kaggle members, of whom 67,776 were active in the last 6 months.
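
To put the Kaggle figures in perspective, here is a minimal Python sketch that computes the share of active members and the growth between the reported milestones. The numbers are the ones quoted above. The variable names and the choice to compare consecutive milestones are illustrative assumptions, not part of any Kaggle report.

```python
# Kaggle membership figures quoted in the post.
total_members = 157_142
active_members = 67_776  # active in the last 6 months

# Share of all members who were recently active.
active_share = active_members / total_members
print(f"Active share of Kaggle members: {active_share:.1%}")

# Reported membership milestones, in chronological order.
# The labels are just the dates given in the post; "latest" is undated there.
milestones = [
    ("Jul 2013", 100_000),
    ("Sep 2013", 110_000),
    ("Oct 23, 2013", 120_000),
    ("Feb 24, 2014", 140_000),
    ("latest", total_members),
]

# Relative growth from each milestone to the next.
growth_rates = []
for (prev_label, prev_count), (label, count) in zip(milestones, milestones[1:]):
    rate = (count - prev_count) / prev_count
    growth_rates.append(rate)
    print(f"{prev_label} -> {label}: {rate:.1%}")
```

Note that the intervals between milestones differ in length, so these rates are not directly comparable as monthly growth.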

A quick examination of the top 10 ranked Kagglers shows that only one has a title of “Data Scientist”. The top 10 include neuroscience researchers, PhD mathematicians and physicists, and while they are clearly talented competitors on Kaggle, their actual jobs may not involve data science.

LinkedIn has many groups related to data science, Big Data and Analytics – see my analysis “Top 2013 LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science”.

The two largest of these groups are:


Most members of these groups do not have the job title “Data Scientist”. There is a “Data Scientists” LinkedIn group, but at present it has only 6,750 members.

LinkedIn Data Scientist Peter Skomoroch (@PeteSkomoroch) wrote:

Using the public LinkedIn search interface, with the job title in quotes – I see 12,170 members with the phrase “data scientist” anywhere in their profile. Using the advanced search facet to look only at profiles with a current or past title containing the phrase “data scientist”, I see 6,896 results. Doing a plain keyword search will return many members that mention the words “data” or “scientist” anywhere in their profile, but the majority of those people have nothing to do with data science.


He further estimated that perhaps 150-250K people would be a match for a data scientist based on their skills and education. 

I remain optimistic that data scientist is a great profession, but I doubt that there is a demand for 100,000 new data scientist positions. There may be a re-branding of existing positions, or creation of teams which collectively do the data science job.

 

Gregory Piatetsky-Shapiro, Ph.D., is a well-known expert in Business Analytics, Data Mining, and Data Science. Gregory is the Editor and Publisher of KDnuggets.com, a Business Analytics “Guru” on Twitter, and a Top Influencer in Big Data, Data Mining, and Data Science. Gregory is a co-founder of KDD (Knowledge Discovery and Data Mining conferences) and SIGKDD, the professional organization for Knowledge Discovery and Data Mining. Gregory has over 60 publications and has edited several books and collections on data mining and knowledge discovery.

Bob Gourley writes on enterprise IT. He is a founder and partner at Cognitio Corp and publisher of CTOvision.com.
