By Srinivasan Sundara Rajan
December 28, 2011 07:15 AM EST
In this article I would like to look at a few tools that are often overlooked when it comes to Big Data analytics. Organizations that have already invested heavily in the mainframe and would like to keep using it can consider these tools to extend their Big Data analytics reach.
DFSORT - Sorting & Merging Large Data Sets:
- Long before RDBMSs took their place, COBOL programs had two major file manipulation operations, namely:
- The SORT operation accepts unsequenced input and produces output in a specified sequence
- The MERGE operation compares records from two or more files and combines them in order
- DFSORT adds the ability to do faster and easier sorting, merging, copying, reporting and analysis of your business information, as well as versatile data handling at the record, fixed position/length or variable position/length field, and bit level.
- DFSORT is designed to optimize the efficiency and speed with which operations are completed through synergy with processor, device, and system features
- A COBOL program will typically act as an intermediary, handling the file inputs and passing them to DFSORT
- After all the input records have been passed to DFSORT, the sorting operation is executed. This operation arranges the entire set of records in the sequence specified by keys.
- Much like SORT, the MERGE statement is also called from a COBOL job
- Execution of the MERGE statement begins the merge processing. This operation compares the keys of the records in the input files and passes the sequenced records on to create a merged output file
- According to the vendor's documentation, there is no maximum number of keys, which supports the needs of Big Data analytics processing
- Some of the advanced options of DFSORT also facilitate parallel sort processing, which fits well with the needs of Big Data analytics
- Because the workloads of Big Data analytical jobs can span multiple physical and virtual servers, including the mainframe, it is good to see that DFSORT has the option to sort records in EBCDIC, ASCII, or another collating sequence. This can bring uniformity to massively parallel sorting jobs that run on heterogeneous systems
- The Job Control Language (JCL), which gives Hadoop-like management of large file processing jobs on the mainframe, has good features for specifying multiple input and output files for SORT and MERGE jobs
- As is evident, this article is not intended as a tutorial for DFSORT; its various performance features can be looked up in the mainframe manuals, or you can ask the mainframe gurus in your organization.
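To make the SORT and MERGE ideas above concrete, here is a minimal sketch in Python. It is not DFSORT control statement syntax; it only mirrors the behavior described: keys taken from fixed positions in each record, a sort of unsequenced input, a merge of inputs that are already in key order, and a choice of collating sequence. The record layout (a 4-character region code followed by a 6-digit customer number) and all function names are assumptions made for illustration.

```python
import heapq

# Hypothetical fixed-position layout: columns 0-3 hold a region code,
# columns 4-9 hold a zero-padded customer number.
def sort_key(record):
    return record[0:4], record[4:10]

def sort_records(records):
    """SORT: accept unsequenced input and return it ordered by the key fields."""
    return sorted(records, key=sort_key)

def merge_records(*sorted_inputs):
    """MERGE: combine inputs that are already in key order into one ordered output."""
    return list(heapq.merge(*sorted_inputs, key=sort_key))

def ebcdic_key(text):
    """Collate by EBCDIC byte values (code page 037) instead of Unicode order."""
    return text.encode("cp037")

file_a = sort_records(["WEST000042 widget", "EAST000007 gadget"])
file_b = sort_records(["EAST000019 sprocket", "NRTH000003 gear"])
merged = merge_records(file_a, file_b)

# The same values sort differently under the two collating sequences.
ascii_order = sorted(["1", "A", "a"])
ebcdic_order = sorted(["1", "A", "a"], key=ebcdic_key)
```

The last two lines show why a common collating sequence matters on heterogeneous systems: digits come first in ASCII order but last in EBCDIC order, so sorted files produced on different platforms can only be merged reliably if they agree on the sequence.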
- REXX (Restructured eXtended eXecutor) is another programming language used in the same ecosystem as COBOL and DFSORT, and it can contribute considerably to the Big Data analytical needs of enterprises
- REXX has advantages in string manipulation, dynamic data typing, and storage management, and is generally considered to be very reliable and robust
- One of the most important strengths of REXX that is relevant to Big Data analytics is its "character string" handling ability.
- Useful string manipulation functions such as COPIES(), WORDS(), STRIP(), and TRANSLATE() can go a long way toward the Map/Reduce functionality needs of typical Big Data analytical jobs
- The PARSE instruction is also used frequently in REXX programs. It is able to take strings from a number of sources and break them apart into constituent parts using a fairly natural notation
- PARSE is probably one of the most useful features of REXX in its positioning as a Big Data analytical tool
- The REXX parse statement divides a source string into constituent parts and assigns these to symbols as directed by the governing parsing template
- REXX, DFSORT, and COBOL programs are interoperable, so we could call a REXX program from COBOL, and all of these can be tied together with JCL
- Again, this note is not meant as a tutorial for REXX; a lot of good documentation is available on using its string manipulation features.
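For readers who do not know REXX, the following Python sketch approximates the kind of work the string functions and the PARSE template do. It is only a rough analogue, not REXX itself: `parse_words` imitates a simple blank-delimited template in which each name receives one word and the last name receives the remainder, and the helper functions imitate the basic forms of COPIES(), WORDS(), STRIP(), and TRANSLATE(). The sample log record and the field names are made up for illustration.

```python
def parse_words(source, names):
    """Rough analogue of a blank-delimited PARSE template:
    each name gets one word, the last name gets whatever remains."""
    parts = source.split(None, len(names) - 1)
    parts += [""] * (len(names) - len(parts))  # names with no word left get ""
    return dict(zip(names, parts))

def copies(text, count):
    """Like COPIES(): repeat a string a given number of times."""
    return text * count

def words(text):
    """Like WORDS(): count the blank-delimited words in a string."""
    return len(text.split())

def strip(text):
    """Like STRIP() with defaults: remove leading and trailing blanks."""
    return text.strip()

def translate(text, table_out, table_in):
    """Like TRANSLATE() with both tables: map characters in table_in to table_out."""
    return text.translate(str.maketrans(table_in, table_out))

# Hypothetical log record used only for this example.
record = "  2011-12-28 ERROR disk quota exceeded on volume A  "
fields = parse_words(strip(record), ["date", "level", "message"])
slashed_date = translate(fields["date"], "/", "-")
message_words = words(fields["message"])
rule = copies("-", 5)
```

Breaking a record into named fields in one step is what makes a template-style parse attractive for the mapping stage of an analytical job, where every input line has to be turned into keys and values.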
Summary: There is a strong need for enterprises to adopt Big Data analytics and start mining the huge sets of unstructured data that have been ignored so far, in order to arrive at meaningful business decisions. While newer frameworks like Hadoop and the new breed of analytical databases are going to satisfy this need, enterprises should not have to spend their time picking up new tools and languages when it comes to Big Data analytics.
If there is a significant investment and the organization's direction is to use legacy platforms like COBOL, JCL, REXX, and DFSORT, it is only prudent to make the best of their capabilities when arriving at options for Big Data analytics.
Big Data analytics depends mainly on Map/Reduce algorithms. These functions are aimed at crunching large data sets: the input files are read to create key/value pairs, and map functions take these key/value pairs and generate other key/value pairs. The reducer function in turn depends on sorted key/value pairs, iterating over them to reduce the output further.
If we look at the way this logic works, there is a heavy need for sorting, merging, string manipulation, and parsing all the way through. Hence the tools mentioned above, DFSORT and REXX along with COBOL, are likely to satisfy the Big Data needs of large enterprises that have already invested in mainframe compute capacity.
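The dependence of Map/Reduce on parsing and sorting can be sketched in a few lines of Python. This is a single-process word count, assumed here purely as an illustration and not how Hadoop or a mainframe job would actually be implemented: the map step parses each input line into key/value pairs, the pairs are sorted by key, and the reduce step walks the sorted pairs and collapses each run of equal keys.

```python
from itertools import groupby
from operator import itemgetter

def map_phase(lines):
    """Parse each input line and emit one (key, value) pair per word."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1

def reduce_phase(pairs):
    """Sort the pairs by key, then sum the values of each run of equal keys."""
    ordered = sorted(pairs, key=itemgetter(0))
    return {
        key: sum(value for _, value in group)
        for key, group in groupby(ordered, key=itemgetter(0))
    }

# Hypothetical input lines used only for this example.
counts = reduce_phase(map_phase(["sort merge parse", "merge sort", "sort"]))
```

The `sorted` call is the step a sort utility would take over at scale, and the `split` in the map step is the parsing work that string handling tools are suited for; `groupby` only works correctly here because its input is already in key order.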