Welcome!

Agile Computing Authors: Jyoti Bansal, Yeshim Deniz, Pat Romanski, Carmen Gonzalez, Jason Bloomberg

Related Topics: @CloudExpo

@CloudExpo: Blog Post

Cousins of Cobol in Big Data Analytics

How DFSORT, REXX Support Big Data Analytics

In this  article  I would  like to look at a few tools which are overlooked when it comes to Big Data analytics. Organizations that  have  already  heavy investment  on Mainframe  and  would like to continue  with the utilization of Mainframe can consider these  tools for further  expanding their Big Data Analytics reach.

DFSORT-  Sorting & Merging Large Data Sets :

  • Much before RDBMS have taken their place, Cobol programs have 2 major file manipulation operations namely:
  • SORT operation accepts un-sequenced input and produces output in specified sequence
  • The Merge operation compares records from two or more files and combines them in order
  • DFSORT adds the ability to do faster and easier sorting, merging, copying, reporting and analysis of your business information, as well as versatile data handling at the record, fixed position/length or variable position/length field, and bit level.
  • DFSORT is designed to optimize the efficiency and speed with which operations are completed through synergy with processor, device, and system features
  • A Cobol program will typically act as a intermediary in handling the FILE inputs and passing them to DFSORT
  • After all the input records have been passed to DFSORT, the sorting operation is executed. This operation arranges the entire set of records in the sequence specified by keys.
  • Much like a SORT , MERGE statement is also called from a COBOL job
  • The MERGE statement execution begins the MERGE processing. This operation compares keys with the records of the input files, and passes the sequenced records to create a MERGED output file
  • As per the documentation from the vendor , there is no maximum number of keys which can support the needs for Big Data Analytics processing
  • Some of the advanced options of DFSORT also facilitates parallel sort processing which goes well with needs of Big Data Analytics
  • With the work loads of Big Data Analytical jobs can span multiple physical and virtual servers including mainframe, it is good to see that DFSORT has the option to sort records either in EBCDIC or ASCII or another collating sequence. This can result in uniformity of massively parallel sorting jobs if they run on heterogeneous systems
  • The Job Control Language (JCL), which gives Hadoop like management of large file processing jobs in Mainframe have good features to specify multiple input and output file options for SORT and MERGE jobs
  • As evident this article does not aim as a tutorial for DFSORT and various performance features can be looked from Mainframe manuals or can ask Mainframe Gurus in your organization.

REXX :

  • REXX (Restructured eXtended eXecutor) is another programming language that is used in the same eco system of Cobol and DFSORT and can considerably contribute to the Big Data Analytical needs of the enterprises
  • REXX has advantages in string manipulation, Dynamic data typing, Storage Management and is generally considered to be very reliable and robust
  • One of the most important strengths of REXX that is of relevance to Bigdata Analytics is its ‘'character string" handling ability.
  • There are some useful string manipulation functions like COPIES (), WORDS(), STRIP(), TRANSLATE(), which can go a long way in the Map Reduce functionality needs of typical big data analytical jobs
  • PARSE instruction is also used frequently in REXX programs. It is able to take strings from a number of sources and break them apart into constituent parts using a fairly natural notation
  • Probably PARSE could be one of the highly useful feature of REXX in its positioning as a Big Data Analytical tool
  • The REXX parse statement divides a source string into constituent parts and assigns these to symbols as directed by the governing parsing template
  • REXX, DFSORT and Cobol programs can be inter operable such that we could call a REXX program from Cobol , and all these can be tied together with JCL
  • Again this note is meant as a tutorial for REXX and lot of good documentation is available on utilizing the String manipulation features of REXX.

Summary : There is  a strong  need for enterprises  to  adopt Big Data  Analytics  and start mining the  huge sets  of  unstructured data which has been ignored so far to arrive at meaningful business decisions.  While  newer  frameworks like Hadoop  or  the new breed of  analytical databases are going to satisfy  this need,  however   enterprises  should not be spending their time on picking up the tools and languages when it comes to Big Data Analytics.

If there is a significant  investment  and organization direction is to use the legacy  platforms like Cobol, JCL, REXX, DFSORT  it is only prudent  to utilize best  of their capabilities  in arriving  at options for Big Data Analytics.

We are seeing   that  Big Data Analytics  is mainly dependent on Map / Reduce algorithms,  these  functions are aimed  at  crunching  large data sets, like reading the input files  and  create key/value pair   and map functions take these  key/value pairs  and generates  another  key/value pair.  Further Reducer function  also depends on  sorted  key/value pairs  and iterate them and reduce the output further.

If we look at the way this logic works,  there is a  heavy need for sorting, merging, string  manipulation and parsing all the way. Hence  the tools mentioned  above like DFSORT,  REXX  along with Cobol  will likely to satisfy  the Big Data needs  of large enterprises  if  they  have already invested  on Mainframe compute capacity.

 

More Stories By Srinivasan Sundara Rajan

Highly passionate about utilizing Digital Technologies to enable next generation enterprise. Believes in enterprise transformation through the Natives (Cloud Native & Mobile Native).

Comments (0)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.


@ThingsExpo Stories
SYS-CON Events announced today that CollabNet, a global leader in enterprise software development, release automation and DevOps solutions, will be a Bronze Sponsor of SYS-CON's 20th International Cloud Expo®, taking place from June 6-8, 2017, at the Javits Center in New York City, NY. CollabNet offers a broad range of solutions with the mission of helping modern organizations deliver quality software at speed. The company’s latest innovation, the DevOps Lifecycle Manager (DLM), supports Value S...
The Internet of Things is clearly many things: data collection and analytics, wearables, Smart Grids and Smart Cities, the Industrial Internet, and more. Cool platforms like Arduino, Raspberry Pi, Intel's Galileo and Edison, and a diverse world of sensors are making the IoT a great toy box for developers in all these areas. In this Power Panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists discussed what things are the most important, which will have the most profound e...
Multiple data types are pouring into IoT deployments. Data is coming in small packages as well as enormous files and data streams of many sizes. Widespread use of mobile devices adds to the total. In this power panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists will look at the tools and environments that are being put to use in IoT deployments, as well as the team skills a modern enterprise IT shop needs to keep things running, get a handle on all this data, and deli...
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regu...
SYS-CON Events announced today that Grape Up will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on Oct. 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company specializing in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market across the U.S. and Europe, Grape Up works with a variety of customers from emergi...
The age of Digital Disruption is evolving into the next era – Digital Cohesion, an age in which applications securely self-assemble and deliver predictive services that continuously adapt to user behavior. Information from devices, sensors and applications around us will drive services seamlessly across mobile and fixed devices/infrastructure. This evolution is happening now in software defined services and secure networking. Four key drivers – Performance, Economics, Interoperability and Trust ...
@ThingsExpo has been named the Most Influential ‘Smart Cities - IIoT' Account and @BigDataExpo has been named fourteenth by Right Relevance (RR), which provides curated information and intelligence on approximately 50,000 topics. In addition, Right Relevance provides an Insights offering that combines the above Topics and Influencers information with real time conversations to provide actionable intelligence with visualizations to enable decision making. The Insights service is applicable to eve...
DevOps is often described as a combination of technology and culture. Without both, DevOps isn't complete. However, applying the culture to outdated technology is a recipe for disaster; as response times grow and connections between teams are delayed by technology, the culture will die. A Nutanix Enterprise Cloud has many benefits that provide the needed base for a true DevOps paradigm.
Cybersecurity is a critical component of software development in many industries including medical devices. However, code is not always written to be robust or secure from the unknown or the unexpected. This gap can make medical devices susceptible to cybersecurity attacks ranging from compromised personal health information to life-sustaining treatment. In his session at @ThingsExpo, Clark Fortney, Software Engineer at Battelle, will discuss how programming oversight using key methods can incre...
Most technology leaders, contemporary and from the hardware era, are reshaping their businesses to do software in the hope of capturing value in IoT. Although IoT is relatively new in the market, it has already gone through many promotional terms such as IoE, IoX, SDX, Edge/Fog, Mist Compute, etc. Ultimately, irrespective of the name, it is about deriving value from independent software assets participating in an ecosystem as one comprehensive solution.
The 20th International Cloud Expo has announced that its Call for Papers is open. Cloud Expo, to be held June 6-8, 2017, at the Javits Center in New York City, brings together Cloud Computing, Big Data, Internet of Things, DevOps, Containers, Microservices and WebRTC to one location. With cloud computing driving a higher percentage of enterprise IT budgets every year, it becomes increasingly important to plant your flag in this fast-expanding business opportunity. Submit your speaking proposal ...
SYS-CON Events announced today that Juniper Networks (NYSE: JNPR), an industry leader in automated, scalable and secure networks, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Juniper Networks challenges the status quo with products, solutions and services that transform the economics of networking. The company co-innovates with customers and partners to deliver automated, scalable and secure network...
New competitors, disruptive technologies, and growing expectations are pushing every business to both adopt and deliver new digital services. This ‘Digital Transformation’ demands rapid delivery and continuous iteration of new competitive services via multiple channels, which in turn demands new service delivery techniques – including DevOps. In this power panel at @DevOpsSummit 20th Cloud Expo, moderated by DevOps Conference Co-Chair Andi Mann, panelists will examine how DevOps helps to meet th...
SYS-CON Events announced today that Hitachi, the leading provider the Internet of Things and Digital Transformation, will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Hitachi Data Systems, a wholly owned subsidiary of Hitachi, Ltd., offers an integrated portfolio of services and solutions that enable digital transformation through enhanced data management, governance, mobility and analytics. We help globa...
SYS-CON Events announced today that T-Mobile will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. As America's Un-carrier, T-Mobile US, Inc., is redefining the way consumers and businesses buy wireless services through leading product and service innovation. The Company's advanced nationwide 4G LTE network delivers outstanding wireless experiences to 67.4 million customers who are unwilling to compromise on ...
Five years ago development was seen as a dead-end career, now it’s anything but – with an explosion in mobile and IoT initiatives increasing the demand for skilled engineers. But apart from having a ready supply of great coders, what constitutes true ‘DevOps Royalty’? It’ll be the ability to craft resilient architectures, supportability, security everywhere across the software lifecycle. In his keynote at @DevOpsSummit at 20th Cloud Expo, Jeffrey Scheaffer, GM and SVP, Continuous Delivery Busine...
20th Cloud Expo, taking place June 6-8, 2017, at the Javits Center in New York City, NY, will feature technical sessions from a rock star conference faculty and the leading industry players in the world. Cloud computing is now being embraced by a majority of enterprises of all sizes. Yesterday's debate about public vs. private has transformed into the reality of hybrid cloud: a recent survey shows that 74% of enterprises have a hybrid cloud strategy.
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry p...
With major technology companies and startups seriously embracing Cloud strategies, now is the perfect time to attend @CloudExpo | @ThingsExpo, June 6-8, 2017, at the Javits Center in New York City, NY and October 31 - November 2, 2017, Santa Clara Convention Center, CA. Learn what is going on, contribute to the discussions, and ensure that your enterprise is on the right path to Digital Transformation.
SYS-CON Events announced today that SoftLayer, an IBM Company, has been named “Gold Sponsor” of SYS-CON's 18th Cloud Expo, which will take place on June 7-9, 2016, at the Javits Center in New York, New York. SoftLayer, an IBM Company, provides cloud infrastructure as a service from a growing number of data centers and network points of presence around the world. SoftLayer’s customers range from Web startups to global enterprises.