Welcome!

Agile Computing Authors: Rajesh Ramchandani, Elizabeth White, Yeshim Deniz, Carmen Gonzalez, Plutora Blog

Related Topics: Java IoT, Industrial IoT, Adobe Flex, Machine Learning , Recurring Revenue, Apache

Java IoT: Blog Feed Post

Debunking DRAM vs. Flash Controversy vis-a-vis In-Memory Processing

“Minimal” Performance Advantage of DRAM vs SSD

Wikibon produced an interesting material (looks like paid by Aerospike, NoSQL database recently emerged by resurrecting failed CitrusLeaf and acquihiring AlchemyDB, which product, of course, was recommended in the end) that compares NoSQL databases based on storing data in flash-based SSD vs. storing data in DRAM.

There are number of factual problems with that paper and I want to point them out.

Note that Wikibon doesn’t mention GridGain in this study (we are not a NoSQL datastore per-se after all) so I don’t have any bone in this game other than annoyance with biased and factually incorrect writing.

“Minimal” Performance Advantage of DRAM vs SSD
The paper starts with a simple statement “The minimal performance disadvantage of flash, relative to main memory…”. Minimal? I’ve seen number of studies where performance difference between SSDs and DRAM range form 100 to 10,000 times. For example, this University of California, Berkeley study claims that SSD bring almost no advantage to the Facebook Hadoop cluster and DRAM pre-caching is the way forward.

Let me provide even shorter explanation. Assuming we are dealing with Java – SSD devices are visible to Java application as typical block devices, and therefore accessed as such. It means that a typical object read from such device involves the same steps as reading this object from a file: hardware I/O subsystem, OS I/O subsystem, OS buffering, Java I/O subsystem & buffering, Java deserialization and induced GC. And… if you read the same object from DRAM – it involves few bytecode instructions – and that’s it.

Native C/C++ apps (like MongoDB) can take a slightly quicker route with memory mapped files (or various other IPC methods) – but the performance increase will not be significant (for obvious reason of needing to read/swap the entire pages vs. single object access pattern in DRAM).

Yet another recent technical explanation of the disadvantages of SSD storage can be found here (talking about Oracle’s “in-memory” strategy).

MongoDB, Cassandra, CouchDB DRAM-based?
Amid all the confusion on this topic it’s no wonder the author got it wrong. Neither MongoDB, Cassandra or CouchDB are in-memory systems. They are disk-based systems with support for memory caching. There’s nothing wrong with that and nothing new – every database developed in the last 25 years naturally provides in-memory caching to augment it’s main disk storage.

The fundamental difference here is that in-memory data systems like GridGain, SAP HAHA, GigaSpaces, GemFire, SqlFire, MemSQL, VoltDB, etc. use DRAM (memory) as the main storage medium and use disk for optional durability and overflow. This focus on RAM-based storage allows to completely re-optimized all main algorithms used in these systems.

For example, ACID implementation in GridGain that provides support for full-featured distributed ACID transactions beats every NoSQL database (EC-based) out there in read and even write performance: there are no single key limitations, no consistency trade offs to make, no application-side MVCC, no user-based conflict resolutions or other crutches – it just works the same way as it works in Oracle or DB2 – but faster.

2TB Cluster for $1.2M :)
If there was on piece in the original paper that was completely made up to fit the predefined narrative it was a price comparison. If the author thinks that 2TB RAM cluster costs $1.2M today – I have not one but two Golden Gate bridges to sell just for him…

Let’s see. A typical Dell/HP/IBM/Cisco blade with 256GB of DRAM will cost below $20K if you just buy on the list prices (Cisco seems to offer the best prices starting at around $15K for 256GB blades). That brings the total cost of 2TB cluster well below $200K (with all network and power equipment included and 100s TBs of disk storage).

Is this more expensive that SSD only cluster? Yes, by 2.5-3x times more expensive. But you are getting dramatic performance increase with the right software that more than justifies that price increase.

Conclusion
2-3x times price difference is nonetheless important and it provides our customers a very clear choice. If price is an issue and high performance is not – there are disk-based systems of wide varieties. If high performance and sub-second response on processing TBs of data is required – the hardware will be proportionally more expensive.

However, with 1GB of DRAM costing less than 1 USD and DRAM prices dropping 30% every 18 months – the era of disks (flash or spinning) is clearly coming to its logical end. It’s normal… it’s a progress and we all need to learn how to adapt.

Has anyone seen tape drives lately?

Read the original blog entry...

More Stories By Thomas Krafft

Over 15 years of experience in marketing and demand creation, with strategies driving over $500 million in revenue for a variety of companies in several high-growth and competitive markets, including consumer software and web services, ecommerce, demand creation through web and search, big data, and now healthcare.

@ThingsExpo Stories
Data is the fuel that drives the machine learning algorithmic engines and ultimately provides the business value. In his session at 20th Cloud Expo, Ed Featherston, director/senior enterprise architect at Collaborative Consulting, will discuss the key considerations around quality, volume, timeliness, and pedigree that must be dealt with in order to properly fuel that engine.
SYS-CON Events announced today that DatacenterDynamics has been named “Media Sponsor” of SYS-CON's 18th International Cloud Expo, which will take place on June 7–9, 2016, at the Javits Center in New York City, NY. DatacenterDynamics is a brand of DCD Group, a global B2B media and publishing company that develops products to help senior professionals in the world's most ICT dependent organizations make risk-based infrastructure and capacity decisions.
"Matrix is an ambitious open standard and implementation that's set up to break down the fragmentation problems that exist in IP messaging and VoIP communication," explained John Woolf, Technical Evangelist at Matrix, in this SYS-CON.tv interview at @ThingsExpo, held Nov 4–6, 2014, at the Santa Clara Convention Center in Santa Clara, CA.
Growth hacking is common for startups to make unheard-of progress in building their business. Career Hacks can help Geek Girls and those who support them (yes, that's you too, Dad!) to excel in this typically male-dominated world. Get ready to learn the facts: Is there a bias against women in the tech / developer communities? Why are women 50% of the workforce, but hold only 24% of the STEM or IT positions? Some beginnings of what to do about it! In her Day 2 Keynote at 17th Cloud Expo, Sandy Ca...
IoT is at the core or many Digital Transformation initiatives with the goal of re-inventing a company's business model. We all agree that collecting relevant IoT data will result in massive amounts of data needing to be stored. However, with the rapid development of IoT devices and ongoing business model transformation, we are not able to predict the volume and growth of IoT data. And with the lack of IoT history, traditional methods of IT and infrastructure planning based on the past do not app...
WebRTC services have already permeated corporate communications in the form of videoconferencing solutions. However, WebRTC has the potential of going beyond and catalyzing a new class of services providing more than calls with capabilities such as mass-scale real-time media broadcasting, enriched and augmented video, person-to-machine and machine-to-machine communications. In his session at @ThingsExpo, Luis Lopez, CEO of Kurento, introduced the technologies required for implementing these idea...
Why do your mobile transformations need to happen today? Mobile is the strategy that enterprise transformation centers on to drive customer engagement. In his general session at @ThingsExpo, Roger Woods, Director, Mobile Product & Strategy – Adobe Marketing Cloud, covered key IoT and mobile trends that are forcing mobile transformation, key components of a solid mobile strategy and explored how brands are effectively driving mobile change throughout the enterprise.
Apache Hadoop is emerging as a distributed platform for handling large and fast incoming streams of data. Predictive maintenance, supply chain optimization, and Internet-of-Things analysis are examples where Hadoop provides the scalable storage, processing, and analytics platform to gain meaningful insights from granular data that is typically only valuable from a large-scale, aggregate view. One architecture useful for capturing and analyzing streaming data is the Lambda Architecture, represent...
SYS-CON Events announced today that delaPlex will exhibit at SYS-CON's @CloudExpo, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. delaPlex pioneered Software Development as a Service (SDaaS), which provides scalable resources to build, test, and deploy software. It’s a fast and more reliable way to develop a new product or expand your in-house team.
The explosion of new web/cloud/IoT-based applications and the data they generate are transforming our world right before our eyes. In this rush to adopt these new technologies, organizations are often ignoring fundamental questions concerning who owns the data and failing to ask for permission to conduct invasive surveillance of their customers. Organizations that are not transparent about how their systems gather data telemetry without offering shared data ownership risk product rejection, regu...
With major technology companies and startups seriously embracing IoT strategies, now is the perfect time to attend @ThingsExpo 2016 in New York. Learn what is going on, contribute to the discussions, and ensure that your enterprise is as "IoT-Ready" as it can be! Internet of @ThingsExpo, taking place June 6-8, 2017, at the Javits Center in New York City, New York, is co-located with 20th Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry p...
The Internet of Things will challenge the status quo of how IT and development organizations operate. Or will it? Certainly the fog layer of IoT requires special insights about data ontology, security and transactional integrity. But the developmental challenges are the same: People, Process and Platform and how we integrate our thinking to solve complicated problems. In his session at 19th Cloud Expo, Craig Sproule, CEO of Metavine, demonstrated how to move beyond today's coding paradigm and sh...
SYS-CON Events announced today that IoT Now has been named “Media Sponsor” of SYS-CON's 20th International Cloud Expo, which will take place on June 6–8, 2017, at the Javits Center in New York City, NY. IoT Now explores the evolving opportunities and challenges facing CSPs, and it passes on some lessons learned from those who have taken the first steps in next-gen IoT services.
As organizations realize the scope of the Internet of Things, gaining key insights from Big Data, through the use of advanced analytics, becomes crucial. However, IoT also creates the need for petabyte scale storage of data from millions of devices. A new type of Storage is required which seamlessly integrates robust data analytics with massive scale. These storage systems will act as “smart systems” provide in-place analytics that speed discovery and enable businesses to quickly derive meaningf...
SYS-CON Events announced today that WineSOFT will exhibit at SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Based in Seoul and Irvine, WineSOFT is an innovative software house focusing on internet infrastructure solutions. The venture started as a bootstrap start-up in 2010 by focusing on making the internet faster and more powerful. WineSOFT’s knowledge is based on the expertise of TCP/IP, VPN, SSL, peer-to-peer, mob...
The Internet of Things can drive efficiency for airlines and airports. In their session at @ThingsExpo, Shyam Varan Nath, Principal Architect with GE, and Sudip Majumder, senior director of development at Oracle, discussed the technical details of the connected airline baggage and related social media solutions. These IoT applications will enhance travelers' journey experience and drive efficiency for the airlines and the airports.
SYS-CON Media announced today that @WebRTCSummit Blog, the largest WebRTC resource in the world, has been launched. @WebRTCSummit Blog offers top articles, news stories, and blog posts from the world's well-known experts and guarantees better exposure for its authors than any other publication. @WebRTCSummit Blog can be bookmarked ▸ Here @WebRTCSummit conference site can be bookmarked ▸ Here
In his keynote at @ThingsExpo, Chris Matthieu, Director of IoT Engineering at Citrix and co-founder and CTO of Octoblu, focused on building an IoT platform and company. He provided a behind-the-scenes look at Octoblu’s platform, business, and pivots along the way (including the Citrix acquisition of Octoblu).
With billions of sensors deployed worldwide, the amount of machine-generated data will soon exceed what our networks can handle. But consumers and businesses will expect seamless experiences and real-time responsiveness. What does this mean for IoT devices and the infrastructure that supports them? More of the data will need to be handled at - or closer to - the devices themselves.
SYS-CON Events announced today that Dataloop.IO, an innovator in cloud IT-monitoring whose products help organizations save time and money, has been named “Bronze Sponsor” of SYS-CON's 20th International Cloud Expo®, which will take place on June 6-8, 2017, at the Javits Center in New York City, NY. Dataloop.IO is an emerging software company on the cutting edge of major IT-infrastructure trends including cloud computing and microservices. The company, founded in the UK but now based in San Fran...