| By Michael Kopp | Article Rating: |
|
| February 9, 2013 12:00 PM EST | Reads: |
2,447 |
The other day I was looking at a web application that was using MongoDB as its central database. We were analyzing the application for potential performance problems and inside five minutes I detected what I must consider to be a MongoDB anti pattern and had a 40% impact on response time. The funny thing: It was a Java best practice that triggered it.
Analyzing the Application
The first thing I always do is look at the topology of an application to get a feel for it.

Overall Transaction Flow of the Application
As we see it's a modestly complex web application and it's using MongoDB as its datastore. Overall MongoDB contributes about 7% to the response time of the application. I noticed that about half of all transactions are actually calling MongoDB so I took a closer look.

Flow of Transactions that access MongoDB, showing 10% response time contribution of MongoDB
Those transactions that actually do call MongoDB spend about 10% of their response time in that popular document database. As a next step I wanted to know what was being executed against MongoDB.

Overview of all MongoDB commands. This shows that the JourneyCollection find and getCount contribute the most to response time
One immediately notices the first two lines, which contribute much more to the response time per transaction than all the others. What was interesting was that thegetCount on the JourneyCollection had the highest contribution time, but the developer responsible was not aware that he was even using it anywhere.
Things get interesting - the mysterious getCount call
Taking things one level deeper, we looked at all transactions that were executing the ominous getCount on the JourneyCollection.

Transactions that call JourneyCollection.getCount spend nearly half their time in MongoDB
What jumps out is that those particular transactions spend indeed over 40% of their time in MongoDB, so there was a big potential for improvement here. Another click and we looked at all MongoDB calls that were executed within the context of the same transaction as the getCount call we found so mysterious.

All MongoDB Statements that run within the same transaction context as the JourneyCollection.getCount
What struck us as interesting was that the number of executions per transaction of thefind and getCount on the JourneyCollection seemed closely connected. At this point we decided to look at the transactions themselves - we needed to understand why that particular MongoDB call was executed.

Single Transactions that execute the ominous getCount call
It's immediately clear that several different transaction types are executing that particulargetCount. What that meant for us is that the problem was likely in the core framework of that particular application rather than being specific to any one user action. Here is the interesting snippet:

The Transaction Trace shows where the getCount is executed exactly
We see that the WebService findJourneys spends all its time in the two MongoDB calls. The first is the actual find call to the Journey Collection. The MongoDB client is good at lazy loading, so the find does not actually do much yet. It only calls the server once we access the result set. We can see the round trip to MongoDB visualized in the call node at the end.
We also see the offending getCount. We see that it is executed by a method called sizewhich turns out to be com.mongodb.DBCursor.size method. This was news to our developer. Looking at several other transactions we found that this was a common pattern. Every time we search for something in the JourneyCollection the getCountwould be executed by com.mongodb.DBCursor.size. This always happens before we would really execute the send the find command to the server(which happens in the callmethod). So we used CompuwareAPM DTM's (a.k.a dynaTrace) developer integration and took a look at the offending code. Here is what we found:
BasicDBObject fields = new BasicDBObject();
fields.put(journeyStr + "." + MongoConstants.ID, 1);
fields.put(MongoConstants.ID, 0);
Collection locations = find(patternQuery, fields);
ArrayList results = new ArrayList(locations.size());
for (DBObject dbObject : locations) {
String loc = dbObject.getString(journeyStr);
results.add(loc);
}
return results;
The code looks harmless enough; we execute a find, create an array for the result and fill it. The offender is the location.size(). MongoDBs DBCursor is similar to the ResultSet in JDBC, it does not return the whole data set at once, but only a subset. As a consequence it doesn't really know how many elements the find will end up with. The only way for MongoDB to determine the final size seems to be to execute a getCountwith the same criteria as the original find. In our case that additional unnecessary roundtrip made up 40% of the web services response time!
An Anti-Patter triggered by a Best Practice
So it turns out that calling size on the DBCursor must be considered an anti-pattern! The real funny thing is that the developer thought he was writing performant code. He was following the best practice to pre-size arrays. This avoids any unnecessary re-sizing. In this particular case however, that minor theoretical performance improvement led to a 40% performance degradation!
Conclusion
The take away here is not that MongoDB is bad or doesn't perform. In fact the customer is rather happy with it. But mistakes happen and similar to other database applications we need to have the visibility into a running application to see how much it contributes to the overall response time. We also need to have that visibility to understand which statements are called where and why.
In addition this also demonstrates nicely why premature micro optimization, without leveraging an APM solution, in production will not lead to better performance. In some cases - like this one - it can actually lead to worse performance.
Published February 9, 2013 Reads 2,447
Copyright © 2013 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
More Stories By Michael Kopp
Michael Kopp has over 12 years of experience as an architect and developer in the Enterprise Java space. Before coming to CompuwareAPM dynaTrace he was the Chief Architect at GoldenSource, a major player in the EDM space. In 2009 he joined dynaTrace as a technology strategist in the center of excellence. He specializes application performance management in large scale production environments with special focus on virtualized and cloud environments. His current focus is how to effectively leverage BigData Solutions and how these technologies impact and change the application landscape.
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Windows Azure IaaS Reaches General Availability
- Enterasys Spotlights SDN's Impact on Traditional Networking in Upcoming Webinar
- New Relic Q1 2013 Blazes Past Growth Targets and Reaches 40,000 Active Customer Accounts
- NASA's Twitter Account Wins Back-To-Back Shorty Awards
- Big Data Isn’t About the Database, It’s About the Application
- BEA Updates WebLogic SOA Portal for Web 2.0 Era
- Cloud Expo New York | Danger Ahead: Why File Sync Is NOT Endpoint Backup
- Upcoming Bloomberg BNA Webinar Focuses on COPPA Compliance
- Symphony EYC Appoints New Account Manager to Drive Global Opportunities
- Cloud Expo New York: Basics of SSD Technology and Its Use in Cloud
- Cloud People: A Who's Who of Cloud Computing
- Cloud Expo New York Speaker Profile: Dave Linthicum – Cloud Technology Partners
- Cloud Expo New York: How to Use Google Apps Script
- Windows Azure IaaS Reaches General Availability
- Enterasys Spotlights SDN's Impact on Traditional Networking in Upcoming Webinar
- Upcoming Domino's Pizza Investor Events
- New Relic Q1 2013 Blazes Past Growth Targets and Reaches 40,000 Active Customer Accounts
- Scripps Networks Interactive’s Popular Lifestyle Shows from HGTV, DIY Network, Food Network, Cooking Channel and Travel Channel Coming to Prime Instant Video and Amazon Instant Video
- Rackspace Hosting Named “Platinum Plus Sponsor” of Cloud Expo New York
- RetailMeNot Shoppers Trend Report: While Over 8 in 10 U.S. Residents Cite Affordability as Their Top Vacation Priority, a Majority (58%) Could Waste Hundreds of Dollars by Booking Travel a la Carte
- Small Cancers, Big Data, and a Life Examined
- NASA's Twitter Account Wins Back-To-Back Shorty Awards
- The Top 150 Players in Cloud Computing
- Who Are The All-Time Heroes of i-Technology?
- Where Are RIA Technologies Headed in 2008?
- Success, Arrogance, Rise and Fall
- AJAX World RIA Conference & Expo Kicks Off in New York City
- Personal Branding Checklist
- The Top 250 Players in the Cloud Computing Ecosystem
- i-Technology Viewpoint: Attack of the Blogs
- Exclusive Q&A with Jeff Haynie, Co-Founder & CEO, Appcelerator
- Web 2.0 News and Wrapping Up "Real-World AJAX" Seminar
- Passing Parameters to Flex That Works
- i-Technology Viewpoint: It's Time to Take the Quotation Marks Off "Web 2.0"






















