By David Abramowski
March 10, 2009 02:42 AM EDT
Cloud computing is a powerful tool that allows even the smallest of businesses to provide an enterprise-class environment for web applications. In a nutshell, the cloud is nothing more than the ability to rent computer services on demand from a third-party provider. At MioWorks.com we use Amazon Web Services, but there are several other services out there for you to explore.
Mastering the cloud takes a bit of work, a dash of experience and an openness to learn from others. But once you do master it, the benefits are tremendous. You’ll never have to order another server or rent a rack in a data center. You’ll be able to fluidly control your environment by increasing and decreasing the services you need on the fly, saving time and money.
This power, flexibility and potential demand that you pay attention to the details. You must anticipate that the cloud can have hiccups and that as quickly as a server comes to life, that server can disappear. In previous blog posts I’ve already talked about the importance of backups and recovery drills, but let’s take a step back. Today let’s talk about monitoring and how important it is to your survival.
Ok, I’ll bite: why is monitoring so important?
Let me sum this up in a single sentence: Monitoring can be the difference between “whew, that was close” and “holy s$%t, we are down”. I lied - I need another sentence… Monitoring can also be the difference between a five-minute outage and a five-hour outage.
What to monitor
Every web-based application environment in the cloud is a jigsaw puzzle of pieces. At the core you have your virtual hardware, followed by your operating system. Each of your servers is then configured differently depending on its specific duty. You may have application servers, web servers, search servers, database servers and the list goes on. Each of these servers needs to be monitored from several points of view - both internally and externally.
The big question isn’t “Is the server running?” It should be “Are the server and all of its pieces running correctly?” Each virtual server in your setup is a maze of processes, files, directories and file systems. At any given time a hiccup can occur within this delicate environment that will eventually disrupt the end user’s ability to use your service. In our environment we use Monit and Munin (two open source tools) on the inside to provide us with critical monitoring, recovery and trending capabilities.
Monit provides systems monitoring and error recovery for our Unix systems. In our environment we have configured Monit to watch dozens of potential failure points. Monit can start a process if it is not running and can kill or restart a process if it uses too many resources. Monit can also be configured as an intrusion detection system that watches for changes in files, directories and file systems. By spending a little time learning and using Monit, your system administrator gains a great tool for keeping a constant eye on all the pieces of the puzzle.
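To make the idea concrete, here is a minimal Python sketch of the kind of decisions described above: start a process that is not running, restart one that uses too much memory, and flag a file whose contents have changed. This is not Monit's configuration syntax or its implementation. The names `ProcessStatus`, `decide_action`, `file_checksum` and `file_changed`, and the memory limit used in the usage line, are hypothetical and chosen only for illustration.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class ProcessStatus:
    """Snapshot of one watched process (hypothetical structure)."""
    running: bool
    memory_mb: float = 0.0


def decide_action(status: ProcessStatus, max_memory_mb: float) -> str:
    """Return "start", "restart" or "ok" for a watched process."""
    if not status.running:
        return "start"      # the process is missing, so bring it up
    if status.memory_mb > max_memory_mb:
        return "restart"    # the process is running but over its limit
    return "ok"


def file_checksum(path: str) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def file_changed(path: str, known_checksum: str) -> bool:
    """Return True if the file no longer matches a recorded checksum."""
    return file_checksum(path) != known_checksum
```

A watchdog loop would call something like `decide_action(ProcessStatus(running=True, memory_mb=900.0), max_memory_mb=512)` on a schedule and act on the result. In practice a tool like Monit handles the scheduling, alerting and restart commands for you.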
In addition to the direct monitoring and error recovery system, we also like to see the bigger picture. We use Munin to aggregate information across our server pool. Munin provides a graphical view that allows your team to quickly see what’s different from yesterday. You can quickly determine your resource utilization and plan in ADVANCE for any increase in capacity.
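The arithmetic behind "what's different from yesterday" and "plan in advance" can be sketched in a few lines of Python. This is only an illustration of the reasoning, not how Munin works internally. The function names and the straight-line growth assumption are mine, and real capacity planning would look at a longer and noisier history.

```python
from statistics import mean
from typing import Optional, Sequence


def percent_change(yesterday: Sequence[float], today: Sequence[float]) -> float:
    """Percent change of today's average reading versus yesterday's."""
    base = mean(yesterday)
    if base == 0:
        raise ValueError("yesterday's average is zero; change is undefined")
    return (mean(today) - base) / base * 100.0


def days_until_full(daily_usage_pct: Sequence[float],
                    capacity_pct: float = 100.0) -> Optional[float]:
    """Estimate days until usage hits capacity, assuming linear growth.

    Takes one reading per day, oldest first. Returns None when there are
    too few readings or usage is not growing.
    """
    if len(daily_usage_pct) < 2:
        return None
    # Average growth per day across the whole window.
    growth = (daily_usage_pct[-1] - daily_usage_pct[0]) / (len(daily_usage_pct) - 1)
    if growth <= 0:
        return None
    return (capacity_pct - daily_usage_pct[-1]) / growth
```

For example, disk usage readings of 60, 62, 64 and 66 percent over four days grow by two points a day, so `days_until_full([60, 62, 64, 66])` estimates 17 days of headroom.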
From the outside
Keeping track of all the pieces inside the cloud is very important, but you also need to know how your environment in the cloud is performing to the outside world. There are more external monitoring services out there than I can count, but I’ll tell you who we use. Our favorite at the moment is monitis.com. We like them because starting at just $10/month you get on-demand fault and performance monitoring for your environment. This external watchdog system helps to keep everyone informed if and when the cloud is having issues. It also provides us with important statistics on response time and application performance that we use to determine how to adjust our infrastructure.
Your monitoring program must become a living, breathing element of your systems administration. As new problems arise or potential problems are identified, the monitoring system must be adjusted to be proactive. The good news is that the more you adjust your monitoring and error recovery system, the less you’ll be surprised in the future. It takes discipline to run a post mortem on each problem and determine how to detect it proactively in the future. And this discipline will distinguish your application in the frenzy of the cloud.
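An external check boils down to requesting a page from outside your environment and recording whether it answered and how long it took. The Python sketch below shows that idea using only the standard library. It is not how monitis.com or any other service is implemented. `check_url`, `CheckResult`, the two-second threshold and the example URL are assumptions made for illustration.

```python
import time
import urllib.error
import urllib.request
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CheckResult:
    ok: bool                # True if the page answered 200 quickly enough
    status: Optional[int]   # HTTP status code, or None if unreachable
    elapsed_s: float        # wall-clock time spent on the request


def check_url(url: str,
              timeout_s: float = 10.0,
              max_elapsed_s: float = 2.0,
              opener: Callable = urllib.request.urlopen) -> CheckResult:
    """Fetch a URL once and report status and response time.

    `opener` defaults to urllib.request.urlopen; it is a parameter only
    so the check can be exercised without a network connection.
    """
    start = time.monotonic()
    try:
        with opener(url, timeout=timeout_s) as response:
            response.read()             # include body transfer in the timing
            status = response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 500.
        return CheckResult(False, error.code, time.monotonic() - start)
    except OSError:
        # URLError, timeouts and refused connections are all OSError subclasses.
        return CheckResult(False, None, time.monotonic() - start)
    elapsed = time.monotonic() - start
    return CheckResult(status == 200 and elapsed <= max_elapsed_s, status, elapsed)
```

Calling `check_url("https://www.example.com/")` every minute from a machine outside your cloud, and alerting when `ok` is false, gives a crude version of the external watchdog. A hosted service adds multiple probe locations, history and notifications on top of that.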
Real world results of a good monitoring program
In the real world your monitoring system can be the difference between keeping your systems alive and thriving OR having unhappy customers and missed SLAs. It can help you pinpoint exactly what went wrong and reduce the time it takes for the first responders to identify and solve the issue. There are lots of solutions in the marketplace, including commercial and open source alternatives. It may seem overwhelming at first, but once you start the process and improve little by little, you’ll be amazed at the positive impact your monitoring program will have on the stability of your environment and your ability to get some sleep.