Welcome!

Agile Computing Authors: Pat Romanski, Yeshim Deniz, Elizabeth White, ManageEngine IT Matters, Liz McMillan

Blog Feed Post

Why you should build an Immutable Infrastructure

Why you should build an Immutable Infrastructure – by Florian Motlik, CTO of Codeship

Some of the major challenges today when building infrastructure are predictability, scalability and automated recovery. A predictable system will promote the exact same artifact that you tested into your production system so no intermittent failure can cause any trouble. A scalable system make it trivial, especially automatically, to deal with any rise in traffic. And automated recovery will make sure your team can focus on building a better product and sleep during the night instead of maintaining infrastructure constantly.

At Codeship we’ve found that an Infrastructure made up of immutable components has helped us tremendously with these goals.

Julian Dunn from Chef recently released a blog post about their stance on immutable infrastructure.

Chad Fowler summed it up very well in a tweet

Instead of going over every piece of the article, I want to present an overview of the experience we – and others – have had in making parts of our infrastructure immutable.

What is Immutable Infrastructure

Immutable infrastructure is comprised of immutable components that are replaced for every deployment, rather than being updated in-place. Those components are started from a common image that is built once per deployment and can be tested and validated. The common image can be built through automation, but doesn’t have to be. Immutability is independent of any tool or workflow for building the images.

Its best use case is in a cloud or virtualized environment. While it’s possible in non-virtualized environments, the benefit doesn’t outweigh the effort.

State Isolation

The main criticism against immutable infrastructure – as stated in the Chef blog post – is that there is always state somewhere in the system and, therefore, the whole system isn’t immutable. That misses the point of immutable components. The main advantage when it comes to state in immutable infrastructure is that it is siloed. The boundaries between layers storing state and the layers that are ephemeral are clearly drawn and no leakage can possibly happen between those layers. There simply is no way to mix state into different components when you can’t expect them to be up and running the next minute.

Atomic Deployments and Validation

Updating an existing server can easily have unintended consequences. That’s why Chef, Puppet, CFEngine or other such tools exist – to take care of consistency across your infrastructure. A central system is necessary to manage the expected state of each server and to take action to ensure compliance. Deployment is not an atomic action but a transition that can go wrong and lead to an unknown state. This becomes very hard and complex to debug, as the exact state you are in is hard to know. Chef, Puppet or CFEngine are very complex systems as they have to deal with an overly complex problem.

Another solution to that problem is to build completely new images and servers that contain the application and the environment every time you want to deploy. In that case, the deployment doesn’t depend on the status the servers were in before, so the result is much more predictable and repeatable. Any third-party issues that may cause the deployment to fail can be caught by validating the new image and ensuring no production system was impacted. This one image can then be used to start any number of servers and switch atomically from the old machines to the new ones by changing the load balancer, for example.

There are of course downsides to rebuilding your images with every deployment. A full rebuild of the system takes a lot longer than simply updating and restarting the application. By layering your deployment you can optimize this, e.g. have a repository to build a base image and use that base image to just put in your application for the deployment image, but it will still be a slower process.

Another problem is that you introduce dependencies to third parties during deployment. If you install packages in the system and your apt repository is slow or down this can fail the deployment. While this could be a problem in a non immutable infrastructure as well you typically interact less with third party systems when you just push new code into an already provisioned system.

By deploying from a pre-provisioned base image and updating that base image regularly you can soften that problem, but it’s still there and might fail a deployment from time to time.

Building the automation currently still takes more time at the beginning of the project, as the tools for building immutable infrastructure are still new or need to be developed. It is definitely more investment in the beginning, but pays off immediately.

You can still use Chef, Puppet, CFEngine or Ansible to build your images, but as they aren’t built for an immutable infrastructure workflow they tend to be more complex than necessary.

Fast Recovery by preserving History

As all deployments are done by building new images, history is preserved automatically for rollback when necessary. The same process and automation that is used to deploy the next version can be used to roll back, which ensures the process of rolling back will work. By automating the creation of the images, you can even recreate historical images and branch off from earlier points in the history of the infrastructure.

Data schema changes are a potential problem, but that’s a general issue with rollbacks. Backwards compatibility and zero downtime deployments are a way to make sure this will work regardless of the changes.

Simple Experimentation

As you control the whole environment and application, any experiments with new versions of the language, operating system or dependencies are easy. With strict testing and validation in place, and the ability to roll-back if necessary, all the fear of upgrading any dependency is removed. Experimentation becomes an integral and trivial part of building your infrastructure.

Makes you collect your logs and metrics in a central location

With immutable components in place, it’s easy to simply kill a misbehaving server. While often errors are simply a product of the environment, for example a third party system misbehaving, and can be ignored, some will keep coming up. Not having access into the servers puts the right incentive on the team to collect and store logs and system metrics externally. This way, debugging can happen while the server is long gone.

If logs and metrics are missing to properly debug an issue, it’s easy to add more data collection to the infrastructure and replace all existing servers. Then once the error comes up again you can debug it fully from the data stored on an external system.

Conclusions

Immutable components as part of your infrastructure are a way to reduce inconsistency in your infrastructure and improve the trust into your deployment process. Atomic deployments, combined with validation of the image and easy rollback, make managing your infrastructure a lot easier.

It forces teams to silo data and expect failures that are inherent when building on top of a cloud infrastructure or when building systems in general. This increases resilience and trains you in a process to withstand any problems, especially in an automated fashion. Furthermore, it helps with building simple and independent components that are easy to deploy and scale.

And it’s not a theoretical idea. At Codeship, we’ve built our infrastructure this way for a long time. Heroku and other PaaS providers are built as immutable components and lots of companies – small and very large – have used immutability as a core concept of their infrastructure.

Tools like Packer have made building immutable components very easy. Together with existing cloud infrastructure they are a powerful concept to help you build better and safer infrastructure. Let me know in the comments if you have any questions or interesting insights to share.

Thanks

I got great feedback by the following people on this article. Thanks for taking the time and helping me to make it much clearer and simply better.

Links

Read the original blog entry...

More Stories By Manuel Weiss

I am the cofounder of Codeship – a hosted Continuous Integration and Deployment platform for web applications. On the Codeship blog we love to write about Software Testing, Continuos Integration and Deployment. Also check out our weekly screencast series 'Testing Tuesday'!

@ThingsExpo Stories
SYS-CON Events announced today that TechTarget has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. TechTarget storage websites are the best online information resource for news, tips and expert advice for the storage, backup and disaster recovery markets.
SYS-CON Events announced today that Telecom Reseller has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Telecom Reseller reports on Unified Communications, UCaaS, BPaaS for enterprise and SMBs. They report extensively on both customer premises based solutions such as IP-PBX as well as cloud based and hosted platforms.
Artificial intelligence, machine learning, neural networks. We’re in the midst of a wave of excitement around AI such as hasn’t been seen for a few decades. But those previous periods of inflated expectations led to troughs of disappointment. Will this time be different? Most likely. Applications of AI such as predictive analytics are already decreasing costs and improving reliability of industrial machinery. Furthermore, the funding and research going into AI now comes from a wide range of com...
SYS-CON Events announced today that Silicon India has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Published in Silicon Valley, Silicon India magazine is the premiere platform for CIOs to discuss their innovative enterprise solutions and allows IT vendors to learn about new solutions that can help grow their business.
SYS-CON Events announced today that Ayehu will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place on October 31 - November 2, 2017 at the Santa Clara Convention Center in Santa Clara California. Ayehu provides IT Process Automation & Orchestration solutions for IT and Security professionals to identify and resolve critical incidents and enable rapid containment, eradication, and recovery from cyber security breaches. Ayehu provides customers greater control over IT infras...
SYS-CON Events announced today that MobiDev, a client-oriented software development company, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. MobiDev is a software company that develops and delivers turn-key mobile apps, websites, web services, and complex software systems for startups and enterprises. Since 2009 it has grown from a small group of passionate engineers and business...
SYS-CON Events announced today that Conference Guru has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. A valuable conference experience generates new contacts, sales leads, potential strategic partners and potential investors; helps gather competitive intelligence and even provides inspiration for new products and services. Conference Guru works with conference organi...
SYS-CON Events announced today that GrapeUp, the leading provider of rapid product development at the speed of business, will exhibit at SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Grape Up is a software company, specialized in cloud native application development and professional services related to Cloud Foundry PaaS. With five expert teams that operate in various sectors of the market acr...
In this presentation, Striim CTO and founder Steve Wilkes will discuss practical strategies for counteracting fraud and cyberattacks by leveraging real-time streaming analytics. In his session at @ThingsExpo, Steve Wilkes, Founder and Chief Technology Officer at Striim, will provide a detailed look into leveraging streaming data management to correlate events in real time, and identify potential breaches across IoT and non-IoT systems throughout the enterprise. Strategies for processing massive ...
Internet of @ThingsExpo, taking place October 31 - November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA, is co-located with 21st Cloud Expo and will feature technical sessions from a rock star conference faculty and the leading industry players in the world. The Internet of Things (IoT) is the most profound change in personal and enterprise IT since the creation of the Worldwide Web more than 20 years ago. All major researchers estimate there will be tens of billions devic...
"MobiDev is a Ukraine-based software development company. We do mobile development, and we're specialists in that. But we do full stack software development for entrepreneurs, for emerging companies, and for enterprise ventures," explained Alan Winters, U.S. Head of Business Development at MobiDev, in this SYS-CON.tv interview at 20th Cloud Expo, held June 6-8, 2017, at the Javits Center in New York City, NY.
SYS-CON Events announced today that Cloud Academy named "Bronze Sponsor" of 21st International Cloud Expo which will take place October 31 - November 2, 2017 at the Santa Clara Convention Center in Santa Clara, CA. Cloud Academy is the industry’s most innovative, vendor-neutral cloud technology training platform. Cloud Academy provides continuous learning solutions for individuals and enterprise teams for Amazon Web Services, Microsoft Azure, Google Cloud Platform, and the most popular cloud com...
SYS-CON Events announced today that CA Technologies has been named "Platinum Sponsor" of SYS-CON's 21st International Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. CA Technologies helps customers succeed in a future where every business - from apparel to energy - is being rewritten by software. From planning to development to management to security, CA creates software that fuels transformation for companies in the applic...
The current age of digital transformation means that IT organizations must adapt their toolset to cover all digital experiences, beyond just the end users’. Today’s businesses can no longer focus solely on the digital interactions they manage with employees or customers; they must now contend with non-traditional factors. Whether it's the power of brand to make or break a company, the need to monitor across all locations 24/7, or the ability to proactively resolve issues, companies must adapt to...
SYS-CON Events announced today that TMC has been named “Media Sponsor” of SYS-CON's 21st International Cloud Expo and Big Data at Cloud Expo, which will take place on Oct 31 – Nov 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Global buyers rely on TMC’s content-driven marketplaces to make purchase decisions and navigate markets. Learn how we can help you reach your marketing goals.
Amazon started as an online bookseller 20 years ago. Since then, it has evolved into a technology juggernaut that has disrupted multiple markets and industries and touches many aspects of our lives. It is a relentless technology and business model innovator driving disruption throughout numerous ecosystems. Amazon’s AWS revenues alone are approaching $16B a year making it one of the largest IT companies in the world. With dominant offerings in Cloud, IoT, eCommerce, Big Data, AI, Digital Assista...
We build IoT infrastructure products - when you have to integrate different devices, different systems and cloud you have to build an application to do that but we eliminate the need to build an application. Our products can integrate any device, any system, any cloud regardless of protocol," explained Peter Jung, Chief Product Officer at Pulzze Systems, in this SYS-CON.tv interview at @ThingsExpo, held November 1-3, 2016, at the Santa Clara Convention Center in Santa Clara, CA
SYS-CON Events announced today that IBM has been named “Diamond Sponsor” of SYS-CON's 21st Cloud Expo, which will take place on October 31 through November 2nd 2017 at the Santa Clara Convention Center in Santa Clara, California.
Multiple data types are pouring into IoT deployments. Data is coming in small packages as well as enormous files and data streams of many sizes. Widespread use of mobile devices adds to the total. In this power panel at @ThingsExpo, moderated by Conference Chair Roger Strukhoff, panelists looked at the tools and environments that are being put to use in IoT deployments, as well as the team skills a modern enterprise IT shop needs to keep things running, get a handle on all this data, and deliver...
SYS-CON Events announced today that Enzu will exhibit at SYS-CON's 21st Int\ernational Cloud Expo®, which will take place October 31-November 2, 2017, at the Santa Clara Convention Center in Santa Clara, CA. Enzu’s mission is to be the leading provider of enterprise cloud solutions worldwide. Enzu enables online businesses to use its IT infrastructure to their competitive advantage. By offering a suite of proven hosting and management services, Enzu wants companies to focus on the core of their ...