#9 – Canonical’s cloud-init saves you from image soup, on every cloud
Thursday, May 29th, 2014This is a series of posts on reasons to choose Ubuntu for your public or private cloud work & play.
We run an extensive program to identify issues and features that make a difference to cloud users. One result of that program is that we pioneered dynamic image customisation and wrote cloud-init. I’ll tell the story of cloud-init as an illustration of the focus the Ubuntu team has on making your devops experience fantastic on any given cloud.
Ever struggled to find the “right” image to use on your favourite cloud? Ever wondered how you can tell if an image is safe to use, what keyloggers or other nasties might be installed? We set out to solve that problem a few years ago and the resulting code, cloud-init, is one of the more visible pieces Canonical designed and built, and very widely adopted.
Traditionally, people used image snapshots to build a portfolio of useful base images. You’d start with a bare OS, add some software and configuration, then snapshot the filesystem. You could use those snapshots to power up fresh images any time you need more machines “like this one”. And that process works pretty amazingly well. There are hundreds of thousands, perhaps millions, of such image snapshots scattered around the clouds today. It’s fantastic. Images for every possible occasion! It’s a disaster. Images with every possible type of problem.
The core issue is that an image is a giant binary blob that is virtually impossible to audit. Since it’s a snapshot of an image that was running, and to which anything might have been done, you will need to look in every nook and cranny to see if there is a potential problem. Can you afford to verify that every binary is unmodified? That every configuration file and every startup script is safe? No, you can’t. And for that reason, that whole catalogue of potential is a catalogue of potential risk. If you wanted to gather useful data sneakily, all you’d have to do is put up an image that advertises itself as being good for a particular purpose and convince people to run it.
There are other issues, even if you create the images yourself. Each image slowly gets out of date with regard to security updates. When you fire it up, you need to apply all the updates since the image was created, if you want a secure machine. Eventually, you’ll want to re-snapshot for a more up-to-date image. That requires administration overhead and coordination, most people don’t do it.
That’s why we created cloud-init. When your virtual machine boots, cloud-init is run very early. It looks out for some information you send to the cloud along with the instruction to start a new machine, and it customises your machine at boot time. When you combine cloud-init with the regular fresh Ubuntu images we publish (roughly every two weeks for regular updates, and whenever a security update is published), you have a very clean and elegant way to get fresh images that do whatever you want. You design your image as a script which customises the vanilla, base image. And then you use cloud-init to run that script against a pristine, known-good standard image of Ubuntu. Et voila! You now have purpose-designed images of your own on demand, always built on a fresh, secure, trusted base image.
Auditing your cloud infrastructure is now straightforward, because you have the DNA of that image in your script. This is devops thinking, turning repetitive manual processes (hacking and snapshotting) into code that can be shared and audited and improved. Your infrastructure DNA should live in a version control system that requires signed commits, so you know everything that has been done to get you where you are today. And all of that is enabled by cloud-init. And if you want to go one level deeper, check out Juju, which provides you with off-the-shelf scripts to customise and optimise that base image for hundreds of common workloads.