Canonical just announced a new, free, and very cool way to provide thousands of IP addresses to each of your VMs on AWS. Check out the fan networking on Ubuntu wiki page to get started, or read Dustin’s excellent fan walkthrough. Carry on here for a simple description of this happy little dose of awesome.

Containers are transforming the way people think about virtual machines (LXD) and apps (Docker). They give us much better performance and much better density for virtualisation in LXD, and with Docker, they enable new ways to move applications between dev, test and production. These two aspects of containers – the whole machine container and the process container – are perfectly complementary. You can launch Docker process containers inside LXD machine containers very easily. LXD feels like KVM, only faster; Docker feels like the core unit of a PAAS.

The density numbers are pretty staggering. It’s *normal* to run hundreds of containers on a laptop.

And that is what creates one of the real frustrations of the container generation, which is a shortage of easily accessible IP addresses.

It seems weird that in this era of virtual everything, a number is hard to come by. The restrictions are real, however, because AWS artificially restricts the number of IP addresses you can bind to an interface on your VM. You have to buy a bigger VM to get more IP addresses, even if you don’t need the extra compute. Also, IPv6 is nowhere to be seen on the clouds, so addresses are more scarce than they need to be in the first place.

So the key problem is that you want to find a way to get tens or hundreds of IP addresses allocated to each VM.

Most workarounds to date have involved “overlay networking”. You make a database in the cloud to track which IP address is attached to which container on each host VM. You then create tunnels between all the hosts so that everything can talk to everything. This works, kinda. It results in a mess of tunnels and much more complex routing than you would otherwise need. It also ruins performance for things like multicast and broadcast, because those are now exploding off through a myriad of twisty tunnels, all looking the same.

The Fan is Canonical’s answer to the container networking challenge.

We recognised that container networking is unusual, and quite unlike true software-defined networking, in that the number of containers you want on each host is fairly predictable and roughly the same everywhere. You want to run a couple of hundred containers on each VM. You also don’t (in the Docker case) want to live migrate them around; you just kill them and start them again elsewhere. Essentially, what you need is an address multiplier – anywhere you have one interface, it would be handy to have 250 of them instead.

So we came up with the “fan”. It’s called that because you can picture it as a fan behind each of your existing IP addresses, with another 250 IP addresses available. Anywhere you have an IP you can make a fan, and every fan gives you 250x the IP addresses. More than that, you can run multiple fans, so each IP address could stand in front of thousands of container IP addresses.

We use standard IPv4 addresses, just like overlays. What we do that’s new is allocate those addresses mathematically, with an algorithmic projection from your existing subnet / network range to the expanded range. That results in a very flat address structure – you get exactly the same number of overlay addresses for each IP address on your network, perfect for a dense container setup.

Because we’re mapping addresses algorithmically, we avoid any need for a database of overlay addresses per host. We can calculate instantly, with no database lookup, the host address for any given container address.
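
Because the mapping is pure arithmetic, you can sketch it in a few lines of shell. This is an illustration only (fanctl does the real work), using the 241.0.0.0/8-over-172.16.0.0/16 example from the configuration section below:

# Illustrative only: with a 241.0.0.0/8 overlay expanding a 172.16.0.0/16
# underlay, the two host octets of the underlay address select a /24 slice
# of the overlay, giving that host roughly 250 container addresses.
HOST_IP=172.16.3.4
IFS=. read -r _ _ c d <<< "$HOST_IP"       # bash; split out the host octets
echo "host $HOST_IP owns overlay range 241.$c.$d.1 - 241.$c.$d.250"
# Going the other way, any container address 241.X.Y.Z maps straight back
# to host 172.16.X.Y – no lookup table required.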

More importantly, we can route to these addresses much more simply, with a single route to the “fan” network on each host, instead of the maze of twisty network tunnels you might have seen with other overlays.

You can expand any network range with any other network range. The main idea, though, is that people will expand a class B range in their VPC with a class A range. Who has a class A range lying about? You do! It turns out that there are a couple of class A networks that are allocated and which publish no routes on the Internet.

We also plan to submit an IETF RFC for the fan, for address expansion. It turns out that “Class E” networking was reserved but never defined, and we’d like to think of that as a new “Expansion” class. There are several class A network addresses reserved for Class E, which won’t work on the Internet itself. While you can use the fan with unused class A addresses (and there are several good candidates for use!) it would be much nicer to do this as part of a standard.

The fan is available on Ubuntu on AWS and soon on other clouds, for your testing and container experiments! Feedback is most welcome while we refine the user experience.

Configuration on Ubuntu is super-simple. Here’s an example:

In /etc/network/fan:

# fan 241
241.0.0.0/8 172.16.3.0/16 dhcp

In /etc/network/interfaces:

iface eth0 inet static
address 172.16.3.4
netmask 255.255.0.0
up fanctl up 241.0.0.0/8 172.16.3.4/16
down fanctl down 241.0.0.0/8 172.16.3.4/16

This will map roughly 250 addresses on 241.0.0.0/8 to each of your 172.16.0.0/16 hosts.
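
A quick sanity check after bringing the interface up might look like this (the ip commands are standard; the fanctl subcommand is an assumption – consult fanctl’s help if your version differs):

ip addr       # the new fan bridge should appear with a 241.x.y.z address
ip route      # with a single route covering 241.0.0.0/8, as described above
fanctl show   # if your version provides it, lists the configured fans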

Docker, LXD and Juju integration is just as easy. For docker, edit /etc/default/docker.io, adding:

DOCKER_OPTS="-d -b fan-10-3-4 --mtu=1480 --iptables=false"

You must then restart docker.io:

sudo service docker.io restart

At this point, a Docker instance started via, e.g.,

docker run -it ubuntu:latest

will be run within the specified fan overlay network.
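
To confirm that a container really did pick up an address in the 241.0.0.0/8 overlay, standard Docker tooling of that era will do; for example (shown purely as an illustration, run from another terminal):

docker inspect --format '{{ .NetworkSettings.IPAddress }}' $(docker ps -lq)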

Enjoy!

This is a series of posts on reasons to choose Ubuntu for your public or private cloud work & play.

We run an extensive program to identify issues and features that make a difference to cloud users. One result of that program is that we pioneered dynamic image customisation and wrote cloud-init. I’ll tell the story of cloud-init as an illustration of the focus the Ubuntu team has on making your devops experience fantastic on any given cloud.

 

Ever struggled to find the “right” image to use on your favourite cloud? Ever wondered how you can tell if an image is safe to use, what keyloggers or other nasties might be installed? We set out to solve that problem a few years ago and the resulting code, cloud-init, is one of the more visible pieces Canonical designed and built, and very widely adopted.

Traditionally, people used image snapshots to build a portfolio of useful base images. You’d start with a bare OS, add some software and configuration, then snapshot the filesystem. You could use those snapshots to fire up fresh instances any time you need more machines “like this one”. And that process works pretty amazingly well. There are hundreds of thousands, perhaps millions, of such image snapshots scattered around the clouds today. It’s fantastic. Images for every possible occasion! It’s a disaster. Images with every possible type of problem.

The core issue is that an image is a giant binary blob that is virtually impossible to audit. Since it’s a snapshot of an image that was running, and to which anything might have been done, you will need to look in every nook and cranny to see if there is a potential problem. Can you afford to verify that every binary is unmodified? That every configuration file and every startup script is safe? No, you can’t. And for that reason, that whole catalogue of potential is a catalogue of potential risk. If you wanted to gather useful data sneakily, all you’d have to do is put up an image that advertises itself as being good for a particular purpose and convince people to run it.

There are other issues, even if you create the images yourself. Each image slowly gets out of date with regard to security updates. When you fire it up, you need to apply all the updates since the image was created if you want a secure machine. Eventually, you’ll want to re-snapshot for a more up-to-date image. That requires administration overhead and coordination, and most people don’t do it.

That’s why we created cloud-init. When your virtual machine boots, cloud-init is run very early. It looks out for some information you send to the cloud along with the instruction to start a new machine, and it customises your machine at boot time. When you combine cloud-init with the regular fresh Ubuntu images we publish (roughly every two weeks for regular updates, and whenever a security update is published), you have a very clean and elegant way to get fresh images that do whatever you want. You design your image as a script which customises the vanilla base image. And then you use cloud-init to run that script against a pristine, known-good standard image of Ubuntu. Et voilà! You now have purpose-designed images of your own on demand, always built on a fresh, secure, trusted base image.
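
For example, a user-data script handed to the cloud at launch time can be as small as this; cloud-init runs it once, on first boot, against the pristine base image (the package here is a hypothetical stand-in for your own stack):

#!/bin/sh
# Hypothetical first-boot customisation, executed by cloud-init as user-data.
apt-get update
apt-get -y upgrade           # pick up any security updates newer than the image
apt-get -y install nginx     # example package; this is where "your" image begins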

Auditing your cloud infrastructure is now straightforward, because you have the DNA of that image in your script. This is devops thinking, turning repetitive manual processes (hacking and snapshotting) into code that can be shared and audited and improved. Your infrastructure DNA should live in a version control system that requires signed commits, so you know everything that has been done to get you where you are today. And all of that is enabled by cloud-init. And if you want to go one level deeper, check out Juju, which provides you with off-the-shelf scripts to customise and optimise that base image for hundreds of common workloads.

Every detail matters, and building great software means taking time to remove the papercuts. Ubuntu has over the past 5 years been refined in many ways to feel amazingly comfortable on the cloud. In the very early days of EC2 growth the Ubuntu team recognised how many developers were enjoying fast access to infrastructure on demand, and we set about polishing up Ubuntu to be amazing on the cloud.

This was a big program of work; the Linux experience had many bad assumptions baked in – everything had been designed to be installed once on a server then left largely untouched for as long as possible, but cloud infrastructure was much more dynamic than that.

We encouraged our team to use the cloud as much as possible, which made the work practical and motivated people to get it right themselves. If you want to catch all the little scratchy bits, make it part of your everyday workflow. Today, we have added OpenStack clouds to the mix, as well as the major public clouds. Cloud vendors have taken diverse approaches to IAAS so we find ourselves encouraging developers to use all of them to get a holistic view, and also to address any cloud-specific issues that arise. But the key point is – if it’s great for us, that’s a good start on making it great for everybody.

Then we set about interviewing cloud users and engaging people who were deep into cloud infrastructure to advise on what they needed. We spent a lot of time immersing ourselves in the IAAS experience through the eyes of cloud users – startups and industrial titans, universities and mid-sized, everyday companies. We engaged the largest and fastest-moving cloud users like Netflix, who have said they enjoy Ubuntu as a platform on the cloud. And that in turn drove our prioritisation of paper-cuts and significant new features for cloud users.

We also looked at the places people actually spend time developing. Lots of them are on Ubuntu desktops, but Windows and MacOS are popular too, and it takes some care to make it very easy for folks there to have a great devops experience.

All of this is an industrial version of the user experience design process that also powers our work on desktop, tablet and phone – system interfaces and applications. Devops, sysadmins, developers and their managers are humans too, so human-centric design principles are just as important on the infrastructure as they are on consumer electronics and consumer software. Feeling great at the command line, being productive as an operator and a developer, are vital to our community and our ecosystem. We keep all the potency of Linux with the polish of a refined, designed environment.

Along the way we invented and designed a whole raft of key new pieces of Ubuntu. I’ll write about one of them, cloud-init, next. The net effect of that work makes Ubuntu really useful on every cloud. That’s why the majority of developers using IAAS do so on Ubuntu.

Ubuntu in 2013

Wednesday, December 26th, 2012

This is a time of year to ponder what matters most and choose what we’ll focus on in the year to come. Each of us has our own priorities and perspective, so your goals may be very different to mine. Nevertheless, for everyone in the Ubuntu project, here’s what I’ll be working towards in the coming year, and why.

First, what matters most?

It matters that we not exclude people from our audience. From the artist making scenes for the next blockbuster, to the person who needs a safe way to surf the web once a day, it’s important to me, and to the wider Ubuntu community, that people be able to derive some benefit from our efforts. Some of that benefit might be oblique – when someone prefers XFCE to Unity, they are still benefiting from enormous efforts by hundreds of people to make the core Ubuntu platform, as well as the Xubuntu team’s unique flourish. Even in the rare case where the gift is received ungraciously, the joy is in the giving, and it matters that our efforts paid dividends for others.

In this sense, it matters most that we bring the benefits of free software to an audience which would not previously have had the confidence to be different. If you’ve been arguing over software licenses for the best part of 15 years then you would probably be fine with whatever came before Ubuntu. And perhaps the thing you really need is the ability to share your insights and experience with all the people in your life who wouldn’t previously have been able to relate to the things you care about. So we have that interest in common.

It matters that we make a platform which can be USED by anybody. That’s why we’ve invested so much into research and thinking about how people use their software, what kinds of tools they need handy access to, and what the future looks like. We know that there are plenty of smart people whose needs are well served by what existed in the past. We continue to maintain older versions of Ubuntu so that they can enjoy those tools on a stable platform. But we want to shape the future, which means exploring territory that is unfamiliar, uncertain and easy to criticise. And in this regard, we know, scientifically, that Ubuntu with Unity is better than anything else out there. That’s not to diminish the works of others, or the opinions of those that prefer something else; it’s to celebrate that the world of free software now has a face that will be friendly to anybody you care to recommend it to.

It also matters that we be relevant for the kinds of computing that people want to do every day.

That’s why Unity in 2013 will be all about mobile – bringing Ubuntu to phones and tablets. Shaping Unity to provide the things we’ve learned are most important across all form factors, beautifully. Broadening the Ubuntu community to include mobile developers who need new tools and frameworks to create mobile software. Defining new form factors that enable new kinds of work and play altogether. Bringing clearly into focus the driving forces that have shaped our new desktop into one facet of a bigger gem.

It’s also why we’ll push deeper into the cloud, making it even easier, faster and more cost-effective to scale out modern infrastructure on the cloud of your choice, or create clouds for your own consumption and commerce. Whether you’re building out a big data cluster or a super-scaled storage solution, you’ll get it done faster on Ubuntu than on any other platform, thanks to the amazing work of our cloud community. Whatever your UI of choice, having the same core tools and libraries from your phone to your desktop to your server and your cloud instances makes life infinitely easier. Consider it a gift from all of us at Ubuntu.

There will always be things that we differ on between ourselves, and those who want to define themselves by their differences to us on particular points. We can’t help them every time, or convince them of our integrity when it doesn’t suit their world view. What we can do is step back and look at that backdrop: the biggest community in free software, totally global, diverse in their needs and interests, but united in a desire to make it possible for anybody to get a high quality computing experience that is first class in every sense. Wow. Thank you. That’s why I’ll devote most of my time and energy to bringing that vision to fruition. Here’s to a great 2013.

Microsoft has built an impressive new entrant to the Infrastructure-as-a-Service market, and Ubuntu is there for customers who want to run workloads on Azure that are best suited to Linux. Windows Azure was built for the enterprise market, an audience which is increasingly comfortable with Ubuntu as a workhorse for scale-out workloads; in short, it’s a good fit for both of us, and it’s been interesting to do the work to bring Ubuntu to the platform.

Given that it’s normal for us to spin up 2,000-node Hadoop clusters with Juju, it will be very valuable to have a new enterprise-oriented cloud with which to evaluate performance, latency, reliability, scalability and many other key metrics for production deployment scenarios.

As IAAS grows in recognition as a standard part of the enterprise toolkit, it will be important to have a wide range of infrastructures that are addressable, with diverse strengths. In the case of Windows Azure, there is clearly a deep connection between Windows-based IT and the new IAAS. But I think Microsoft has set their sights on a bigger story, which is high-quality enterprise-oriented infrastructure that is generally useful. That’s why Ubuntu is important to them, and why it was worthwhile for us to work together despite our differences. Just as we need to ensure that customers can run Ubuntu and Windows together inside their data centre and on the LAN, we want to ensure that cloud workloads play nicely.

The team leading Azure has a sophisticated understanding of Ubuntu and Linux in general. They are taking a pragmatic approach that will raise eyebrows around the Redmond campus, but is exactly what customers want to see. We have taken a similar view. I know there will be members of the free software community that will leap at the chance to berate Microsoft for its very existence, but it’s not very Ubuntu to do so: let’s argue our perspective, work towards our goals, be open to those who are open to us, and build great stuff. There is nothing proprietary in Ubuntu-for-Azure, and no about-turn from us on long-held values. This is us making sure our audience, and especially the enterprise audience, can benefit from the work our community and Canonical do no matter where they want to do it.

Windows Azure IAAS is in beta. If you are using the cloud today, or interested in it, I highly recommend you try it out. There’s no better way to make yourself heard over there.

As we move from “tens” to “hundreds” to “thousands” of nodes in a typical data centre we need new tools and practices. This hyperscale story – of hyper-dense racks with wimpy nodes – is the big shift in the physical world which matches the equally big shift to cloud computing in the virtualised world. Ubuntu’s popularity in the cloud comes in part from being leaner, faster, more agile. And MAAS – Metal as a Service – is bringing that agility back to the physical world for hyperscale deployments.

Servers used to aspire to being expensive. Powerful. Big. We gave them names like “Hercules” or “Atlas”. The bigger your business, or the bigger your data problem, the bigger the servers you bought. It was all about being beefy – with brands designed to impress, like POWER and Itanium.

Things are changing.

Today, server capacity can be bought as a commodity, based on the total cost of compute: the cost per teraflop, factoring in space, time, electricity. We can get more power by adding more nodes to our clusters, rather than buying beefier nodes. We can increase reliability by doubling up, so services keep running when individual nodes fail. Much as RAID changed the storage game, this scale-out philosophy, pioneered by Google, is changing the server landscape.

In this hyperscale era, each individual node is cheap, wimpy and, by historical standards for critical computing, unreliable. But together, they’re unstoppable. The horsepower now resides in the cluster, not the node. Likewise, the reliability of the infrastructure now depends on redundancy, rather than heroic performances from specific machines. There is, as they say, safety in numbers.

We don’t even give hyperscale nodes proper names any more – ask “node-0025904ce794”. Of course, you can still go big with the cluster name. I’m considering “Mark’s Magnificent Mountain of Metal” – significantly more impressive than “Mark’s Noisy Collection of Fans in the Garage”, which is what Claire will probably call it. And that’s not the knicker-throwing kind of fan, either.

The catch to this massive multiplication in node density, however, is in the cost of provisioning. Hyperscale won’t work economically if every server has to be provisioned, configured and managed as if it were a Hercules or an Atlas. To reap the benefits, we need leaner provisioning processes. We need deployment tools to match the scale of the new physical reality.

That’s where Metal as a Service (MAAS) comes in. MAAS makes it easy to set up the hardware on which to deploy any service that needs to scale up and down dynamically – a cloud being just one example. It lets you provision your servers dynamically, just like cloud instances – only in this case, they’re whole physical nodes. “Add another node to the Hadoop cluster, and make sure it has at least 16GB RAM” is as easy as asking for it.
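
With Juju driving MAAS, that request looks something like the following – the charm name and constraint syntax are shown as assumptions for illustration, not a transcript:

juju deploy --constraints "mem=16G" hadoop    # ask for nodes with at least 16GB RAM
juju add-unit hadoop                          # add another node to the cluster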

With a simple web interface, you can add, commission, update and recycle your servers at will. As your needs change, you can respond rapidly, by adding new nodes and dynamically re-deploying them between services. When the time comes, nodes can be retired for use outside the MAAS.

As we enter an era in which Atom is as important in the data centre as Xeon, an operating system like Ubuntu makes even more sense. Its freedom from licensing restrictions, together with the labour-saving power of tools like MAAS, makes it cost-effective, finally, to deploy and manage hundreds of nodes at a time.

Here’s another way to look at it: Ubuntu is bringing cloud semantics to the bare metal world. What a great foundation for your IAAS.

Automated deployment of Ubuntu with Orchestra

Thursday, October 27th, 2011

 

Orchestra is one of the most exciting new capabilities in 11.10. It provides automated installation of Ubuntu across sets of machines. Typically, it’s used by people bringing up a cluster or farm of servers, but the way it’s designed makes it very easy to bring up rich services, where there may be a variety of different kinds of nodes that all need to be installed together.

There’s a long history of tools that have been popular at one time or another for automated installation. FAI is the one I knew best before Orchestra came along and I was interested in the rationale for a new tool, and the ways in which it would enhance the experience of people building clusters, clouds and other services at scale. Dustin provided some of that in his introduction to Orchestra, but the short answer is that Orchestra is savvy to the service orchestration model of Juju, which means that the intelligence distilled in Juju charms can easily be harnessed in any deployment that uses Orchestra on bare metal.

What’s particularly cool about THAT is that it unifies the new world of private cloud with the old approach of Linux deployment in a cluster. So, for example, Orchestra can be used to deploy Hadoop across 3,000 servers on bare metal, and that same Juju charm can also deploy Hadoop on AWS or an OpenStack cloud. And soon it should be possible to deploy Hadoop across n physical machines with automatic bursting to your private or favourite public cloud, all automatically built in. Brilliant. Kudos to the conductor 🙂

Private cloud is very exciting – and with Ubuntu 11.10 it’s really easy to set up a small cloud to kick the tires, then scale that up as needed for production. But there are still lots of reasons why you might want to deploy a service onto bare metal, and Orchestra is a neat way to do that while at the same time preparing for a cloud-oriented future, because the work done to codify policies or practices in the physical environment should be useful immediately in the cloud, too.

For 12.04 LTS, where supporting larger-scale deployments will be a key goal, Orchestra becomes a tool that every Ubuntu administrator will find useful. I bet it will be the focus of a lot of discussion at UDS next week, and a lot of work in this cycle.

Cloudy prognosis for mainframes

Monday, October 24th, 2011

The death of the mainframe is about as elusive as the year of the Linux desktop. But cloud computing might finally present a terminal opportunity, so to speak, to those stalwarts of big business computing, by providing a compelling answer to the twin stories of reliability and throughput that have always been highlights of the big iron pitch.

Advocates of big iron talk about reliability. But with public clouds, we’re learning how to build services that achieve very high levels of reliability despite having low individual node reliability. It doesn’t matter if a single node in the cloud fails – cloud-style architectures route around that damage and keep the overall service available. Just as we dial storage reliability up or down by designing RAID arrays for the right balance of performance and resilience to failure, you can dial service reliability up or down in the cloud by allowing for redundancy. That comes at a price, of course, but the price of an extra 9 is substantially lower when you tackle it cloud-style than when you try and achieve it on a single piece of hardware.
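
A back-of-the-envelope illustration, assuming independent failures: if a single cheap node is up 99% of the time, a pair of replicas is down together only 0.01 × 0.01 = 0.0001 of the time – roughly 99.99% availability, two extra nines bought with one inexpensive extra node rather than one heroic machine.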

The other big strength of big iron was always throughput. Customers will pay for it, so mainframe vendors were always happy to oblige them. But again, it’s hard to beat the throughput of a Hadoop cluster, and even harder to scale the throughput of a mainframe as cost-effectively as one can scale a private cloud infrastructure underneath Hadoop.

I’m not suggesting insurance companies will throw away their mainframes. They’re working, they’re paid for, so they’ll stick around. But the rapid adoption of cloud-based architectures is going to make it very difficult to consolidate future IT onto mainframes (something that happened in every prior generation) and is also going to reduce the incentive for doing so in the first place. After 20 years of imminent irrelevance, there’s finally a real reason to think their time is up.

Building clouds for fun and profit

Monday, September 19th, 2011

So you’d like to spin up an internal cloud for Hadoop or general development, shifting workloads from AWS to your own infrastructure, or prototyping some new cloud services?

Call Canonical’s cloud infrastructure design and consulting team.

There are a couple of scenarios that we’re focused on at the moment, where we can offer standardised engagements:

  • Telcos building out cloud infrastructures for public cloud services. These are aiming for specific markets based on geography or network topology – they have existing customers and existing networks, and a competitive advantage in handling outsourced infrastructure for companies that are well connected to them, as well as a jurisdictional advantage over the global public cloud providers.
  • Cloud infrastructure prototypes at a division or department level. These are mostly folk who want the elasticity and dynamic provisioning of AWS in a private environment, often to work on products that will go public on Rackspace or AWS in due course, or to demonstrate and evaluate the benefits of this sort of architecture internally.
  • Cloud-style legacy deployments. These are folk building out HPC-type clusters running dedicated workloads that are horizontally scaled but not elastic. Big Hadoop deployments, or Condor deployments, fall into this category.

Cloud has become something of a unifying theme in many of our enterprise and server-oriented conversations in the past six months. While not everyone is necessarily ready to shift their workloads to a dynamic substrate like Ubuntu Cloud Infrastructure (powered by OpenStack) it seems that most large-scale IT deployments are embracing cloud-style design and service architectures, even when they are deploying on the metal. So we’ve put some work into tools which can be used in both cloud and large-scale-metal environments, for provisioning and coordination.

With 12.04 LTS on the horizon, OpenStack exploding into the wider consciousness of cloud-savvy admins, and projects like Ceph and CloudFoundry growing in stature and capability, it’s proving to be a very dynamic time for IT managers and architects. Much as the early days of the web presented a great deal of hype and complexity and options, only to settle down into a few key standard practices and platforms, cloud infrastructure today presents a wealth of options and a paucity of clarity, from NoSQL choices through IAAS choices to PAAS choices. Over the next couple of months I’ll outline how we think the cloud stack will shape up. Our goal is to make that “clean, crisp, obvious” deployment Just Work, bringing simplicity to the cloud much as we strive to bring it to the desktop.

For the moment, though, it’s necessary to roll up sleeves and get hands a little dirty, so the team I mentioned previously has been busy bringing some distilled wisdom to customers embarking on their cloud adventures in a hurry. Most of these engagements started out as custom consulting and contract efforts, but there are now sufficient patterns that the team has identified a set of common practices and templates that help to accelerate the build-out for those typical scenarios, and packaged those up as a range of standard cloud building offerings.

 

Innovation and OpenStack: Lessons from HTTP

Thursday, September 8th, 2011

OpenStack is facing an important choice: does it define a new set of API’s, one of many such efforts in cloud infrastructure, or does it build around the existing AWS API’s? So far, OpenStack has had it both ways, with some new API work and also some AWS-based effort. I’m writing to make the case for a tighter definition of mission around the de facto standard infrastructure API’s of EC2, S3 and a few other elements of AWS.

What prompted this blog was my overhearing (or rather, seeing in an email on a list) the statement that cloud infrastructure projects like OpenStack, Eucalyptus and others should “innovate at the level of the API and infrastructure concepts”. I’m of the view that any projects which try to do so will fail and are not worth spending your or my time on. They are going to be about as successful as projects that try to reinvent HTTP to make it better/faster/cleaner/whatever. Which is to say – not successful at all, because no new protocol with the same conceptual goals will match the ecosystem that exists today around HTTP. There will of course be protocol innovation – the last word is never written – but for the web, it’s a done deal. All the proprietary and ad-hoc things that preceded HTTP have died, and good riddance. Similarly, cloud infrastructure will converge around a standard API which will be imperfect but real. Innovation is all about how that API is implemented, not which API it is.

Nobody would say the web server market lacks innovation. There are many, many different companies and communities that make and market web server solutions. And each of those is innovating in some way – focusing on a different audience, or trying a different approach. Yet that entire market is constrained by a public standard: HTTP, which evolves far more slowly than the products that implement it.

There are also a huge number of things that wrap themselves around HTTP, from cache accelerators to 3G content compressors; the standardisation of that thin layer has created a massive ecosystem and driven fantastic innovation, even as many of the core concepts that drove HTTP’s initial design have eroded or softened. For example, HTTP was relentlessly stateless, but we’ve added cookies and caching to address issues caused by that (at the time radical) design constraint.

Today, cloud infrastructure is looking for its HTTP. I think that standard already exists in de facto form today at AWS, with EC2, S3 and some of the credential mechanisms being essentially the core primitives of cloud infrastructure management. There is enormous room for innovation in cloud infrastructure *implementations*, even within the constraints of that minimalist API. The hackers and funders and leaders and advocates of OpenStack, and any number of other cloud infrastructure projects both open source and proprietary, would be better off figuring out how to leverage that standardisation than trying to compete with it, simply because no other API is likely to gain the sort of ecosystem we see around AWS today.
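
To make that concrete, here is a hedged sketch of what leveraging the standard looks like from a user’s chair: the same EC2-style client tools drive AWS or an EC2-compatible private cloud simply by switching endpoint and credentials (the private endpoint URL below is invented for illustration):

export EC2_URL=https://ec2.us-east-1.amazonaws.com    # public AWS
# export EC2_URL=http://cloud.example.internal:8773/services/Cloud    # an EC2-compatible private cloud
export EC2_ACCESS_KEY=...     # credentials for whichever cloud you target
export EC2_SECRET_KEY=...
euca-describe-instances       # same client, same API, different provider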

It’s true that those API’s would be better defined in a clean, independent forum analogous to the W3C than inside the boiler-room of development at any single cloud provider, but that’s a secondary issue. And over time, it can be engineered to work that way.

More importantly for the moment, those who make an authentic effort to fit into the AWS protocol standard immediately gain access to chunks of the AWS gene pool, effectively gratis. From services like RightScale to tools like ElasticFox, your cloud is going to be more familiar, more effective and more potent if it can ease the barriers to porting from AWS. No two implementations will magically Just Work, but the rough edges and gotchas can get ironed out much more easily if there is a clear standard and reference implementations. So the cost of “porting” will always be lower between clouds that have commonality by design or heritage.

For OpenStack itself, until that standard is codified, I would describe the most successful mission statement as “to be the reference, public-cloud-provider-scale implementation of cloud infrastructure compatible with the AWS core API’s”. That’s going to give all the public cloud providers who want to compete with Amazon the best result: they’ll be able to compete on service terms, while convincing early adopters that the move to their offering will be relatively painless. All it takes, really, is some humility and the wisdom to recognise the right place to innovate.

There will be many implementations of those core API’s. One or other will be the Apache, the “just start here” option. But it doesn’t matter so much which one that is, frankly. I think OpenStack has the best possible chance to be that, but only if they stick to this crisp mission and don’t allow themselves to be drawn into front-end differentiation for the sake of it. Should that happen, OpenStack will be vulnerable to another open source project which credibly aims to achieve the goals outlined here. Very vulnerable. Witness the ways in which Eucalyptus is rightly pointing out its superior AWS compatibility in comparison with OpenStack.

For the public cloud providers that hope to build on OpenStack, API differentiation is poison in a juicy steak. It looks tasty, but it’s going to cost you the race prematurely. There were lots of technical reasons why alternatives to Windows were *better*; they just failed to become de facto standards. As long as Amazon doesn’t package up AWS as an on-premise solution, it’s possible to establish a de facto standard around something else, but that something else (perhaps OpenStack) needs to be AWS-compatible in some meaningful way to get enough momentum to matter. That means there’s a window of opportunity to get this right, which is not going to stay open indefinitely. Either Amazon, or another open source project, could close that window on OpenStack’s fingers. And that would be a pity, since the community around OpenStack has tons of energy and goodwill. In order to succeed, it will need to channel that energy into innovation on the implementation, not into trying to redefine an existing standard.

Of course, all this would be much easier if there were a real HTTP-like standard defining those API’s. The web had the enormous advantage of being founded by Tim Berners-Lee, in an institution like CERN, with the vision to set up the W3C. In the case of today’s cloud infrastructure, there isn’t the same dynamic or set of motivations. Amazon’s position of vagueness on the AWS API’s is tactically perfect for them right now, and I would expect them to maintain that line while knowing full well there is no real proprietary claim in a public network API, and no real advantage to be had from claiming otherwise. What’s needed is simply to start codifying existing practice as a draft standard in a credible forum of experts, with a roadmap and the prospect of support from multiple vendors. I think that would be relatively easy to arrange, if we could get Rackspace, IBM and HP to sit down and commit to doing it. We already have HP and Rackspace at the OpenStack table, so the signs are encouraging.

A good standard would:

* be pragmatic about the fact that Amazon has already made a bunch of decisions we’ll live with for ever.
* have a commitment from folk like OpenStack and Eucalyptus to aim for compliance
* include a real automated functional test suite that becomes the interop benchmark of choice over time
* be open to participation by Amazon, though that will not likely come for some time
* be well documented and well managed, like HTTP and CSS and HTML
* not be run by the ITU or ISO

I’m quite willing to contribute resources to getting such a standard off the ground. Forget big consortiums or working groups or processes or lobbying forums, what’s needed are a few savvy folk who know AWS, Eucalyptus and OpenStack, together with a very few technical writers. Let me know if you’re interested.

Now, I started out by saying that I was writing to make the case for OpenStack to be focused on a particular area. It’s a bit cheeky for me to write anything of the sort, of course, because OpenStack is a well run project that has an excellent steering group, which recently held a poll of contributors to appoint some new members, none of whom was me. I’ve every confidence in the leadership of the project, despite the tremendous pressure they are under to realise the hopes of so many diverse users and companies. I’m optimistic about the potential OpenStack has to accelerate cloud technology, and at Canonical we put a considerable amount of effort into making OpenStack deployment a smooth experience for Ubuntu users and Canonical customers. Ubuntu Cloud Infrastructure now depends on OpenStack. And I have a few old friends who are also leaders in the OpenStack community, so for all those reasons I thought it worth making this perspective public.