In 1996 Regurgitator released a song called “Kong Foo Sing”. It starts with the line “Happiness is a Kong Foo Sing”, in reference to a particular brand of fortune cookie. But one night last week at the OpenStack Summit, I couldn’t help but think it would be better stated as “Happiness is a Hong Kong SIM”, because I’ve apparently become thoroughly addicted to my data connection.
I was there with five other SUSE engineers who work on SUSE Cloud (our OpenStack offering): Ralf Haferkamp, Michal Jura, Dirk Müller, Vincent Untz and Bernhard Wiedemann. We also had SUSE crew manning a booth which had one of those skill tester machines filled with plush Geekos. I didn’t manage to get one. Apparently my manual dexterity is less well developed than my hacking skills, because I did make ATC (Active Technical Contributor) thanks to a handful of openSUSE-related commits to TripleO (apologies for the shameless self-aggrandizement, but this is my blog after all).
Given this was my first design summit, I thought it most sensible to first attend “Design Summit 101”, to get a handle on the format. The summit as a whole is split into general sessions and design summit sessions, the former for everyone, the latter intended for developers to map out what needs to happen for the next release. There are also vendor booths in the main hall.
Roughly speaking, design sessions get a bunch of people together with a moderator/leader and an etherpad up on a projector, which anyone can edit. Then whatever the topic is gets hashed out over the next forty-odd minutes. It’s actually a really good format. In the sessions I was in, anyone who wanted to speak or had something to offer was heard. Everyone was courteous, and very welcoming of input and of newcomers. Actually, as I remarked on the last day towards the end of Joshua McKenty’s “Culture, Code, Community and Conway” talk, everyone is terrifyingly happy. And this is not normal, but it’s a good thing.
As I’ve been doing high availability and storage for the past several years, and have also spent time on SUSE porting and scalability work on Crowbar, I split my time largely between HA, storage and deployment sessions.
On the deployment front, I went to:
- HA/Production Configuration, where the pieces of OpenStack that TripleO needs to deploy in a highly available manner were discussed (actually this could have been discussed for a solid week 😉).
- Stable Branch Support and Update Futures, about updating images made for TripleO.
- An Evaluation of OpenStack Deployment Frameworks, where two guys from Symantec discussed the evaluation they’d done of Fuel, Juju/MAAS, Crowbar, Foreman and Rackspace Private Cloud. In short, nothing was perfect, but Crowbar 1.6 performed the best (i.e. met their requirements better than any of the other solutions tested).
- Roundtable: Deploying and Upgrading OpenStack.
- OpenStack’s Bare Metal Provisioning Service, wherein I attained a better understanding of Ironic.
- It’s Not Just An Unicorn, Updating Our Public Cloud Platform from Folsom to Grizzly – how eNovance manage upgrades. Automate all the things and test, test, test. Binary updates are done by rsyncing prepared trees, but everything can be rolled back and forward, because everything is in revision control (there’s a small sketch of the general idea after this list). It sounds like they’ve done a very thorough job in their environment. I’m less sure this technique is applicable in a generic fashion.
- The Road to Live Upgrades. Notably they want to add a live upgrade test as a commit gate.
- Hardware Management Ramdisk. Lots of work to do here for Ironic to deploy ramdisks that handle, e.g., firmware updates, RAID configuration, etc.
- Firmware Updates (followed right on from the previous session).
- Making Ironic Resilient to Failures (what do you do if your TFTP/PXE server goes away?)
- Compass – Yet Another OpenStack Deployment System, from Huawei, to be released as open source under the Apache 2.0 license “soon” (end of November). A layer on top of Chef, but with other configuration tools as pluggable modules. If you squint at it just right, I’d argue it’s not dissimilar to Crowbar, at least from a high level.
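As promised above, here’s roughly what I picture the rsync-a-prepared-tree approach looking like. To be clear, this is my own hypothetical sketch (the paths, layout and script are all made up by me), not eNovance’s actual tooling: each release is synced into its own timestamped directory, hardlinked against the previous one, and a “current” symlink is flipped atomically, so rolling back is just flipping it again.

```python
#!/usr/bin/env python
# Hypothetical sketch of "rsync a prepared tree, keep old releases around
# for rollback" -- not eNovance's actual tooling, just the general pattern.
import os
import subprocess
import time

RELEASES_DIR = "/srv/openstack/releases"   # assumed layout
CURRENT_LINK = "/srv/openstack/current"    # services run from here


def deploy(prepared_tree):
    """Sync a prepared tree into a timestamped release dir, then flip the symlink."""
    os.makedirs(RELEASES_DIR, exist_ok=True)
    release = os.path.join(RELEASES_DIR, time.strftime("%Y%m%d-%H%M%S"))
    previous = os.path.realpath(CURRENT_LINK) if os.path.exists(CURRENT_LINK) else None
    # --link-dest hardlinks unchanged files against the previous release,
    # so keeping many releases around for rollback is cheap.
    cmd = ["rsync", "-a", "--delete"]
    if previous:
        cmd.append("--link-dest=" + previous)
    cmd += [prepared_tree.rstrip("/") + "/", release + "/"]
    subprocess.check_call(cmd)
    switch_to(release)


def switch_to(release):
    """Atomically repoint the 'current' symlink (rollback is the same operation)."""
    tmp = CURRENT_LINK + ".new"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(release, tmp)
    os.rename(tmp, CURRENT_LINK)   # renaming over a symlink is atomic on POSIX


if __name__ == "__main__":
    deploy("/srv/openstack/prepared/havana")
```

The real trick, of course, is everything around this: keeping the prepared trees and configuration in revision control, and testing every step, which is where eNovance seem to have put most of their effort.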
On High Availability:
- Practical Lessons from Building a Highly Available OpenStack Private Cloud (Ceph for all storage, HA via four separate Pacemaker clusters; notably the cluster running compute can scale out by just adding more nodes).
- High Availability Update: Havana and Icehouse, wherein I attempted to look scary sitting in the front row wearing my STONITH Deathmatch t-shirt. I hope Florian and Syed will forgive my butchering their talk by summarizing it as: if you’re using MySQL, you want Galera. RabbitMQ still has consistency issues with mirrored queues, and there can be only one Neutron L3 agent, so you need Pacemaker for those at least, which means using Pacemaker to "HA all the things" is still an eminently reasonable approach (haproxy is great for load balancing, but no good if you have a service that’s fundamentally active/passive). Use Ceph for all your storage. There’s a small Galera health check sketch after this list.
- Database Clusters as a Service in OpenStack: Integrated, Scalable, Highly Available and Secure. Focused on MySQL/MariaDB/Percona, Galera and variants thereof, which combinations are supported by Rackspace, HP Cloud and Amazon, and various deployment considerations (including replication across data centers).
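And since I keep saying “you want Galera”, here’s the sort of health check I mean — the kind of thing you’d wire into haproxy or a Pacemaker resource so traffic only ever hits a node that’s actually synced. This is a minimal sketch of my own, assuming PyMySQL and a made-up monitoring user, not anything shown in the talks.

```python
#!/usr/bin/env python
# Minimal Galera health check, in the spirit of the usual "clustercheck"
# scripts.  The credentials are made up; use a dedicated monitoring user.
import sys

import pymysql


def galera_is_healthy(host="127.0.0.1", user="monitor", password="secret"):
    conn = pymysql.connect(host=host, user=user, password=password)
    try:
        with conn.cursor() as cur:
            cur.execute("SHOW STATUS LIKE 'wsrep_%'")
            status = dict(cur.fetchall())
    finally:
        conn.close()
    # Only a node that is part of the primary component *and* fully synced
    # with the rest of the cluster should be taking traffic.
    return (status.get("wsrep_cluster_status") == "Primary"
            and status.get("wsrep_local_state_comment") == "Synced")


if __name__ == "__main__":
    sys.exit(0 if galera_is_healthy() else 1)
```

Exit status zero means “send traffic here”; anything else means the load balancer or cluster manager should look elsewhere.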
On Storage:
- Encrypted Block Storage: Technical Walkthrough. This looks pretty neat. Crypto is done on the compute host via dm-crypt, so everything is encrypted in the volume store and even over the wire going to and from the compute host (there’s a little cryptsetup-level sketch of this idea after this list). Still needs work (naturally); notably it currently uses a single static key. Later, it will use Barbican for key management.
- Swift Drive Workloads and Kinetic Open Storage. Sadly I had to skip out of this one early, but Seagate now have an interesting product: disks (and some enclosures) which present themselves as key/value stores over Ethernet, rather than as block devices. The idea here is that you remove a whole lot of layers of the storage stack to try to get better performance.
- Real World Usage of GlusterFS + OpenStack. Interesting history of the project, what the pieces are, and how they now provide an “all-in-one” storage solution for OpenStack.
- Ceph: The De Facto Storage Backend for OpenStack. It was inevitable that this would go back-to-back with a GlusterFS presentation. All storage components (Glance, Cinder, object store) unified. Interestingly, the libvirt_image_type=rbd option lets you directly boot all VMs from Ceph (at least if you’re using KVM). Is it the perfect stack? “Almost” (Glance images are still copied around more than they should be, but there’s a patch for this floating around somewhere; also some snapshot integration work is still necessary).
- Sheepdog: Yet Another All-In-One Storage for OpenStack. So everyone is doing all-in-one storage for OpenStack now 😉 I haven’t spent any time with Sheepdog in the past, so this was interesting. It apparently tries to make minimal assumptions about the underlying kernel and filesystem, yet supports thousands of nodes, is purportedly fast and small (<50MB memory footprint), and consists of only 35K lines of C code.
- Ceph OpenStack Integration Unconference (gathering ideas to improve Ceph integration in OpenStack).
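Finally, the cryptsetup-level sketch I promised for the encrypted block storage session. This is not Nova’s actual encryptor code — the device paths and key file are invented, and the real implementation will get its keys from Barbican eventually — it’s just the underlying dm-crypt mechanism, to show why the backend and the wire only ever see ciphertext.

```python
#!/usr/bin/env python
# Conceptual sketch of dm-crypt on the compute host: open a LUKS mapping
# over the attached volume and hand the *decrypted* device to the guest,
# so the volume store and the wire only ever see ciphertext.
import os
import subprocess


def attach_encrypted(volume_dev, name, key_file):
    """Format (once) and open a LUKS mapping over an attached volume."""
    # luksFormat is destructive; only do it for brand new volumes.
    if subprocess.call(["cryptsetup", "isLuks", volume_dev]) != 0:
        subprocess.check_call(["cryptsetup", "luksFormat", "--batch-mode",
                               "--key-file", key_file, volume_dev])
    subprocess.check_call(["cryptsetup", "luksOpen", "--key-file", key_file,
                           volume_dev, name])
    return os.path.join("/dev/mapper", name)   # this is what the VM gets


def detach_encrypted(name):
    """Tear the mapping down again when the volume is detached."""
    subprocess.check_call(["cryptsetup", "luksClose", name])


if __name__ == "__main__":
    dev = attach_encrypted("/dev/vdb", "secure-vol", "/etc/nova/volume.key")
    print("give %s to the guest instead of the raw volume" % dev)
```

The single static key file above is exactly the limitation mentioned in the session; per-volume keys fetched from Barbican are the obvious next step.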
Around all this of course were many interesting discussions, meals and drinks with all sorts of people; my immediate colleagues, my some-time partners in crime, various long-time conference buddies and an assortment of delightful (and occasionally crazy) new acquaintances. If you’ve made it this far and haven’t been to an OpenStack summit yet, try to get to Atlanta in six months or Paris in a year. I don’t know yet whether or not I’ll be there, but I can pretty much guarantee you’ll still have a good time.