Hello Salty Goodness

Anyone who’s ever deployed Ceph presumably knows about ceph-deploy. It’s right there in the Deployment chapter of the upstream docs, and it’s pretty easy to use to get a toy test cluster up and running. For any decent-sized cluster though, ceph-deploy rapidly becomes cumbersome. As just one example, do you really want to have to run `ceph-deploy osd prepare` for every disk? For larger production clusters it’s almost certainly better to use a fully-fledged configuration management tool, such as Salt, which is what this post is about.


Thunderbird Uses OpenGL – Who Knew?

I have a laptop and a desktop system (as well as a bunch of other crap, but let’s ignore that for a moment). Both laptop and desktop are running openSUSE Tumbleweed. I’m usually in front of my desktop, with dual screens, a nice keyboard and trackball, and the laptop is sitting with the lid closed tucked away under the desk. Importantly, the laptop is where my mail client lives. When I’m at my desk, I ssh from desktop to laptop with X forwarding turned on, then fire up Thunderbird, and it appears on my desktop screen. When I go travelling, I take the laptop with me, and I’ve still got my same email client, same settings, same local folders. Easy. Those of you considering heckling me for not using $any_other_mail_client and/or $any_other_environment, please save it for later.

Yesterday I had an odd problem. A new desktop system arrived, so I installed Tumbleweed, eventually ssh’d to my laptop, started Thunderbird, and…

# thunderbird

…nothing happened. There’s usually a little bit of junk on the console at that point, and the Thunderbird window should have appeared on my desktop screen. But it didn’t. strace showed it stuck in a loop, waiting for something:

wait4(22167, 0x7ffdfc669be4, 0, NULL)   = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=22164, si_uid=1000} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
wait4(22167, 0x7ffdfc669be4, 0, NULL)   = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=22164, si_uid=1000} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)
wait4(22167, 0x7ffdfc669be4, 0, NULL)   = ? ERESTARTSYS (To be restarted if SA_RESTART is set)
--- SIGVTALRM {si_signo=SIGVTALRM, si_code=SI_TKILL, si_pid=22164, si_uid=1000} ---
rt_sigreturn({mask=[]})                 = -1 EINTR (Interrupted system call)

After an assortment of random dead ends (ancient and useless bug reports about Thunderbird and Firefox failing to run over remote X sessions), I figured I may as well attach a debugger to see if I could get any more information:

# gdb -p 22167
GNU gdb (GDB; openSUSE Tumbleweed) 7.11
[...]
Attaching to process 22167
Reading symbols from /usr/lib64/thunderbird/thunderbird-bin...
[...]
0x00007f2e95331a1d in poll () from /lib64/libc.so.6
(gdb) break
Breakpoint 1 at 0x7f2e95331a1d
(gdb) bt
#0 0x00007f2e95331a1d in poll () from /lib64/libc.so.6
#1 0x00007f2e8730b410 in ?? () from /usr/lib64/libxcb.so.1
#2 0x00007f2e8730cecf in ?? () from /usr/lib64/libxcb.so.1
#3 0x00007f2e8730cfe2 in xcb_wait_for_reply () from /usr/lib64/libxcb.so.1
#4 0x00007f2e86ecc845 in ?? () from /usr/lib64/libGL.so.1
#5 0x00007f2e86ec74b8 in ?? () from /usr/lib64/libGL.so.1
#6 0x00007f2e86e9a2a9 in ?? () from /usr/lib64/libGL.so.1
#7 0x00007f2e86e9654b in ?? () from /usr/lib64/libGL.so.1
#8 0x00007f2e86e966b3 in glXChooseVisual () from /usr/lib64/libGL.so.1
#9 0x00007f2e90fa0d6f in glxtest () at /usr/src/debug/thunderbird/mozilla/toolkit/xre/glxtest.cpp:230
#10 0x00007f2e90fa1003 in fire_glxtest_process () at /usr/src/debug/thunderbird/mozilla/toolkit/xre/glxtest.cpp:333
#11 0x00007f2e90f9b4cd in XREMain::XRE_mainInit (this=this@entry=0x7ffdfc66c448, aExitFlag=aExitFlag@entry=0x7ffdfc66c3ef) at /usr/src/debug/thunderbird/mozilla/toolkit/xre/nsAppRunner.cpp:3134
#12 0x00007f2e90f9ee27 in XREMain::XRE_main (this=this@entry=0x7ffdfc66c448, argc=argc@entry=1, argv=argv@entry=0x7ffdfc66d958, aAppData=aAppData@entry=0x7ffdfc66c648)
at /usr/src/debug/thunderbird/mozilla/toolkit/xre/nsAppRunner.cpp:4362
#13 0x00007f2e90f9f0f2 in XRE_main (argc=1, argv=0x7ffdfc66d958, aAppData=0x7ffdfc66c648, aFlags=) at /usr/src/debug/thunderbird/mozilla/toolkit/xre/nsAppRunner.cpp:4484
#14 0x00000000004054c8 in do_main (argc=argc@entry=1, argv=argv@entry=0x7ffdfc66d958, xreDirectory=0x7f2e9504a9c0) at /usr/src/debug/thunderbird/mail/app/nsMailApp.cpp:195
#15 0x0000000000404c4a in main (argc=1, argv=0x7ffdfc66d958) at /usr/src/debug/thunderbird/mail/app/nsMailApp.cpp:332
(gdb) continue
[Inferior 1 (process 22167) exited with code 01]

OK, so it’s libGL that’s waiting for something. Why is my mail client trying to do stuff with OpenGL?

Hang on! When I told gdb to continue, suddenly Thunderbird appeared, running properly, on my desktop display. WTF?

As far as I can tell, the problem is that my new desktop system has an NVIDIA GPU (nouveau drivers, BTW), and my laptop and previous desktop system both have Intel GPUs. Something about ssh’ing from the desktop with the NVIDIA GPU to the laptop with the Intel GPU causes Thunderbird (and, indeed, any GL app; I also tried glxinfo and glxgears) to just wedge up completely. Whereas if I do the reverse (ssh from Intel GPU laptop to NVIDIA GPU desktop) and run GL apps, it works fine.

After some more Googling, I discovered I can make Thunderbird work properly over remote X like this:

# LIBGL_ALWAYS_INDIRECT=1 thunderbird

That will apparently cause glXCreateContext to return BadValue, which is enough to kick Thunderbird along. LIBGL_ALWAYS_SOFTWARE=1 works equally well to enable Thunderbird to function, while presumably still allowing it to use OpenGL if it really needs to for something (proof: LIBGL_ALWAYS_INDIRECT=1 glxgears fails, LIBGL_ALWAYS_SOFTWARE=1 glxgears gives me spinning gears).

I checked Firefox too, and it of course has the same remote X problem, and the same solution.
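To avoid typing the variable every time, the workaround can be wrapped in a small shell function. This is only a sketch: the function name remote_x_gl_env is made up, and it assumes that a set SSH_CONNECTION together with a DISPLAY of the form localhost:N means the session is ssh with X forwarding (the usual OpenSSH behaviour, but not guaranteed everywhere).

```shell
# Print the environment assignment needed for GL apps over ssh X forwarding,
# or nothing at all when the session looks local.
# Assumption: forwarded displays look like "localhost:10.0".
remote_x_gl_env() {
    if [ -n "${SSH_CONNECTION:-}" ]; then
        case "${DISPLAY:-}" in
            localhost:*) echo "LIBGL_ALWAYS_INDIRECT=1" ;;
        esac
    fi
}

# Usage (works for Thunderbird, Firefox or any other GL-probing app):
#   env $(remote_x_gl_env) thunderbird
```

When the function prints nothing, env just runs the program unchanged, so the same alias can be used locally and remotely.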

Salt and Pepper Squid with Fresh Greens

A few days ago I told Andrew Wafaa I’d write up some notes for him and publish them here. I became hungry contemplating this work, so decided cooking was the first order of business:


It turned out reasonably well for a first attempt. Could’ve been crispier, and it was quite salty, but the pepper and chilli definitely worked (I’m pretty sure the chilli was dried bhut jolokia I harvested last summer). But this isn’t a post about food, it’s about some software I’ve packaged for managing Ceph clusters on openSUSE and SUSE Linux Enterprise Server.


One More chef-client Run

Carrying on from my last post, the failed chef-client run came down to the init script in ceph 0.56 not yet knowing how to iterate /var/lib/ceph/{mon,osd,mds} and automatically start the appropriate daemons. This functionality seems to have been introduced in 0.58 or so by commit c8f528a. So I gave it another shot with a build of ceph 0.60.

On each of my ceph nodes, a bit of upgrading and cleanup. Note the choice of ceph 0.60 was mostly arbitrary, I just wanted the latest thing I could find an RPM for in a hurry. Also some of the rm invocations won’t be necessary, depending on what state things are actually in:

# zypper ar -f http://download.opensuse.org/repositories/home:/dalgaaf:/ceph:/extra/openSUSE_12.3/home:dalgaaf:ceph:extra.repo
# zypper ar -f http://gitbuilder.ceph.com/ceph-rpm-opensuse12-x86_64-basic/ref/next/x86_64/ ceph.com-next_openSUSE_12_x86_64
# zypper in ceph-0.60
# kill $(pidof ceph-mon)
# rm /etc/ceph/*
# rm /var/run/ceph/*
# rm -r /var/lib/ceph/*/*

That last gets rid of any half-created mon directories.

I also edited the Ceph environment to only have one mon (one of my colleagues rightly pointed out that you need an odd number of mons, and I had declared two previously, for no good reason). That’s knife environment edit Ceph on my desktop, and set "mon_initial_members": "ceph-0" instead of "ceph-0,ceph-1".
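The reason for the odd number is that monitors need a strict majority to form a quorum, so a second mon adds no failure tolerance over a single one. A small sketch of the arithmetic (the function names are made up for illustration):

```shell
# Count the entries in a comma-separated mon_initial_members string.
mon_count() {
    echo "$1" | tr ',' '\n' | grep -c .
}

# Number of monitors needed for a strict majority of n.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

# Number of monitors that can fail while a majority still survives.
tolerated_failures() {
    echo $(( $1 - ($1 / 2 + 1) ))
}
```

With these, tolerated_failures gives 0 for both one and two mons, and 1 for three, which is why "ceph-0,ceph-1" buys nothing over plain "ceph-0".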

I also had to edit each of the nodes to add an osd_devices array, and remove the mon role from ceph-1. That’s knife node edit ceph-0.example.com then insert:

  "normal": {
    ...
    "ceph": {
      "osd_devices": [  ]
    }
  ...

Without the osd_devices array defined, the osd recipe fails (“undefined method `each_with_index’ for nil:NilClass”). I was kind of hoping an empty osd_devices array would allow ceph to use the root partition. No such luck, the cookbook really does expect you to be doing a sensible deployment with actual separate devices for your OSDs. Oh, well. I’ll try that another time. For now at least I’ve demonstrated that ceph-0.60 does give you what appears to be a clean mon setup when using the upstream cookbooks on openSUSE 12.3:

knife ssh name:ceph-0.example.com -x root chef-client
[2013-04-15T06:32:13+00:00] INFO: *** Chef 10.24.0 ***
[2013-04-15T06:32:13+00:00] INFO: Run List is [role[ceph-mon], role[ceph-osd], role[ceph-mds]]
[2013-04-15T06:32:13+00:00] INFO: Run List expands to [ceph::mon, ceph::osd, ceph::mds]
[2013-04-15T06:32:13+00:00] INFO: HTTP Request Returned 404 Not Found: No routes match the request: /reports/nodes/ceph-0.example.com/runs
[2013-04-15T06:32:13+00:00] INFO: Starting Chef Run for ceph-0.example.com
[2013-04-15T06:32:13+00:00] INFO: Running start handlers
[2013-04-15T06:32:13+00:00] INFO: Start handlers complete.
[2013-04-15T06:32:13+00:00] INFO: Loading cookbooks [apache2, apt, ceph]
[2013-04-15T06:32:13+00:00] INFO: Processing template[/etc/ceph/ceph.conf] action create (ceph::conf line 6)
[2013-04-15T06:32:13+00:00] INFO: template[/etc/ceph/ceph.conf] updated content
[2013-04-15T06:32:13+00:00] INFO: template[/etc/ceph/ceph.conf] mode changed to 644
[2013-04-15T06:32:13+00:00] INFO: Processing service[ceph_mon] action nothing (ceph::mon line 23)
[2013-04-15T06:32:13+00:00] INFO: Processing execute[ceph-mon mkfs] action run (ceph::mon line 40)
creating /var/lib/ceph/tmp/ceph-ceph-0.mon.keyring
added entity mon. auth auth(auid = 18446744073709551615 key=AQC8umZRaDlKKBAAqD8li3u2JObepmzFzDPM3g== with 0 caps)
ceph-mon: mon.noname-a 192.168.4.118:6789/0 is local, renaming to mon.ceph-0
ceph-mon: set fsid to f80aba97-26c5-4aa3-971e-09c5a3afa32f
ceph-mon: created monfs at /var/lib/ceph/mon/ceph-ceph-0 for mon.ceph-0
[2013-04-15T06:32:14+00:00] INFO: execute[ceph-mon mkfs] ran successfully
[2013-04-15T06:32:14+00:00] INFO: execute[ceph-mon mkfs] sending start action to service[ceph_mon] (immediate)
[2013-04-15T06:32:14+00:00] INFO: Processing service[ceph_mon] action start (ceph::mon line 23)
[2013-04-15T06:32:15+00:00] INFO: service[ceph_mon] started
[2013-04-15T06:32:15+00:00] INFO: Processing ruby_block[tell ceph-mon about its peers] action create (ceph::mon line 64)
mon already active; ignoring bootstrap hint

[2013-04-15T06:32:16+00:00] INFO: ruby_block[tell ceph-mon about its peers] called
[2013-04-15T06:32:16+00:00] INFO: Processing ruby_block[get osd-bootstrap keyring] action create (ceph::mon line 79)
2013-04-15 06:32:16.872040 7fca8e297780 -1 monclient(hunting): authenticate NOTE: no keyring found; disabled cephx authentication
2013-04-15 06:32:16.872042 7fca8e297780 -1 unable to authenticate as client.admin
2013-04-15 06:32:16.872400 7fca8e297780 -1 ceph_tool_common_init failed.
[2013-04-15T06:32:18+00:00] INFO: ruby_block[get osd-bootstrap keyring] called
[2013-04-15T06:32:18+00:00] INFO: Processing package[gdisk] action upgrade (ceph::osd line 37)
[2013-04-15T06:32:27+00:00] INFO: package[gdisk] upgraded from uninstalled to 
[2013-04-15T06:32:27+00:00] INFO: Processing service[ceph_osd] action nothing (ceph::osd line 48)
[2013-04-15T06:32:27+00:00] INFO: Processing directory[/var/lib/ceph/bootstrap-osd] action create (ceph::osd line 67)
[2013-04-15T06:32:27+00:00] INFO: Processing file[/var/lib/ceph/bootstrap-osd/ceph.keyring.raw] action create (ceph::osd line 76)
[2013-04-15T06:32:27+00:00] INFO: entered create
[2013-04-15T06:32:27+00:00] INFO: file[/var/lib/ceph/bootstrap-osd/ceph.keyring.raw] owner changed to 0
[2013-04-15T06:32:27+00:00] INFO: file[/var/lib/ceph/bootstrap-osd/ceph.keyring.raw] group changed to 0
[2013-04-15T06:32:27+00:00] INFO: file[/var/lib/ceph/bootstrap-osd/ceph.keyring.raw] mode changed to 440
[2013-04-15T06:32:27+00:00] INFO: file[/var/lib/ceph/bootstrap-osd/ceph.keyring.raw] created file /var/lib/ceph/bootstrap-osd/ceph.keyring.raw
[2013-04-15T06:32:27+00:00] INFO: Processing execute[format as keyring] action run (ceph::osd line 83)
creating /var/lib/ceph/bootstrap-osd/ceph.keyring
added entity client.bootstrap-osd auth auth(auid = 18446744073709551615 key=AQAOl2tR0M4bMRAAatSlUh2KP9hGBBAP6u5AUA== with 0 caps)
[2013-04-15T06:32:27+00:00] INFO: execute[format as keyring] ran successfully
[2013-04-15T06:32:28+00:00] INFO: Chef Run complete in 14.479108446 seconds
[2013-04-15T06:32:28+00:00] INFO: Running report handlers
[2013-04-15T06:32:28+00:00] INFO: Report handlers complete

Witness:

ceph-0:~ # rcceph status
=== mon.ceph-0 === 
mon.ceph-0: running {"version":"0.60-468-g98de67d"}

On the note of building an easy-to-deploy Ceph appliance, assuming you’re not using Chef and just want something to play with, I reckon the way to go is to use config pretty similar to what would be deployed by this Chef cookbook, i.e. an absolute minimal /etc/ceph/ceph.conf, specifying nothing other than initial mons. Then use the various Ceph CLI tools to create mons and osds on each node, and just rely on the init script in Ceph >= 0.58 to do the right thing with what it finds (having to explicitly specify each mon, osd and mds in the Ceph config by name always bugged me). Bonus points for using csync2 to propagate /etc/ceph/ceph.conf across the cluster.
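As a rough illustration of how small that file can be, here is a sketch of a shell function that emits it. The function name minimal_ceph_conf is hypothetical; the three settings are the ones the cookbook writes into [global], and nothing here has been tested against a real cluster.

```shell
# Emit a minimal ceph.conf on stdout: just the fsid and the initial mons.
# Arguments: fsid, comma-separated mon names, comma-separated mon addresses.
minimal_ceph_conf() {
    printf '[global]\n'
    printf '  fsid = %s\n' "$1"
    printf '  mon initial members = %s\n' "$2"
    printf '  mon host = %s\n' "$3"
}

# Usage:
#   minimal_ceph_conf f80aba97-26c5-4aa3-971e-09c5a3afa32f ceph-0 \
#       192.168.4.118:6789 > /etc/ceph/ceph.conf
```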

The Ceph Chef Experiment

Sometimes it’s most interesting to just dive in and see what breaks. There’s a Chef cookbook for Ceph on github which seems rather more recently developed than the one in SUSE-Cloud/barclamp-ceph, and seeing as its use is documented in the Ceph manual, I reckon that’s the one I want to be using. Of course, the README says “Tested as working: Ubuntu Precise (12.04)”, and I’m using openSUSE 12.3…

First things first, need a Chef server, so I installed openSUSE 12.3 on a VM, then installed Chef 10 on that, roughly following the manual installation instructions. Note for those following along at home – sometimes the blocks I’ve copied here are just commands, sometimes they include command output as well. You’ll figure it out :-)

# zypper ar -f http://download.opensuse.org/repositories/systemsmanagement:/chef:/10/openSUSE_12.3/systemsmanagement:chef:10.repo
# zypper in rubygem-chef-server
# chkconfig couchdb on
# rccouchdb start
# chkconfig rabbitmq-server on
# rcrabbitmq-server start
# rabbitmqctl add_vhost /chef
# rabbitmqctl add_user chef testing
# rabbitmqctl set_permissions -p /chef chef ".*" ".*" ".*"
# for service in solr expander server server-webui; do
      chkconfig chef-$service on
      rcchef-$service start
  done

I didn’t bother editing /etc/chef/server.rb, the config as shipped works fine (not that the AMQP password is very secure, mind). The only catch is the web UI didn’t start. IIRC this is due to /etc/chef/webui.pem not existing yet (chef-server creates it, but this doesn’t finish until later).

Then configured knife:

# knife configure -i
WARNING: No knife configuration file found
Where should I put the config file? [/root/.chef/knife.rb]
Please enter the chef server URL: [http://os-chef.example.com:4000]
Please enter a clientname for the new client: [root]
Please enter the existing admin clientname: [chef-webui]
Please enter the location of the existing admin client's private key: [/etc/chef/webui.pem]
Please enter the validation clientname: [chef-validator]
Please enter the location of the validation key: [/etc/chef/validation.pem]
Please enter the path to a chef repository (or leave blank):
Creating initial API user...
Created client[root]
Configuration file written to /root/.chef/knife.rb

And make a client for me:

# knife client create tserong -d -a -f /tmp/tserong.pem
Created client[tserong]

Then set up my desktop as a Chef workstation (roughly following these docs, and again pulling Chef from systemsmanagement:chef:10 on OBS):

# sudo zypper in rubygem-chef
# cd ~
# git clone git://github.com/opscode/chef-repo.git
# cd chef-repo
# mkdir -p ~/.chef
# scp root@os-chef:/etc/chef/validation.pem ~/.chef/
# scp root@os-chef:/tmp/tserong.pem ~/.chef/
# knife configure
WARNING: No knife configuration file found
Where should I put the config file? [/home/tserong/.chef/knife.rb]
Please enter the chef server URL: [http://desktop.example.com:4000] http://os-chef.example.com:4000
Please enter an existing username or clientname for the API: [tserong]
Please enter the validation clientname: [chef-validator]
Please enter the location of the validation key: [/etc/chef/validation.pem] /home/tserong/.chef/validation.pem
Please enter the path to a chef repository (or leave blank): /home/tserong/chef-repo
[...]
Configuration file written to /home/tserong/.chef/knife.rb

Make sure it works:

# knife client list
chef-validator
chef-webui
root
tserong

Grab the cookbooks and upload them to the Chef server. The Ceph cookbook claims to depend on apache and apt, although presumably the former is only necessary for RADOSGW, and the latter for Debian-based systems. Anyway:

# cd ~/chef-repo
# git submodule add git@github.com:opscode-cookbooks/apache2.git cookbooks/apache2
# git submodule add git@github.com:opscode-cookbooks/apt.git cookbooks/apt
# git submodule add git@github.com:ceph/ceph-cookbooks.git cookbooks/ceph
# knife cookbook upload apache2
# knife cookbook upload apt
# knife cookbook upload ceph

Boot up a couple more VMs to be Ceph nodes, using the appliance image from last time. These need chef-client installed, and need to be registered with the chef server. knife bootstrap will install chef-client and dependencies for you, but after looking at the source, if /usr/bin/chef doesn’t exist, it actually uses wget or curl to pull http://opscode.com/chef/install.sh and runs that. How this is considered a good idea is completely baffling to me, so again I installed our chef build from OBS on each of my Ceph nodes (note to self: should add this to appliance image on Studio):

# zypper ar -f http://download.opensuse.org/repositories/systemsmanagement:/chef:/10/openSUSE_12.3/systemsmanagement:chef:10.repo
# zypper in rubygem-chef

And ran the now-arguably-safe knife bootstrap from my desktop:

# knife bootstrap ceph-0.example.com
Bootstrapping Chef on ceph-0.example.com
[...]
# knife bootstrap ceph-1.example.com
Bootstrapping Chef on ceph-1.example.com
[...]

Then I roughly followed the Ceph Deploying with Chef document.

Generate a UUID and monitor secret (had to do the latter on one of my Ceph VMs, as ceph-authtool is conveniently already installed):

# uuidgen -r
f80aba97-26c5-4aa3-971e-09c5a3afa32f
# ceph-authtool /dev/stdout --name=mon. --gen-key
[mon.]
key = AQC8umZRaDlKKBAAqD8li3u2JObepmzFzDPM3g==
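To save a copy and paste, the secret can be pulled straight out of that output. A minimal sketch, assuming the "key = <value>" line format shown above; the function name extract_key is made up:

```shell
# Read ceph-authtool keyring text on stdin and print only the secret.
# Assumes a line of the form "key = <value>", with or without indentation.
extract_key() {
    awk '$1 == "key" && $2 == "=" { print $3; exit }'
}

# Usage:
#   ceph-authtool /dev/stdout --name=mon. --gen-key | extract_key
```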

Then on my desktop:

knife environment create Ceph

This I filled in with:

{
  "name": "Ceph",
  "description": "",
  "cookbook_versions": {
  },
  "json_class": "Chef::Environment",
  "chef_type": "environment",
  "default_attributes": {
    "ceph": {
      "monitor-secret": "AQC8umZRaDlKKBAAqD8li3u2JObepmzFzDPM3g==",
      "config": {
        "fsid": "f80aba97-26c5-4aa3-971e-09c5a3afa32f",
        "mon_initial_members": "ceph-0,ceph-1",
        "global": {
        },
        "osd": {
          "osd journal size": "1000",
          "filestore xattr use omap": "true"
        }
      }
    }
  },
  "override_attributes": {
  }
}

Uploaded roles:

# knife role from file cookbooks/ceph/roles/ceph-mds.rb
# knife role from file cookbooks/ceph/roles/ceph-mon.rb
# knife role from file cookbooks/ceph/roles/ceph-osd.rb
# knife role from file cookbooks/ceph/roles/ceph-radosgw.rb

Assigned roles to nodes:

# knife node run_list add ceph-0.example.com 'role[ceph-mon],role[ceph-osd],role[ceph-mds]'
# knife node run_list add ceph-1.example.com 'role[ceph-mon],role[ceph-osd],role[ceph-mds]'

I didn’t bother with recipe[ceph::repo] as I don’t care about installation right now (Ceph is already installed in my VM images).

Had to set "chef_environment": "Ceph" for each node by running:

# knife node edit ceph-0.example.com
# knife node edit ceph-1.example.com

Didn’t set Ceph osd_devices per node – I’m just playing, so can sit on top of the root partition.

Now let’s see if it works:

# knife ssh name:ceph-0.example.com -x root chef-client
[2013-04-11T13:44:47+00:00] INFO: *** Chef 10.24.0 ***
[2013-04-11T13:44:48+00:00] INFO: Run List is [role[ceph-mon], role[ceph-osd], role[ceph-mds]]
[2013-04-11T13:44:48+00:00] INFO: Run List expands to [ceph::mon, ceph::osd, ceph::mds]
[2013-04-11T13:44:48+00:00] INFO: HTTP Request Returned 404 Not Found: No routes match the request: /reports/nodes/ceph-0.example.com/runs
[2013-04-11T13:44:48+00:00] INFO: Starting Chef Run for ceph-0.example.com
[2013-04-11T13:44:48+00:00] INFO: Running start handlers
[2013-04-11T13:44:48+00:00] INFO: Start handlers complete.
[2013-04-11T13:44:48+00:00] INFO: Loading cookbooks [apache2, apt, ceph]
No ceph-mon found.

[2013-04-11T13:44:48+00:00] INFO: Processing template[/etc/ceph/ceph.conf] action create (ceph::conf line 6)
[2013-04-11T13:44:48+00:00] INFO: template[/etc/ceph/ceph.conf] backed up to /var/chef/backup/etc/ceph/ceph.conf.chef-20130411134448
[2013-04-11T13:44:48+00:00] INFO: template[/etc/ceph/ceph.conf] updated content
[2013-04-11T13:44:48+00:00] INFO: template[/etc/ceph/ceph.conf] owner changed to 0
[2013-04-11T13:44:48+00:00] INFO: template[/etc/ceph/ceph.conf] group changed to 0
[2013-04-11T13:44:48+00:00] INFO: template[/etc/ceph/ceph.conf] mode changed to 644
[2013-04-11T13:44:48+00:00] INFO: Processing service[ceph_mon] action nothing (ceph::mon line 23)
[2013-04-11T13:44:48+00:00] INFO: Processing execute[ceph-mon mkfs] action run (ceph::mon line 40)
creating /var/lib/ceph/tmp/ceph-ceph-0.mon.keyring
added entity mon. auth auth(auid = 18446744073709551615 key=AQC8umZRaDlKKBAAqD8li3u2JObepmzFzDPM3g== with 0 caps)
ceph-mon: mon.noname-a 192.168.4.118:6789/0 is local, renaming to mon.ceph-0
ceph-mon: set fsid to f80aba97-26c5-4aa3-971e-09c5a3afa32f
ceph-mon: created monfs at /var/lib/ceph/mon/ceph-ceph-0 for mon.ceph-0
[2013-04-11T13:44:49+00:00] INFO: execute[ceph-mon mkfs] ran successfully
[2013-04-11T13:44:49+00:00] INFO: execute[ceph-mon mkfs] sending start action to service[ceph_mon] (immediate)
[2013-04-11T13:44:49+00:00] INFO: Processing service[ceph_mon] action start (ceph::mon line 23)
[2013-04-11T13:44:49+00:00] INFO: service[ceph_mon] started
[2013-04-11T13:44:49+00:00] INFO: Processing ruby_block[tell ceph-mon about its peers] action create (ceph::mon line 64)
connect to
/var/run/ceph/ceph-mon.ceph-0.asok
failed with
(2) No such file or directory

connect to
/var/run/ceph/ceph-mon.ceph-0.asok
failed with
(2) No such file or directory

[2013-04-11T13:44:49+00:00] INFO: ruby_block[tell ceph-mon about its peers] called
[2013-04-11T13:44:49+00:00] INFO: Processing ruby_block[get osd-bootstrap keyring] action create (ceph::mon line 79)
2013-04-11 13:44:49.928800 7f58e9677700 0
-- :/23863 >> 192.168.4.117:6789/0 pipe(0x18f0d30 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault

2013-04-11 13:44:52.928739 7f58efc1c700 0 -- :/23863 >> 192.168.4.118:6789/0 pipe(0x7f58e0000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-04-11 13:44:55.929375 7f58e9677700 0 -- :/23863 >> 192.168.4.117:6789/0 pipe(0x7f58e0003010 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-04-11 13:44:58.929211 7f58efc1c700 0 -- :/23863 >> 192.168.4.118:6789/0 pipe(0x7f58e00039f0 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-04-11 13:45:01.929787 7f58e9677700 0 -- :/23863 >> 192.168.4.117:6789/0 pipe(0x7f58e00023b0 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
[...]

And it’s stuck there, trying and failing to talk to something.

See those “no such file or directory” errors after “service[ceph_mon] started”? Yeah? Well, the mon isn’t started, hence the missing sockets in /var/run/ceph.

Why isn’t the mon started? Turns out the ceph init script won’t start any mon (or osd or mds for that matter) unless the config file has a named section for it, i.e. one with some suffix such as [mon.a]. And all I’ve got is:

[global]
  fsid =  f80aba97-26c5-4aa3-971e-09c5a3afa32f
  mon initial members = ceph-0,ceph-1
  mon host = 192.168.4.118:6789, 192.168.4.117:6789

[osd]
    osd journal size = 1000
    filestore xattr use omap = true
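A quick way to check whether a given config file would give the old init script anything to start is to look for such named sections. This is a sketch only; the function name is hypothetical and the pattern just encodes the [mon.a] style of section described above.

```shell
# Succeed if the given ceph.conf has at least one named daemon section
# such as [mon.a], [osd.0] or [mds.a]; fail if it only has generic ones.
has_named_daemon_sections() {
    grep -Eq '^[[:space:]]*\[(mon|osd|mds)\.[^]]+\]' "$1"
}

# Usage:
#   has_named_daemon_sections /etc/ceph/ceph.conf || echo "nothing to start"
```

Run against the file above it fails, since [global] and [osd] carry no suffix.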

But the mon recipe triggers ceph-mon-all-starter if using upstart (which it would be on the “Tested as working” Ubuntu Precise), and ceph-mon-all-starter seems to just ultimately run something like ceph-mon --cluster=ceph -i ceph-0 regardless of what’s in the config file… Maybe I can cheat.

Directly starting ceph-mon from a shell on ceph-0 before the chef-client run turned out to be a bad idea (bit of a chicken and egg problem figuring out what to inject into the “mon host” line of the config file). So I put a bit of evil into the mon recipe:

diff --git a/recipes/mon.rb b/recipes/mon.rb
index 5cd76de..a518830 100644
--- a/recipes/mon.rb
+++ b/recipes/mon.rb
@@ -61,6 +61,10 @@ EOH
   notifies :start, "service[ceph_mon]", :immediately
 end
 
+execute 'hack to force mon start' do
+  command "ceph-mon --cluster=ceph -i #{node['hostname']}"
+end
+
 ruby_block "tell ceph-mon about its peers" do
   block do
     mon_addresses = get_mon_addresses()

Try again:

# knife ssh name:ceph-0.example.com -x root chef-client
[2013-04-11T15:10:43+00:00] INFO: *** Chef 10.24.0 ***
[2013-04-11T15:10:44+00:00] INFO: Run List is [role[ceph-mon], role[ceph-osd], role[ceph-mds]]
[2013-04-11T15:10:44+00:00] INFO: Run List expands to [ceph::mon, ceph::osd, ceph::mds]
[2013-04-11T15:10:44+00:00] INFO: HTTP Request Returned 404 Not Found: No routes match the request: /reports/nodes/ceph-0.example.com/runs
[2013-04-11T15:10:44+00:00] INFO: Starting Chef Run for ceph-0.example.com
[2013-04-11T15:10:44+00:00] INFO: Running start handlers
[2013-04-11T15:10:44+00:00] INFO: Start handlers complete.
[2013-04-11T15:10:44+00:00] INFO: Loading cookbooks [apache2, apt, ceph]
[2013-04-11T15:10:44+00:00] INFO: Storing updated cookbooks/ceph/recipes/mon.rb in the cache.
No ceph-mon found.

[2013-04-11T15:10:44+00:00] INFO: Processing template[/etc/ceph/ceph.conf] action create (ceph::conf line 6)
[2013-04-11T15:10:44+00:00] INFO: Processing service[ceph_mon] action nothing (ceph::mon line 23)
[2013-04-11T15:10:44+00:00] INFO: Processing execute[ceph-mon mkfs] action run (ceph::mon line 40)
[2013-04-11T15:10:44+00:00] INFO: Processing execute[hack to force mon start] action run (ceph::mon line 65)
starting mon.ceph-0 rank 1 at 192.168.4.118:6789/0 mon_data /var/lib/ceph/mon/ceph-ceph-0 fsid f80aba97-26c5-4aa3-971e-09c5a3afa32f
[2013-04-11T15:10:44+00:00] INFO: execute[hack to force mon start] ran successfully
[2013-04-11T15:10:44+00:00] INFO: Processing ruby_block[tell ceph-mon about its peers] action create (ceph::mon line 69)
adding peer 192.168.4.118:6789/0 to list: 192.168.4.117:6789/0,192.168.4.118:6789/0

adding peer 192.168.4.117:6789/0 to list: 192.168.4.117:6789/0,192.168.4.118:6789/0

[2013-04-11T15:10:44+00:00] INFO: ruby_block[tell ceph-mon about its peers] called
[2013-04-11T15:10:44+00:00] INFO: Processing ruby_block[get osd-bootstrap keyring] action create (ceph::mon line 84)
2013-04-11 15:10:44.432266 7f8f9f8c0700  0 
-- :/25965 >> 192.168.4.117:6789/0 pipe(0x16d9d30 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault

2013-04-11 15:10:50.433053 7f8f9f7bf700  0 -- 192.168.4.118:0/25965 >> 192.168.4.117:6789/0 pipe(0x7f8f94001d30 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-04-11 15:10:56.433268 7f8fa5e65700  0 -- 192.168.4.118:0/25965 >> 192.168.4.117:6789/0 pipe(0x7f8f94001d30 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-04-11 15:11:02.433987 7f8f9f8c0700  0 -- 192.168.4.118:0/25965 >> 192.168.4.117:6789/0 pipe(0x7f8f94002db0 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault
2013-04-11 15:11:08.434358 7f8f9f7bf700  0 -- 192.168.4.118:0/25965 >> 192.168.4.117:6789/0 pipe(0x7f8f94004fb0 sd=3 :0 s=1 pgs=0 cs=0 l=1).fault

At this point it’s stalled, presumably waiting to talk to the other mon, so in another terminal window I had to kick off a chef-client run on ceph-1 to get it into the same state as ceph-0 (knife ssh name:ceph-1.example.com -x root chef-client). This allowed both nodes to progress to the next problem:

2013-04-11 15:11:28.563438 7f8fa5e67780 -1 monclient(hunting): authenticate NOTE: no keyring found; disabled cephx authentication
2013-04-11 15:11:28.563443 7f8fa5e67780 -1 unable to authenticate as client.admin
2013-04-11 15:11:28.563814 7f8fa5e67780 -1 ceph_tool_common_init failed.
2013-04-11 15:11:29.572208 7f2369130780 -1 monclient(hunting): authenticate NOTE: no keyring found; disabled cephx authentication
2013-04-11 15:11:29.572210 7f2369130780 -1 unable to authenticate as client.admin
2013-04-11 15:11:29.572527 7f2369130780 -1 ceph_tool_common_init failed.
2013-04-11 15:11:31.380073 7f1907d18780 -1 monclient(hunting): authenticate NOTE: no keyring found; disabled cephx authentication
2013-04-11 15:11:31.380078 7f1907d18780 -1 unable to authenticate as client.admin
2013-04-11 15:11:31.380720 7f1907d18780 -1 ceph_tool_common_init failed.
2013-04-11 15:11:32.392345 7fc2bc462780 -1 monclient(hunting): authenticate NOTE: no keyring found; disabled cephx authentication
[...]

And we’re spinning again.

But that’s enough for one day.

Hackweek 9: Ceph Appliance Odyssey

This week is SUSE Hack Week 9. I wanted to spend some time working on a Ceph appliance image to make it easy to play with Ceph on openSUSE and/or SLES.

I tried making a SLES 11 SP2 appliance with SUSE Studio. I had to add the filesystems and devel:libraries:c_c++ repos from OBS to get reasonably up-to-date Ceph 0.56 and libboost_thread.so.1.49.0, but on boot when the appliance tried to expand its root filesystem, it died claiming it couldn’t load libe2p.so.2. Studio claims to be pulling in e2fsprogs from both the SP2 Updates and filesystems repo, so maybe that’s the problem. It seems impossible to choose one or the other, as they are the same version. (Update: it was just pointed out to me that you can click the little box next to the version number to choose which one is installed – must try again.)

So I left that alone and tried an openSUSE 12.3 appliance. The filesystems/ceph build for 12.3 is disabled, so I branched it and kicked off a build which failed with an exciting OOM error:

[ 3831s] [ 3803.167109] Out of memory: Kill process 16364 (cc1plus) score 254 or sacrifice child
[ 3831s] [ 3803.167959] Killed process 16364 (cc1plus) total-vm:825128kB, anon-rss:168760kB, file-rss:4kB
[ 3831s] g++: internal compiler error: Killed (program cc1plus)
[ 3831s] Please submit a full bug report,
[ 3831s] with preprocessed source if appropriate.
[ 3831s] See  for instructions.

Guess I should do what it says and file a bug. But I really did want something to play with immediately, so I added http://ceph.com/rpm/opensuse12/x86_64/ as a repo, and pulled in the upstream Ceph 0.56 RPMs. This seems to have worked and given me an openSUSE 12.3 image I can use to run through the Ceph 5-Minute Quick Start, Block Device Quick Start and CephFS Quick Start. So, here’s my extremely terse openSUSEified version of those quick start documents:

5-Minute Quick Start

Deploy the Appliance Image

I’m doing this with a couple of VMs, so in my case I make a couple of copies of the image:

# cp ~/openSUSE_12.3_Ceph_0.56.x86_64-0.0.3.qcow2 \
    /var/lib/libvirt/images/ceph-quickstart-server.qcow2
# cp ~/openSUSE_12.3_Ceph_0.56.x86_64-0.0.3.qcow2 \
    /var/lib/libvirt/images/ceph-quickstart-client.qcow2

Then I use virt-manager to create two VMs, backed by those images. Boot ‘em up, log in (root password is “linux”), run yast network and set sensible hostnames (“ceph-client” and “ceph-server” instead of “linux-kjqd”, although admittedly those names wouldn’t be very sensible in a real deployment with more than one node).

Edit the Configuration File

The appliance image includes the /etc/ceph/ceph.conf file from the original 5-minute quick start, so log in to ceph-server, edit that file and replace {hostname} and {ip-address} with their real values, then copy the configuration file to ceph-client:

# scp /etc/ceph/ceph.conf ceph-client:/etc/ceph/
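
If you'd rather script the placeholder substitution than edit by hand, here's a minimal sketch. The function name fill_ceph_conf is made up for illustration, and it assumes the hostname and IP address contain no "/" or "&" characters (which would confuse sed):

```shell
#!/bin/sh
# Fill in the {hostname} and {ip-address} placeholders in a ceph.conf file.
# fill_ceph_conf is a hypothetical helper, not something shipped with Ceph.
fill_ceph_conf() {
    conf=$1
    host=$2
    ip=$3
    # -i.bak edits the file in place and keeps the original as "$conf.bak".
    sed -i.bak \
        -e "s/{hostname}/$host/g" \
        -e "s/{ip-address}/$ip/g" \
        "$conf"
}
```

Something like `fill_ceph_conf /etc/ceph/ceph.conf ceph-server 192.0.2.10` would then do the edit (that address is only a placeholder, use whatever your VM actually got).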

Deploy the Configuration

On ceph-server, create directories for each daemon:

# mkdir -p /var/lib/ceph/osd/ceph-0
# mkdir -p /var/lib/ceph/osd/ceph-1
# mkdir -p /var/lib/ceph/mon/ceph-a
# mkdir -p /var/lib/ceph/mds/ceph-a

Still on ceph-server, run the following:

# cd /etc/ceph
# mkcephfs -a -c /etc/ceph/ceph.conf -k ceph.keyring

Start Ceph

On ceph-server:

# chkconfig ceph on
# rcceph start
# ceph health

This will initially show something like:

HEALTH_ERR 576 pgs stuck inactive; 576 pgs stuck unclean; no osds

Eventually it will say HEALTH_OK and you’re good to go.
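
Rather than re-running ceph health by hand, you could poll it. This is only a sketch: wait_for_health_ok is a name I've made up, and the 120 second timeout and 5 second interval are arbitrary defaults, not anything Ceph recommends.

```shell
#!/bin/sh
# Poll `ceph health` until it reports HEALTH_OK or a timeout expires.
# wait_for_health_ok is a hypothetical helper; the defaults are arbitrary.
wait_for_health_ok() {
    timeout=${1:-120}   # total seconds to wait
    interval=${2:-5}    # seconds between polls (must be greater than 0)
    elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        status=$(ceph health 2>/dev/null)
        case $status in
            HEALTH_OK*) echo "$status"; return 0 ;;
        esac
        # Progress goes to stderr so stdout only ever carries the final status.
        echo "waiting: ${status:-no response}" >&2
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    echo "gave up after ${timeout}s" >&2
    return 1
}
```

Run `wait_for_health_ok` on ceph-server after starting the daemons; it returns non-zero if the cluster never settles.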

Copy the Keyring to the Client

This is necessary for authentication:

# scp /etc/ceph/ceph.keyring ceph-client:/etc/ceph/

Block Device Quick Start

On ceph-client:

# rbd create foo --size 4096
# modprobe rbd
# rbd map foo --pool rbd --name client.admin
# mkfs.ext4 -m0 /dev/rbd1
# mkdir /mnt/myrbd
# mount /dev/rbd1 /mnt/myrbd

(Why is this /dev/rbd1, not /dev/rbd/rbd/foo as in the original quick start?)
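
A likely explanation, which I haven't verified on this image: the /dev/rbd/<pool>/<image> paths are symlinks created by a udev rule that ships with Ceph, so if that rule isn't installed you only get the plain /dev/rbdN nodes. Either way, rbd showmapped will tell you which device an image landed on. Here's a rough sketch for pulling that out in a script; rbd_device_for is a made-up helper, and it assumes the output has a header row naming pool, image and device columns:

```shell
#!/bin/sh
# Print the device node for a given pool/image, reading `rbd showmapped`
# output on stdin. rbd_device_for is a hypothetical helper. Column positions
# are looked up from the header line rather than hard-coded, since the
# layout may differ between Ceph versions.
rbd_device_for() {
    awk -v pool="$1" -v image="$2" '
        NR == 1 {
            # Map each header name (pool, image, device, ...) to its column.
            for (i = 1; i <= NF; i++) col[$i] = i
            next
        }
        $(col["pool"]) == pool && $(col["image"]) == image {
            print $(col["device"])
        }
    '
}
```

Usage would be `rbd showmapped | rbd_device_for rbd foo`, which saves guessing whether foo ended up on /dev/rbd0 or /dev/rbd1.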

CephFS Quick Start

On ceph-client (kernel driver, not FUSE):

# mkdir /mnt/mycephfs
# mount -t ceph -o name=admin,secret=$(ceph-authtool \
    --name client.admin /etc/ceph/ceph.keyring --print-key) \
    ceph-server:/ /mnt/mycephfs

Interestingly, this gives “mount: error writing /etc/mtab: Invalid argument”, but still seems to actually mount the filesystem.

Also note that it appears I have 32GB of space for Ceph to use, even though ceph-server only has a 16GB root partition. I rather think that’s because there are two OSDs, both running off the root filesystem rather than separate disks/filesystems, so the same 16GB gets counted twice. I assume this is one of those Don’t Try This At Home things.
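
The arithmetic is just 2 OSDs × 16GB = 32GB, assuming each OSD reports the size of whatever filesystem its data directory lives on. Here's a small sketch that shows the double counting with plain df; naive_kb and unique_kb are names I've invented for illustration:

```shell
#!/bin/sh
# Compare the naive per-directory capacity total with the total over
# distinct backing filesystems. Both functions are hypothetical helpers.

# Sum the size (in 1K blocks) that df reports for every directory given.
naive_kb() {
    total=0
    for dir in "$@"; do
        size=$(df -P "$dir" | awk 'NR == 2 { print $2 }')
        total=$((total + size))
    done
    echo "$total"
}

# Sum the sizes again, counting each filesystem (device + mount point) once.
unique_kb() {
    df -P "$@" | awk '
        NR > 1 && !seen[$1 " " $NF]++ { total += $2 }
        END { printf "%.0f\n", total }
    '
}
```

On ceph-server, `naive_kb /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-1` should come out at twice what `unique_kb` gives for the same two directories, since both sit on the root filesystem.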

 

openSUSE 12.3 / Lenovo T430

My new Lenovo T430 arrived last week. After delighting in that satisfying new laptop smell, I made recovery DVDs I will presumably never need, then blew away Windows 7 and installed openSUSE 12.3 (full disclosure: I work for SUSE, so my choice of distro may not be entirely unbiased).

Some niceties:

  • The textured touchpad is lovely. Much better feel than a pure flat surface.
  • As I’d expect, the keyboard is excellent (even if PGUP/PGDN aren’t where I’m used to).
  • The openSUSE installer is quick and easy. I’m pretty sure there are fewer steps than last time I did a regular openSUSE install from scratch a couple of years ago.
  • No problem setting up encrypted LVM, although on my ~500GB drive it defaults to a 20GB root and 25GB /home, with a whole lotta free space left over in the encrypted partition, so that might want some tweaking.
  • Entering the passphrase on boot happens on a pretty graphical screen; you don’t get thrown back to a terminal window where random junk is appearing over the passphrase entry prompt.
  • Moving my mail over from my old laptop was pretty much just an rsync of the Thunderbird profile directory (and maybe a tweak to ~/.thunderbird/profiles.ini).

Some oddities:

  • The Novell GroupWise 8.0.2 client had a couple of problems:
    • It claims to need libXm.so.3 (listed in RPM Requires), but works fine without it. This is fortunate, because openSUSE 12.3 doesn’t ship openmotif22-libs-32bit anymore.
    • Unless you’ve installed libpangox-1_0-0-32bit, the GroupWise client will segfault somewhere in libwebrenderer.so. This is less than obvious.
  • The YaST disk partitioner seems slightly confused adding new LVs inside my encrypted VG later on (it either locked up or crashed). I haven’t had time to investigate this properly, so I’ve ignored it for the moment and used lvcreate and mkfs in a terminal instead.
  • You do need to reboot at least once after initial install for NetworkManager to work properly (this is mentioned in the release notes).
  • I’m running GNOME 3.6, and I tried using the tweak tool to have it just blank the screen – not suspend – when closing the laptop lid. Turns out systemd is being too clever for me, so I had to fiddle with that a bit (set HandleLidSwitch=ignore in /etc/systemd/logind.conf, then run sudo systemctl restart systemd-logind).
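
For the lid switch fix, here's a sketch of making that logind.conf change from a script rather than an editor. set_lid_ignore is a made-up helper, and it assumes [Login] is the only section in the file, so that appending a line at the end puts it in the right place:

```shell
#!/bin/sh
# Set HandleLidSwitch=ignore in a logind.conf-style file.
# set_lid_ignore is a hypothetical helper; pass the file path as $1.
set_lid_ignore() {
    conf=$1
    if grep -q '^#\{0,1\}HandleLidSwitch=' "$conf"; then
        # Replace an existing (possibly commented-out) setting in place,
        # keeping the original file as "$conf.bak".
        sed -i.bak 's/^#\{0,1\}HandleLidSwitch=.*/HandleLidSwitch=ignore/' "$conf"
    else
        # No such key yet: append it. Assumes [Login] is the last section.
        echo 'HandleLidSwitch=ignore' >> "$conf"
    fi
}
```

As root, `set_lid_ignore /etc/systemd/logind.conf` followed by `systemctl restart systemd-logind` should have the same effect as the manual edit.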

Very little else to report so far. Aside from the oddities above, everything else seems to Just Work™. OTOH, all I’ve really done is web browsing, email and assorted fiddling around in terminals. Maybe listened to a bit of music (the inbuilt speakers are well and truly loud enough, but a bit tinnier than real speakers – can’t say I’m terribly surprised by that though).

Cloud Infrastructure, Distributed Storage and High Availability at LCA 2013

I’m pleased to announce that we will be holding a one day Cloud Infrastructure, Distributed Storage and High Availability mini conference on Monday 28 January 2013 as part of linux.conf.au 2013 in Canberra, Australia.

This miniconf is about building reliable infrastructure, from two-node HA failover pairs to multi-thousand-core cloud systems. You might like to think of it as a sequel to the LCA 2012 High Availability and Distributed Storage miniconf (videos here).

Do any of the following describe you?

  • You’re building cloud infrastructure for others to use (openstack, cloudstack, eucalyptus, …)
  • Your data needs to be reliably available everywhere (ceph, glusterfs, drbd, …)
  • Your system absolutely must be up all the time (pacemaker, corosync, …)

If so, this is the miniconf for you! Please consider submitting a presentation at http://tinyurl.com/cidsha-lca2013

We’re expecting most talk slots to be 25 minutes (including questions and changeover), but there will be openings for shorter lightning talks and maybe a couple of longer talks. CFP closes on Sunday November 4, 2012. Notifications of acceptance will be emailed out after this date.

Note that there is also an OpenStack-specific miniconf running on Tuesday 29 January. We’re hoping this will give us a pretty awesome two-day LCA 2013 CloudFest. As a rough rule of thumb, more generic or infrastructure-related talks should go to Cloud, Distributed Storage & HA, while deeper OpenStack-specific talks should probably go to the OpenStack miniconf. If in doubt, or if you have any other questions, please contact me directly.

Thanks!