There’s a famous saying about hammers and problems. I think it goes something like: if all you have is a hammer, everything looks like a nail. That happens a lot in the devops/sysadmin trade, and it goes completely against the UNIX philosophy of small tools working together to achieve big things. Sometimes we essentially try to fit the problem to the tool, just because we have such a variety of do-everything tools.
SSH is awesome. It’s perhaps in the top 10 most important tools for the sysadmin job, and it still surprises me sometimes with cool features. Doing some reading about cluster administration using SSH, I stumbled upon this:

Host *
    ControlMaster auto
    ControlPath /tmp/ssh_mux_%h_%p_%r
    ControlPersist 4h

What witchery is that, you ask? Has science gone too far? Well, no, that’s actually pretty old and I should probably be ashamed of only discovering it so recently.
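For the curious: ControlMaster auto reuses an existing master connection when its socket is already there and creates one otherwise, ControlPath says where that socket lives, and ControlPersist 4h keeps the master open in the background for four hours after the first client exits. In ControlPath, %h, %p and %r stand for the remote host, the remote port and the remote user. Here is a minimal Python sketch of what socket path that template ends up as. The function name and the sample host and user are made up for illustration, and it only handles these three tokens, not everything ssh itself understands.

```python
def expand_control_path(template, host, port, user):
    """Expand the %h, %p and %r tokens of a ControlPath template.

    Illustrative helper only: ssh supports more tokens than these three.
    """
    replacements = {"%h": host, "%p": str(port), "%r": user}
    for token, value in replacements.items():
        template = template.replace(token, value)
    return template


# Hypothetical host and user, just to show the resulting socket path.
print(expand_control_path("/tmp/ssh_mux_%h_%p_%r", "web01.example.com", 22, "deploy"))
# prints /tmp/ssh_mux_web01.example.com_22_deploy
```

Every later ssh, scp or sftp to the same host, port and user finds that socket and rides on the connection that is already open, which is why it feels so much faster.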
So this week, after a version upgrade of GraphicsMagick, we got some segfaults on our servers. Nothing terrible: twelve segfaults or close to that in a 24-hour period. The only information was this kernel log line:
Feb 22 13:28:27 serverXX kernel: [1953364.275653] gm: segfault at 0 ip 00007fd137bd41e0 sp 00007fff5770dcd0 error 6 in libGraphicsMagick.so.3.7.0[7fd1379b9000+29d000]
No core dumps, since ulimit -c is zeroed. What can we do to at least get an idea of what is happening?
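One thing the log line does give away, even without a core dump, is where inside the library the crash happened: the instruction pointer (ip) minus the library’s load address (the number before the + in the brackets) is the offset into libGraphicsMagick. The sketch below does that subtraction in Python. It is only one possible first step, not necessarily the route taken here, and the regular expression and function name are my own assumptions based on this single log line.

```python
import re

# Pattern modelled on the single kernel log line quoted above (an assumption,
# not an exhaustive description of every kernel's segfault message).
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<error>[0-9a-f]+) in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

LINE = (
    "Feb 22 13:28:27 serverXX kernel: [1953364.275653] gm: segfault at 0 "
    "ip 00007fd137bd41e0 sp 00007fff5770dcd0 error 6 in "
    "libGraphicsMagick.so.3.7.0[7fd1379b9000+29d000]"
)


def library_offset(line):
    """Return (library name, offset of the faulting instruction in it)."""
    match = SEGFAULT_RE.search(line)
    if match is None:
        raise ValueError("not a segfault line")
    ip = int(match.group("ip"), 16)
    base = int(match.group("base"), 16)
    size = int(match.group("size"), 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("instruction pointer is outside the mapped library")
    return match.group("lib"), offset


lib, offset = library_offset(LINE)
print(lib, hex(offset))
# prints libGraphicsMagick.so.3.7.0 0x21b1e0
```

With a copy of the library that still has debug symbols, that offset can be handed to addr2line -e to get a function name and source line. The rest of the line is readable too: "segfault at 0" is a NULL pointer access, and error 6 means a write from user mode to a page that is not mapped.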
Getting back to work as a full-time sysadmin was great. I got back up to speed on scalability, updated my toolbox and learnt about other fantastic tools, like Graphite. Graphite is a graphing tool, extremely configurable and scalable. One thing bothered me, though: the lack of good tools to send server metrics to it. I tried the collectd Graphite plugins and none did what I wanted the way I wanted.
So I decided to flex my node.js dev muscles, and here is HoardD. It is a node.js app written in CoffeeScript that basically runs scripts and tools to get information about a server and sends it to carbon (Graphite’s storage backend). It’s easily extensible to include more metrics, and very fast and small (11MB or so, depending on the scripts loaded; most of it is node).
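Sending a metric to carbon is simpler than it sounds. Its plaintext listener (port 2003 by default) takes one line per data point: metric path, value and Unix timestamp, separated by spaces and ended with a newline. The sketch below shows that in Python rather than in HoardD’s CoffeeScript, so it is not HoardD’s code. The function names, the metric path and the host are placeholders I made up.

```python
import socket
import time


def format_metric(path, value, timestamp=None):
    """Build one line of carbon's plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = time.time()
    return f"{path} {value} {int(timestamp)}\n"


def send_metric(line, host="localhost", port=2003):
    """Send one already formatted line to a carbon plaintext listener.

    host and port are assumptions; 2003 is carbon's default plaintext port.
    """
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric name and a fixed timestamp, to show the wire format.
print(repr(format_metric("servers.web01.load.shortterm", 0.42, 1330000000)))
# prints 'servers.web01.load.shortterm 0.42 1330000000\n'
```

Calling send_metric(format_metric("servers.web01.load.shortterm", 0.42)) against a running carbon instance would record one data point stamped with the current time.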
So, 20 years ago Linus was sending his now famous Usenet message about a new hobby. It’s a big date: in 20 years it went from a small project to what we have now, and to pretty much my only way to make money and live :)
This post is one of those quick notes to myself, so I can stop searching for this info every time I need it. This is the sequence of commands to create a self-signed SSL certificate. If you need to know what you are doing, please go read this post. I suggest you paste them one by one, and by you I mean the future me, who will try to paste them all at once and do something stupid.
So after a time with a team dealing with chef and knife, you end up having a pretty vast chef repository. Git helps, but it can get a little messy. Use the little loops below to search for roles with empty run lists or roles with no nodes. They work on chef 0.9; chef 0.10 apparently has some fancy search plugin to do the same (and more), but I haven’t tested it yet.