Ops

Making tools that aren't a hammer

There’s a famous saying about hammers and problems. I think it goes: if all you have is a hammer, everything looks like a nail. That happens a lot in the devops/sysadmin trade, and it goes completely against the UNIX philosophy of small tools working together to achieve big things. Sometimes we essentially try to fit the problem to the tool, just because we have such a variety of do-everything tools.

One old weird trick to speed SSH

SSH is awesome. It’s probably among the top 10 most important tools for the sysadmin job, and it still surprises me sometimes with cool features. While doing some reading about cluster administration using SSH I stumbled onto this:

    Host *
        ControlMaster auto
        ControlPath /tmp/ssh_mux_%h_%p_%r
        ControlPersist 4h

What witchery is that, you ask? Has science gone too far? Well, no, it’s actually pretty old, and I should probably be ashamed of discovering it only so recently.

Debugging segfaults from logs to gdb

So this week, after a version upgrade of GraphicsMagick, we got some segfaults on our servers. Nothing terrible: twelve segfaults or so in a 24-hour period. The only information was a line in /var/log/kernel.log:

Feb 22 13:28:27 serverXX kernel: [1953364.275653] gm[16356]: segfault at 0 ip 00007fd137bd41e0 sp 00007fff5770dcd0 error 6 in libGraphicsMagick.so.3.7.0[7fd1379b9000+29d000]

No core dumps, since ulimit -c is set to zero. What can we do to at least get an idea of what is happening?
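
One way to squeeze a bit more out of that single kernel line, sketched here as an assumption rather than as the definitive approach: the ip value is the faulting instruction pointer, and the number in brackets before the + is the address where the library was mapped. Subtracting one from the other gives the offset inside libGraphicsMagick.so.3.7.0. The fault_offset helper is a made-up name for illustration.

```shell
#!/bin/sh
# Hypothetical helper: offset of the faulting instruction inside the library.
# $1 = "ip" value from the kernel line, $2 = library base address
# (both hex, written without the 0x prefix, as they appear in the log).
fault_offset() {
    printf '0x%x\n' "$(( 0x$1 - 0x$2 ))"
}

# Values copied from the log line above.
offset=$(fault_offset 00007fd137bd41e0 7fd1379b9000)
echo "$offset"   # prints 0x21b1e0
```

That offset can then be handed to a symbol lookup tool, for example addr2line -f -e followed by the path to the library (the path depends on your install), provided the library still has symbols or a debug package is available.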

HoardD javascript support (and #monitoringsucks)

First things first: the latest revision of HoardD on GitHub already supports scripts written in pure JavaScript. It was really easy to make it work, but I kind of overlooked it in the first version. The README.md is already updated. Second: most of you probably know the #monitoringsucks movement/hashtag/discussions. I totally agree that the current monitoring tools only do part of the job and that getting them to work together is horrible. I have some ideas on how to solve the problem, but the path from idea to code is a long one.

Announcing HoardD

Getting back to work as a full-time sysadmin was great: I got back up to speed on scalability, updated my toolbox and learnt about other fantastic tools, like Graphite. Graphite is a graphing tool, extremely configurable and scalable. One thing bothered me, though: the lack of good tools to send server metrics to it. I tried the collectd Graphite plugins and none did what I wanted the way I wanted.

So I decided to flex my node.js dev muscles, and here is HoardD. It is a node.js app written in CoffeeScript that basically runs scripts and tools to get information about a server and sends it to carbon (Graphite’s storage backend). It’s easily extensible to include more metrics, and it is very fast and small (11MB or so, depending on the scripts loaded; most of it is node).
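
For the curious, carbon's plaintext listener takes one metric per line in the form "path value timestamp". The snippet below is only a rough shell sketch of that wire format, not HoardD's code; the carbon_line helper, the metric name and the host are made up for illustration, and 2003 is carbon's default plaintext port.

```shell
#!/bin/sh
# Hypothetical helper: format one metric in carbon's plaintext format,
# "<metric.path> <value> <unix timestamp>".
carbon_line() {
    printf '%s %s %s\n' "$1" "$2" "$3"
}

# The metric name is an assumption; the timestamp is the current epoch time.
carbon_line "servers.web01.load.shortterm" "0.42" "$(date +%s)"

# To actually ship it, pipe the line to carbon, for example (assumed host):
#   carbon_line servers.web01.load.shortterm 0.42 "$(date +%s)" | nc graphite.example.com 2003
```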

Me and Linux

So, 20 years ago Linus sent his now-famous Usenet message about a new hobby. It’s a big date: in 20 years it went from a small project to what we have now, which is pretty much my only way to make money and live :)

Generating a self-signed SSL Certificate, commands only

This post is one of those quick notes to myself, so I can stop searching for this info every time I need it. This is the sequence of commands to create a self-signed SSL certificate. If you need to know what you are doing, please go read this post. I suggest you paste them one by one, and by you I mean the future me, who will try to paste them all at once and do something stupid.
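
As an illustration only, since the exact commands live in the full post, a typical sequence with the openssl command line tool looks something like the sketch below. The file names, the 2048-bit key size, the 365-day validity and the example.test subject are all placeholders, not values from the post.

```shell
#!/bin/sh
set -e
# Work in a scratch directory so nothing existing gets overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# 1. Private key (2048-bit RSA; the size is an assumption).
openssl genrsa -out server.key 2048

# 2. Certificate signing request; -subj avoids the interactive prompts.
openssl req -new -key server.key -out server.csr -subj "/CN=example.test"

# 3. Sign the request with its own key, which is what makes it self-signed.
openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt

# Show the subject and validity dates of the result.
openssl x509 -in server.crt -noout -subject -dates
```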

Finding Those Empty Things on Chef With Knife

So after some time with a team working with chef and knife you end up with a pretty vast chef repository. Git helps, but it can get a little messy. Use the little loops below to search for roles with empty run lists or roles with no nodes. They work on chef 0.9; chef 0.10 apparently has some fancy search plugin to do the same (and more), but I haven’t tested it yet.