April 21, 2012
Monsieur Boulet nails it in a panel from this comic

April 5, 2012
"Nothing is easy, that’s worth a shit"

— Spermbirds

February 28, 2012
"Follow the hood ornament cause the rear view mirror is just for checking how good you look"

— David Lee Roth

February 27, 2012
Color study. Will do a few more. Then order some temp tats.

February 25, 2012
This is not my dog

February 17, 2012
"Few companies that installed computers to reduce the employment of clerks have realized their expectations… They now need more, and more expensive clerks even though they call them ‘operators’ or ‘programmers.’"

— Peter Drucker

February 10, 2012
This is not my dog

December 16, 2011

RT @strcpy (via @sdague): haha, SOPA - What would Bender do? http://t.co/dCjYoYJj

December 7, 2011
The Evolution of Cruft

So I was thinking about the last post and got on the machine that had the original (albeit broken) solution. It is so much more elegant than the working solution, even though it misses some cases that the working version catches. Here it is; bask in the simplicity:

#!/bin/awk -f

function isnum(x){return(x==x+0)}

BEGIN { OFS = ";"; ORS = "" }

{
	if (NR==1) {
		fields=NF
		for (i=1; i <= NF; i++) {
			headers[i] = $i
			}
		}

	else {
		print "{"
		for (i=1; i < fields; i++) {
			print "\""headers[i]":\" " 
			if (isnum($i)) {
				print $i
				}
			else{
				print "\"" $i "\""
					}
			print	","
		}
		print "\""headers[fields]":\" \"" 
		for (i=fields; i <=NF; i++) {
			print $i 
		}
		print "\"}\n"
	}
}

December 2, 2011
awkward: JSON-ify command output

I was messing around with getting data into BUGswarm from some standard Linux/UNIX commands like ps, iostat, vmstat. I started with a sed statement that made a JSON string out of every output line:

ps -ef |sed 's/\(.*\)/[\"\1\"]/' | produce.py

It was a decent start: it counted as valid JSON and I was able to post to my swarm. But the data wasn’t organized, just messy strings:

["UID        PID  PPID  C STIME TTY          TIME CMD"]
["root         1     0  0 Oct25 ?        00:00:02 /sbin/init"]
["root         2     0  0 Oct25 ?        00:00:00 [kthreadd]"]
["root         3     2  0 Oct25 ?        00:00:00 [migration/0]"]
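
One caveat with the sed one-liner: a line that itself contains a double quote or a backslash would no longer come out as valid JSON. Here is a rough awk sketch that escapes those two characters first. It ignores control characters, the sample line is made up, and jsonify_lines is just a hypothetical name for the demo:

```shell
# Wrap each input line in a one-element JSON array, escaping " and \ first.
jsonify_lines() {
	awk '{
		# prefix every double quote or backslash with a backslash
		gsub(/["\\]/, "\\\\&")
		print "[\"" $0 "\"]"
	}'
}

# Hypothetical ps-style line with quotes in the command
printf '%s\n' 'root 1 sh -c "echo hi"' | jsonify_lines
```

It drops in where the sed statement was, so the rest of the pipeline stays the same.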

Next I took a stab at creating a JSON object from vmstat’s output. I made an ugly, ugly single awk print statement that hard-coded every column from vmstat:

{ print "{\"r\": "$1", \"b\": "$2", \"swpd\": "$3", \"free\": "$4", \"buff\": "$5", \"cache\": "$6","\
	"\"si\": "$7", \"so\": "$8", \"bi\": "$9", \"bo\": "$10", \"in\": "$11", \"cs\": "$12", \"us\": \
	"$13", \"sy\": "$14", \"id\": "$15", \"wa\": "$16"}"} 

Run vmstat, pipe into that monstrosity and get something kinda useful:

{"r": 3, "b": 0, "swpd": 162668, "free": 461012, "buff": 3036440, "cache": 634388,"si": 0, "so": 0, "bi": 9, "bo": 6, "in": 1, "cs": 1, "us": 	4, "sy": 1, "id": 94, "wa": 0}

Cool. So I went home. All the way it stewed in my brain. I wanted to read in the column headers and use that to tag the data. As long as there was a line that had the data description, I could make the object.

stew stew stew….

Then I sat down and wrote this messy little piece of awkward goodness. The default is to look for the column headers on the first line. Some commands, like vmstat and iostat, print other bits before the data columns, so I put in the header variable to account for that. On the command line, specify header= followed by the number of the line that holds the column headers.

A few hints as you read through the code. NR is the number of records read so far. Think of it as the line number. NF is the number of fields in a record. Let’s take a walk through the code.

First up is a little function to tell us if something is a numeric value:

function isnum(x){return(x==x+0)}
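
A quick way to see what isnum does is to feed it a few fields. The sketch below is not part of the script; the sample values are made up and isnum_demo is a hypothetical wrapper. It leans on awk treating input fields that look like numbers as numeric in comparisons, so it should behave for fields read from input, which is all the script needs:

```shell
# Print each input field followed by the result of isnum (1 or 0).
isnum_demo() {
	awk '
	function isnum(x) { return (x == x + 0) }
	{
		for (i = 1; i <= NF; i++)
			print $i, isnum($i)
	}'
}

# Hypothetical sample: two numbers, a user name, a timestamp
printf '42 3.5 root 00:00:02\n' | isnum_demo
```

Anything that isn't entirely a number, like a user name or a TIME column with colons, falls through to the string branch.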

In the BEGIN clause I set ORS (the output record separator) to nothing; this way I can piece out the record in multiple print statements. I also give header a default: if it isn’t set, it becomes 1:

BEGIN {
	ORS = ""
	if (!header) {
		header=1
	}
}
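
A side note on timing, as a small sketch rather than anything the script depends on: a name=value argument placed after the program, like header=2, is only assigned once awk reaches it in the argument list, which is after BEGIN has run. So BEGIN sets header to 1, and the command-line value then overrides it before the first record is read. The end result is the same. If the value needs to be visible inside BEGIN, -v does that. The prog variable below is only for this demo:

```shell
# Tiny program that reports what header looks like in BEGIN and in the main rule
prog='BEGIN { print "in BEGIN: [" header "]" } { print "in main: [" header "]" }'

# Operand assignment: applied after BEGIN, before the first record
echo x | awk "$prog" header=2

# -v assignment: applied before BEGIN runs
echo x | awk -v header=2 "$prog"
```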

If the current record number is less than the specified header line, throw the whole line away:

NR < header {next}

The if statement below could have been a separate clause for NR==header. I’m primarily a C coder (I didn’t figure out the clause check until this morning), so I went with an if to check whether the line is the header line.

The assumption is that the number of headers is the number of fields we should expect. But some lines come in with a higher field count, such as lines that carry the complete command line of a running process (ps, I’m looking at you!). Later we use the header count to concatenate the remaining fields into the last entry:

{
	if (NR==header) {
		fields=NF
		for (i=1; i <= NF; i++) {
			headers[i] = $i
		}
	}
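
For comparison, here is roughly what the separate-clause version mentioned above might look like. This is a sketch of the alternative, not the code the script uses. The sample input, the capture_headers name and the final print rule are only there to show which lines reach which rule:

```shell
# Header capture written as pattern rules instead of an if inside one block.
capture_headers() {
	awk '
	# skip anything before the header line
	NR < header { next }
	# header line: remember the column names, then move on
	NR == header {
		fields = NF
		for (i = 1; i <= NF; i++)
			headers[i] = $i
		next
	}
	# every later line is data
	{ print headers[1] "=" $1, "fields=" fields }
	' header="$1"
}

# Hypothetical vmstat-like input with the column names on line 2
printf 'procs memory\nr b swpd\n1 0 162736\n' | capture_headers 2
```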

Now to the guts. Read in a line, print the header as the object field description, print the value (as a string if necessary).

	else {
		print "{"
		for (i=1; i < fields; i++) {
			print "\""headers[i]"\": "
			if (isnum($i)) {
				print $i
			}
			else {
				print "\"" $i "\""
			}
			print ", "
		}

		print "\""headers[fields]"\": "

The first draft of code is always so pretty. Then you find corner cases that the short, elegant solution doesn’t cover. Feh. The logic goes like this: if the number of fields in this record matches what was expected, print the last field and be done. Printing, of course, checks for numeric or string and accommodates:

		if (NF==fields) {
			if (isnum($i)) {
				print $i
			}
			else {
				print "\"" $i "\""
			}
		}

But if we have more fields than expected, concatenate them into the last field (ps, this is all about you). Since a number wouldn’t be in multiple parts, assume stringage. Finish up the record and have a nice day:

		else {
			print "\""
			for (i=fields; i <=NF; i++) {
				print $i " "
			}
			print "\""
		}

		print "}\n"

	}

Oh right. Since this is piping into another program that will send it into the cloud, flush out this line and let the magic happen:

	fflush()
}
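
To watch the overflow branch without a real ps, the fragments above can be stitched together and fed a couple of made-up lines. The sketch below is only that, a stitching of the pieces in this post into one shell function. awkward_demo is a hypothetical name, the input is invented, and the complete script may differ:

```shell
# The walkthrough fragments assembled into one awk program.
awkward_demo() {
	awk '
	function isnum(x) { return (x == x + 0) }
	BEGIN {
		ORS = ""
		if (!header) { header = 1 }
	}
	NR < header { next }
	{
		if (NR == header) {
			fields = NF
			for (i = 1; i <= NF; i++) { headers[i] = $i }
		}
		else {
			print "{"
			for (i = 1; i < fields; i++) {
				print "\"" headers[i] "\": "
				if (isnum($i)) { print $i }
				else { print "\"" $i "\"" }
				print ", "
			}
			print "\"" headers[fields] "\": "
			if (NF == fields) {
				# i equals fields here, so this is the last column
				if (isnum($i)) { print $i }
				else { print "\"" $i "\"" }
			}
			else {
				# more fields than headers: glue the rest into one string
				print "\""
				for (i = fields; i <= NF; i++) { print $i " " }
				print "\""
			}
			print "}\n"
		}
		fflush()
	}' "$@"
}

# Hypothetical ps-like input: the last line has one field more than the header
printf 'UID PID CMD\nroot 1 /sbin/init\nbob 42 python produce.py\n' | awkward_demo
```

With that input, the last line has one field more than the header, so its final two pieces get glued into the CMD string. There is a trailing space inside the quotes, since the loop appends one after every piece.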

Well, that’s it. Next up I’ll be specifying another variable on the command line to name the object. This way the application digesting the data passed through the swarm will know what it’s getting.

Oh heck, you want to see how it works? I run it like this:

vmstat 2 |awk -f ../awkward header=2

and get output like this (actually I don’t get output; I pipe it into produce.py and it goes into the specified swarm):

{"r": 1, "b": 0, "swpd": 162736, "free": 533052, "buff": 2989260, "cache": 641272, "si": 0, "so": 0, "bi": 0, "bo": 10, "in": 966, "cs": 3122, "us": 10, "sy": 3, "id": 87, "wa": 0}
{"r": 1, "b": 0, "swpd": 162736, "free": 533044, "buff": 2989260, "cache": 641280, "si": 0, "so": 0, "bi": 0, "bo": 0, "in": 1014, "cs": 3205, "us": 9, "sy": 4, "id": 87, "wa": 0}
{"r": 1, "b": 0, "swpd": 162736, "free": 533060, "buff": 2989260, "cache": 641276, "si": 0, "so": 0, "bi": 0, "bo": 0, "in": 1002, "cs": 3033, "us": 9, "sy": 3, "id": 88, "wa": 0}
{"r": 1, "b": 0, "swpd": 162736, "free": 533044, "buff": 2989264, "cache": 641280, "si": 0, "so": 0, "bi": 0, "bo": 10, "in": 909, "cs": 2924, "us": 10, "sy": 3, "id": 86, "wa": 0}

Q.E.D.

Get the complete script on GitHub.

October 25, 2011

RT @JillEBond: #GOP has introduced these bills: 44 on abortion, 99 on religion, 71 family relationships, 36 on marriage, 522 on taxation …

October 24, 2011

Grumpy pancake says “Do not eat!” http://t.co/RsI7wJut

October 20, 2011

RT @doctorow: Free Bieber: campaign to kill proposed law that would send you to prison for 5 years for singing copyrighted music http:// …

October 13, 2011

RT @BUGswarm: Introducing BUGswarm: A new way to acquire data from and control embedded devices using JavaScript or plain old HTTP. htt …

October 12, 2011

Elton John: There’s a Global War Against the Right of Gay People to Live and Love. We Need to Fight Back http://t.co/qnfQASaq