Carpe Diem

31 Jan

Bull riding

Falling off a freakin' bull

I recently got a chance to do something I’d always wanted to try: bull riding. Sure, I’m getting a little old for it. Sure, it’s been called “the most dangerous 8 seconds in sports.” That didn’t matter to me. It was worth it. Both times.

People generally react with some mixture of surprise and concern for my mental health. But I assure them I’m fine.

I had a chance to realize a childhood dream and took it. But I won’t lie… it was terrifying. While I was pulling on the chaps I was so full of adrenaline that I was fumbling words. I had to concentrate in order to finish a thought. My hands were numb. The cowboys simply called it “respect for the bull.”

Trust me, it was a completely justifiable reaction. That massive bull tossed me off with ease. I lasted maybe a couple of seconds.

Anyway, I had wanted to draw this into a programming post somehow but I was making all sorts of overwrought analogies that just didn’t work. Actually, if I were being honest with myself, I just wanted to tell everybody how much fun I had.

Skydiving picture from "The Bucket List"

So I won’t be straining to stretch this post. Instead, go watch “Dead Poets Society,” or Steve Jobs’ address at Stanford, or whatever inspires you. Then stop procrastinating and get back to work!





-

No bulls were hurt in the making of this post. This particular bull will be a PBR bull someday, which makes him something of a prize and a livelihood for his owner, who has every incentive to take care of him. And no, before you ask.

Jepp 2.4 – Released

26 Jan

Jepp (Java Embedded Python) is a small C library and jar to embed CPython in a Java application. This lets you script Java objects similar to Java scripting languages.

https://sourceforge.net/projects/jepp/files/

I’ve added a build of Python 2.6.4 with Visual Studio 2003. Recent Python builds have switched to VS 2008, but until Java updates they’ll be incompatible. This is the first time I’ve built Python on Windows, and I’m generally not too familiar with Windows to begin with, so if you have a better source then use that. I believe it works though.

Release 2.4 reverts the older import hack that was wreaking havoc with some native Python modules. To restore the old functionality, just run:

from jep import *
__builtins__.__import__ = jep.jimport

Or build with --enable-import.

Let me know if you need help.

As seen on Twitter

6 Jan

So, I’ve given in and I’m on Twitter now and I’m basically talking to myself. Laugh at me!

It seems the more I deride something, the more popular it eventually becomes. See Twitter, the XBox, iPhone, Farmville, MySpace (well, I wasn’t so off on that one), Windows, etc.

Yeah. Well, I don’t own stocks for a good reason. I kid Twitter but it’s pretty cool to follow people you find interesting.

Still, I can’t resist posting Twitter’s not-so-impressive stats from Leemba:

SLA Metrics:

Availability: 99.87599%
Mean time to recovery (hours): 0.014
Mean time between failures (hours): 11.294
Total failures (red or purple status): 35
Since service was created on: Sun 20 Dec 2009 11:11:24 AM PST
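As a quick sanity check, the availability figure is roughly what the other two numbers predict, assuming the textbook steady-state formula availability = MTBF / (MTBF + MTTR). That formula is an assumption here; Leemba may well derive its number differently, for example by measuring downtime directly.

```python
# Sanity check of the SLA figures above, assuming the textbook formula
# availability = MTBF / (MTBF + MTTR). This is an assumption, not
# necessarily how Leemba computes its own number.

def availability_percent(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as a percentage."""
    return 100.0 * mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    estimate = availability_percent(mtbf_hours=11.294, mttr_hours=0.014)
    print(f"{estimate:.3f}%")
```

The estimate lands within rounding distance of the reported 99.87599%, which is about what you would expect given that the published MTBF and MTTR are themselves rounded.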

10 years ago

1 Jan

World will end in year 2000

It was not better back then.

10 years ago I switched to Linux full time. Having tried and failed at Debian first, I later found a SuSE 6 box at a local retailer and have been running Linux ever since. I switched to Debian a short year or two later.

10 years ago I was screaming at the TV while watching CSU beat CU. That was a good omen for the decade.

10.8 years ago I’d argued why Linux would never amount to anything and how awesome Windows was.

10 years ago my external DVD drive was pretty cool.

10 years ago I thought being able to write a Javascript image swap would get me a job.

9.6 years ago it kinda did.

10 years ago I was hoping the power would go out due to y2k and spare me from staying at an awkward party.

10 years ago IE was a decent browser. Netscape 4 seemed to make everything difficult.

10 years ago I used a Yahoo email account because it had 4 megs of free storage. Hotmail was porting to Windows.

10 years ago I was learning PHP3. I didn’t like it. I was also creating my first website for a local newspaper, which was an ugly, table-based embarrassment of bad color choices and marquee text. None of it was saved in the wayback machine so it’s thankfully ceased to exist.

10 years ago I wrote an awful tech advice column. The first article was how to back up files between two computers using modems. I hadn’t learned a lot of networking at that point.

10 years ago I hadn’t yet used Visual Basic.

10 years ago Fat Tire was a cool local beer that few outside of Fort Collins, CO had ever heard of. Now it seems every corner store carries seasonal brews and every bar has Fat Tire on tap.

Happy New Year!

No Windows!

26 Dec

There was just no way I was going to get my stepdaughter a Windows netbook for Christmas. No way, no how. I couldn’t find a good way to articulate why, and so for a couple weeks I was just responding, “No Windows!” to my wife’s questions. I hadn’t expected it to be controversial but I did eventually have to explain myself.

Even after disregarding all of the security issues and the lame attempts at fixing administrative access, it was mainly an issue of freedom. Windows is proprietary and I was determined that her first computer would be more educational than a dumb terminal for homework.

I got my validation yesterday. Not more than an hour after she’d ripped off the gift wrap and screamed in delight, she was showing off a drawing in a program that I’d never seen before. I’ve been using Linux for ten years and she’d already managed to show me something new. She’d found it all on her own. (It sure wasn’t me. The only time I’d been allowed to touch the netbook was to type the wireless password.)

Kids learn quickly. They’re simply better able to pick up new computer interfaces and learn how to use them, long before most adults.

All kinds of Free

Adults hate new things. I don’t know what my success ratio would be trying to switch somebody to Linux or Firefox or OpenOffice or whatever, and I really don’t want to know. It’d be too depressing. There’s no way to count the number of times I’ve seen superior software pushed aside for something more familiar, even if that meant putting up with crashes and viruses or spending hundreds of dollars.

After a long enough time, better software can begin to make headway but, man, it takes such a long time. The few times I’ve really been successful switching somebody, they were either young people, who will pick up anything quickly, or adults who’d had some previous exposure to other operating systems. Usually that means they used something else when they were young.

Actually, when I first picked up Linux, the thought ringing in my head was, “This is just like DOS! Except it doesn’t suck!” Far from the horror stories of actually having to type commands (gasp!), I felt immediately at home. That’s because the old clunker I’d grown up with had run DOS. It didn’t matter how vastly different it really was, it mattered that I felt comfortable and wanted to learn more.

While I’m sure she will have to use Windows frequently in school and work, I hope that a dose of Linux now will help her be more open to new things her whole life.

Practical things

Her main complaint so far is the netbook is unable to play Windows games. Fathers will be all too familiar with the Disney channel and probably Wizard101. But I doubt the little thing even has the power to run that game. Anyway, she’s not too heartbroken over it.

After all, she’s already found the Disney site where she can watch videos without torturing the rest of us.

Still, people will sputter, “She’ll have to use Windows at work!” They’ll cross their arms and lean back as if working is the most important thing anybody will ever do.

Well, maybe she will. But she won’t have to deal with that for probably another ten years. What OS she’ll use then is completely unknowable. Even if it’s Windows, it’s unlikely to be what we have today. And just like I have to learn where the heck they put the network properties in every new Windows release, she will always need to learn new things.

Besides, she uses plenty of Windows at school and her Grandma’s house. She already knows Windows plenty well enough.

So far the netbook has been a resounding success.

Client-side charting for Leemba

19 Dec

Well, nobody’s heard from me in quite a while because I’ve been hard at work building my server monitor. Sorry, I suck. I have been, among other things, trying to get charting right.

It seems that while everybody has a different monitoring tool, most of the open source monitors use either MRTG or some other sort of pre-built image for their charts.

The problems with that approach are numerous. You might have to alter the data to remove big outliers just to see the smaller details. Often you get a simple averaged chart when the minimum or maximum values are more interesting. And without real-time graphing, you might have to wait several minutes or hours to see changes on your chart.

For Leemba, my monitor and all-around time sink, I decided on a hybrid approach. Charts for overall historical values are generated nightly. But recent data is charted on the fly with Flot, which uses the <canvas> tag to basically draw an image with Javascript.

As it turns out, client-side charting can be dramatically better. When it’s not possible to anticipate the client needs on the server, like charting arbitrary time ranges, logic can and should be pushed to the client. Fortunately for us, browsers are getting a lot smarter.

For example, here’s a chart of Twitter.com response times from noon December 15th, to noon December 19th:

Twitter Response Times

Twitter response times

We can see increased response times peaking around noon every day. The maximum response was 10047ms and minimum for this period was 36ms.

By default Leemba will display averages. Without some sort of downsampling, every minute in the range would be a point on the chart. It’s just too much data to serialize and ship to the browser fast enough. As shown in the top-left options panel, the current period is 30 minutes (i.e. every 30 minutes is one point).

In many monitors, that’d be it. But with the wonders of Flot, Leemba can make charts on the fly. For one, we can change the aggregate method (average, minimum or maximum) and the chart will update right away. Also, checking the “Show Standard Deviation” box will add a line for the deviation and a second Y axis, which can be useful for spotting erratic data.
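The downsampling and aggregate switching described above can be sketched roughly like this. It is only an illustration in Python, not Leemba’s code: the function name, the (timestamp, value) tuple layout and the use of a population standard deviation are all assumptions, and whether the grouping happens in the database, on the server or in the browser is left open.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List, Tuple

Point = Tuple[int, float]  # (epoch seconds, response time in ms); assumed layout

# The three aggregate methods named in the post.
AGGREGATES: Dict[str, Callable[[List[float]], float]] = {
    "average": mean,
    "minimum": min,
    "maximum": max,
}


def downsample(points: List[Point], period_s: int,
               method: str = "average") -> List[Tuple[int, float, float]]:
    """Group points into fixed periods.

    Returns one (period start, aggregate, standard deviation) tuple per period.
    """
    buckets: Dict[int, List[float]] = {}
    for ts, value in points:
        start = ts - ts % period_s  # floor the timestamp to its period start
        buckets.setdefault(start, []).append(value)
    aggregate = AGGREGATES[method]
    return [
        (start, aggregate(values), pstdev(values))
        for start, values in sorted(buckets.items())
    ]


if __name__ == "__main__":
    # One sample per minute for an hour, with a single outlier.
    samples = [(i * 60, 100.0) for i in range(60)]
    samples[45] = (45 * 60, 10000.0)
    for method in ("average", "minimum", "maximum"):
        print(method, downsample(samples, period_s=1800, method=method))
```

With a 30-minute period the hour collapses to two points. The outlier inflates the average and the deviation of the second period, while the minimum ignores it entirely, which is why being able to flip between aggregates is handy.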

You can do a couple of things to deal with outliers. The default mouse mode lets you select a new time range by just clicking and dragging on the chart itself, so the chart can simply exclude the period containing the outlier. The other mode lets you zoom and pan the chart. Either way, the smaller details can be viewed without changing historical data.

Selecting time ranges is also a great way to see even more detailed charts. With a small enough time range, Leemba won’t need to downsample any data but will show the raw points. Downsampling depends on how often the tests run.

For example, we can go back to the 17th when Twitter was hacked (*) without averaging:

Twitter Response Times

Twitter.com Dec 17th

Lessons learned

Flot is pretty good but I had some problems along the way. For one, Flot only displays UTC times by default, which is not terribly helpful for us mere mortals. To fix that I currently loop the dataset and apply the time zone difference to each point and provide custom date formatting functions. (This is known.)
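The time zone workaround amounts to shifting every timestamp by the local UTC offset, so that a plotter which only knows UTC ends up printing local wall-clock times. Here is a minimal sketch of that idea in Python rather than the Javascript that actually feeds Flot; the function name, the tuple layout and the fixed UTC-8 offset in the demo are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import List, Optional, Tuple

# (epoch milliseconds in UTC, value); None marks a gap in the data.
Point = Tuple[int, Optional[float]]


def shift_to_local(points: List[Point], tz: tzinfo) -> List[Point]:
    """Add the zone's UTC offset to each timestamp.

    The shifted values are no longer real UTC instants. They only exist so
    that a UTC-only chart displays local times on its axis.
    """
    shifted = []
    for ts_ms, value in points:
        # Look the offset up per point so a zone with DST would still work.
        offset = datetime.fromtimestamp(ts_ms / 1000, tz).utcoffset()
        shifted.append((ts_ms + int(offset.total_seconds() * 1000), value))
    return shifted


if __name__ == "__main__":
    pst = timezone(timedelta(hours=-8))  # assumed fixed offset for the demo
    noon_utc = int(datetime(2009, 12, 17, 12, tzinfo=timezone.utc).timestamp() * 1000)
    for ts_ms, value in shift_to_local([(noon_utc, 250.0), (noon_utc + 60000, None)], pst):
        # Read the shifted value back as if it were UTC, the way the chart would.
        print(datetime.fromtimestamp(ts_ms / 1000, timezone.utc), value)
```

The obvious cost is that the data on the client is now deliberately wrong by a few hours, so anything sent back to the server (a selected time range, say) has to be shifted back first.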

Unfortunately, zoom, pan and selection modes in Flot are pretty much exclusive. So that’s why the “Mouse Mode” is selectable. Even more unfortunate, switching modes forces a redraw of the plot, which resets the current zoom and pan settings. jqPlot was started just a few months ago and solves some of these problems, but it doesn’t yet deal with null values, like those used in the chart above.

Flot is pretty quick. And it’s nice to not have to worry about the Flash plugin crashing as it so often does on Linux. Even on IE with the excanvas emulation, it performs. Drawing times on IE are not unlike loading and displaying a Flash chart, so it’s acceptable. Zooming is a bit slow, but then… Oh, well. It’s IE.

Browser support has been a non-issue. Even IE6 works without problems, and the Android and iPhone browsers display them just fine.

Of course, the database is still the bottleneck here. And the difficulty in pulling out information from a standard relational database fast enough is undoubtedly why most monitors depend on pre-generating images. I’ve employed a mess of performance tricks in Postgres to make this work. I’ll have to detail those in another post.

If you want to check out Leemba, the Open Source project is hosted by Sourceforge. It’s not ready for end users yet though. :-)

* To be fair, it sounded like a DNS exploit that could happen to just about anybody.

What the hell, Google?

25 Nov

Imagine my dismay at what it took to gain root access to my Android phone. I started here and wound up here and had to use fscking Windows software here but this had some Linux instructions and grabbed a newer SPL here and finally managed to load a build here only to find the whole reason I’d started this debacle was purposefully broken here and complained about by — wait for it — a Google employee.

I love my Android and I’m generally a Google fan. I’ve got my Wave account, Google Voice is awesome and I’m not sure if I could find my way home anymore without Maps. But what would I do if I got home if I couldn’t fire up Chrome and head for Google News or connect on GTalk, anyhow? Maybe I’d have to get some work done, I don’t know.

My Rogers HTC Magic has seemingly been abandoned at version 1.5. There are rumors of updates every now and again, but since newer apps have already rolled out for 2.0, it’s woefully behind. The minor SMS fix was months delayed, there’s been no 1.6 update, so I can just imagine when I would have gotten 2.0. But it’s Open Source and I should be able to roll my own on my hardware, right?

I once installed Debian over a modem with a fistful of floppy disks, so I’m no stranger to difficult installs. That’s not the problem. This was caused purposefully by large corporations attempting to prevent me from using my own device. (Although, I suppose they could have tried harder and I’m thankful they didn’t.)

Less stupid, this way

Both the iPhone and Android have already made too many concessions to the Unix security model for root access to be meaningful. You can install all the software you want with no special privileges. Yet, piracy is still a problem, and one doesn’t need root access to be vulnerable.

Carriers need to adopt the Rackspace model. They weren’t the first but they came to mind, and so that’s what I’m calling it. When you purchase Rackspace servers you get admin rights. But you also get support. They will patch servers for you. They do this to protect their customers and their network, just the way the carriers should.

Many phones are capable of over the air updates already. They can and should be pushing small, well-tested patches to vulnerable phones in order to protect the network for everybody, but leave users in control of their own hardware. There’s no reason somebody should have to jailbreak their iPhone to install a theme.

With all this jailbreaking and modding, mobile networks are already moving this direction anyhow. Continued attempts at control will backfire just as predictably as DRM did for the music industry. It would be best to prepare for this before the smart phone market grows unwieldy. The RIAA side of the fight is not the way to win, especially for what amounts to a hobby revenue-wise for Google.

Work with the mod community to provide a federated patching facility for everybody. Allow those of us who want to control our own hardware to do so (because people will figure out a way, anyhow). Worry less about piracy and more about making purchasing and customer contact suck less for the developer. Don’t wander into strange rooms.

Tools matter

19 Nov

I thought “Tools don’t matter” was interesting but it drew the wrong conclusion. Tools matter a great deal.

My brothers and I also played a lot of games growing up. Like the original author, we’d also try to get the best weapons. We called it “gun shopping.” Unfortunately, running around the map looking for weapons from fallen players was a great way to get killed.

Wasting time looking for the ultimate IDE has less drastic consequences, but this is not an industry that allows for a lot of spare time. Developers always have something bigger and better to do, and there are always anxious users waiting.

While not wasting time searching for the “one true” tool is good advice, don’t draw from that the idea that tools somehow make no difference. What does matter is productivity once you’ve reached mastery of a tool. They’re not all equal.

All major IDEs have a lot of depth. You can be crazy productive if you master any of them. But anybody can be a master of nano in about 10 minutes and there’s not a lot of extra productivity to find. It’s a great simple tool, but there are better editors for more complicated tasks. And as much as I like NetBeans, I reach for Emacs for just about anything that doesn’t end with “.java”.

Still struggling to convince himself, the author then veers into a discussion about operating systems I found laughable.

Most programs sit on a tall stack of software. A program written in a high-level language can be quite a distance from the hardware. Every library, every runtime, everything added to your program is like a layer on a cake. The higher it is, the more likely it is to fall down.

Every once in a while it’ll make the news that some air traffic controller is still using Windows 98, or that some Navy destroyer hasn’t yet moved off NT4. I doubt many would still say the tool doesn’t matter. Most would agree that they’re fucking crazy.

That’s one of the great things about Unix design philosophy. Often it’s just you and the kernel. There’s as little as possible in the way. I never worry about a broken browser hosing my server application. A loss of one component to the system doesn’t crash the whole box. Nobody needs unrestricted local admin rights just to start a service. It’s not news that Windows is a towering layer of wobbly components prone to frequent falling.

Tools matter because any failure in that long line of abstractions can take out your entire program. Of course you can do useful things with Windows. But it’s not near the rock-solid foundation that I want to build on.

Server Monitoring – Few Winners

10 May

As a programmer, I like to know how my applications are handling. I like pretty graphs of response times and I really want to know if they blow up. In our department we’ve been running a very old installation of Big Brother (BB) from before Quest. I kid you not. It’s old but it works with relatively little fuss and sheer lack of a compelling enough competitor has kept it humming away all these years.

Still, BB is very simplistic and we’d like to step up to something from this century. We’ve been literally watching for years for a suitable Open Source replacement to emerge but nothing seems to fit. Of course, we’re familiar with Hobbit but it’s in the same vein as BB. We were never really that happy with BB but it was already in place. Inertia is a powerful force.

And just yesterday I read that Nagios has been forked. Maybe I’m not the only one unhappy with the available choices. I’m about this close to writing my own. Might be fun.

Below is more or less my personal list of gripes, minus the names of the guilty. I really have no interest in gunning down well-meaning projects. Of course, some score better than others but none seem to do it all. You’ll notice I’m mostly concerned with the server itself, since for the most part the agents work great.

Want to add to the list? What do you use to monitor your apps?

  • Slow. Most of the time, if I’m checking the site I just want to see a graph or check that a specific service is working. It shouldn’t take forever. That includes navigation – it should be easy to find historical data.
  • Use existing agents. Every monitor doesn’t need its own agents, there are plenty out there. $new-fangled-monitor would ideally work with agents I can apt-get (with Nagios being high on that priority list).
  • All configuration for alerts, plugins and tests should be stored on the server and centrally managed.
  • Should be able to make mass changes to alerts and agent configurations, something most lack.
  • If it uses a database, it should be able to use the major Open Source databases and at least Oracle (if I’m forced).
  • It should automatically alert on obvious things. If I’ve set up a ping or HTTP test on a server I probably also want to know if it stops responding. Just allow for a way to override the default.
  • Should only alert once. I don’t really want to take the time to designate which alerts are critical and which are not. That can add up to a lot of configuration time, and I have plenty of stuff to watch. I’ll get the email and decide if it’s worth checking out right now or not. Not to mention, since one outage can cause cascading outages, I don’t want to also cause an email outage.
  • On that note, it should have easily adjustable change windows for planned maintenance.
  • Configuration should be dead simple. I have better things to do than spend all day fooling with the monitor server. I don’t mind editing text files so much, but they need to be well documented like Apache’s. The problem with text files is often times you don’t know the possible values. XML is for programs and not for end users to hand edit.
  • Should integrate with the network. Here our network is unfortunately run by mostly Windows servers. At least that means I shouldn’t have to set up users, manage passwords, etc. Single sign-on with Kerberos or NTLM is a must.
  • On that note, don’t require logins for status pages. Or at least be able to allow access for any authenticated domain user. It’s not a state secret. They’re already on the network if they reached the monitor. If they cared, they could ping the server themselves. Automated monitoring is supposed to make it easier.
  • RSS feeds or portlets and possibly some embeddable AJAX widget would be a great way to integrate with the applications and various other web servers. I’d love to have a page in my own web apps where users could check the status of various systems and progress on fixing them.
  • Give me a way to configure a page or dashboard just for stakeholders. I want to email them a URL and let them see for themselves that the application is working.
  • It should look nice, too. I’m not sure why, but most of the monitoring solutions are ugly. Again, I want to give this to the business and let them get a warm fuzzy that everything is working. It should be simple, professional and quickly communicate where problems lie. They’re not going to build their own dashboard with flashing lights and server pictures. They just want to know what broke.
  • Should have a developer’s API. Everything and everybody knows HTTP. We have great proxies and load balancers. Firewalls know all about HTTP. It doesn’t make sense to write a new protocol. Should be usable from shell scripts.
  • Should always page if an agent stops sending updates. That seems kinda basic, but I shouldn’t have to configure an alert for each and every one. Of course, still allow for a way to override the default.
  • A nice mobile page is a must. I might not be in the office or I might be upgrading my workstation again.
  • Should work through, over and under firewalls. Unbelievably, this was an issue with one I tried.
  • Speaking of dumb problems, one I tried would show a blank page if my cookie expired. I’d have to manually remove it to log in again. Not awesome. The basics are important.
  • Statically typed language. Edit: eek, that’s what I meant. I know that’s a bit controversial but this is only my personal preference. Simply put, I’m probably going to install a monitor and forget about it until it breaks. I’ve been bitten by upgrading the PHP/Python/Perl package often enough that I’d prefer something less prone to incompatible changes.
  • It should not require an agent or convoluted configuration to set up a simple HTTP test from the monitor server itself. Oddly, one I’d tried required something like 30 clicks to set up a simple ping, not to mention a lot of head-scratching.
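To make two of the items above concrete, here is a rough sketch of what “alert once” combined with maintenance windows could look like. It is purely hypothetical: the AlertPolicy class, its method names and the in-memory bookkeeping are invented for illustration and don’t come from any of the monitors discussed.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Set, Tuple


@dataclass
class AlertPolicy:
    """Hypothetical policy: one alert per outage, none during maintenance."""

    # (start, end) pairs during which alerts are suppressed
    maintenance: List[Tuple[datetime, datetime]] = field(default_factory=list)
    # names of checks that have already alerted for their current outage
    alerted: Set[str] = field(default_factory=set)

    def in_maintenance(self, when: datetime) -> bool:
        return any(start <= when < end for start, end in self.maintenance)

    def should_alert(self, check: str, healthy: bool, when: datetime) -> bool:
        if healthy:
            # Recovery clears the flag so the next outage alerts again.
            self.alerted.discard(check)
            return False
        if check in self.alerted or self.in_maintenance(when):
            # Failures inside a window are not recorded, so a check that is
            # still failing once the window closes will alert then.
            return False
        self.alerted.add(check)
        return True


if __name__ == "__main__":
    policy = AlertPolicy(
        maintenance=[(datetime(2009, 5, 10, 2), datetime(2009, 5, 10, 4))]
    )
    print(policy.should_alert("http", False, datetime(2009, 5, 10, 1)))      # first failure
    print(policy.should_alert("http", False, datetime(2009, 5, 10, 1, 5)))   # same outage
    print(policy.should_alert("ping", False, datetime(2009, 5, 10, 3)))      # inside window
```

The point of the sketch is that none of this needs per-alert configuration: the only thing the admin supplies is the maintenance window.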

Server should run on Linux, obviously.

Would you use something that matched that description?

Run Terracotta Jobs Across the Cluster

28 Apr

In my Introduction to Terracotta, I alluded to the possibility of running jobs across the cluster but I didn’t really explain how. My first attempts involved using the tim-messaging package, but there wasn’t much there geared towards running the same job across all nodes. It mainly addresses dividing jobs across many workers (local or remote).

There are a few cases where the same job on all nodes is useful. The first time I ran into it was trying to pull some custom statistics from each node. Each node kept a moving average of requests per second for a specific user action I wanted to track. The other use case was notifying long poll HTTP clients. Server push could be initiated from one node but the client may be listening on another. Since Socket instances aren’t shareable, I needed a simple way to server push on all nodes.

I cast around until I found the old tclib forge project that predated many of the current TIMs. I modified the below class starting from that code.

To solve the problem of knowing what nodes are in the cluster, I simply have them register() themselves at startup. A ServletContextListener works great for that. That solves the problem of submitting jobs before the node is ready.

Of course, then you have to worry about nodes that may suddenly disappear. One can’t count on them unregistering themselves since they might crash or something. To fix that I have each node hold a lock the entire time they’re running. One of the necessities that Terracotta must have had to address early on is cleaning up cluster-wide locks when a client exits. To test if a node is still around, all the caller has to do is attempt to acquire the lock. If it’s successful, then that node somehow exited its run loop.
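The lock trick is easier to see in a toy version. The sketch below is a single-process Python analogy using threads, not Terracotta code: Node and Registry are made-up names, a thread stands in for a cluster node, and an ordinary threading.Lock stands in for the cluster-wide lock that Terracotta releases when a client goes away.

```python
import threading
from typing import Dict, List


class Node:
    """Stand-in for a cluster node; holds its lock for as long as it runs."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.alive_lock = threading.Lock()
        self._started = threading.Event()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()
        self._started.wait()  # return only once the lock is actually held

    def _run(self) -> None:
        with self.alive_lock:  # released however the run loop ends
            self._started.set()
            self._stop.wait()  # placeholder for the node's real work

    def stop(self) -> None:
        self._stop.set()
        self._thread.join()


class Registry:
    """Tracks registered nodes and prunes the ones whose lock is free."""

    def __init__(self) -> None:
        self._nodes: Dict[str, Node] = {}
        self._guard = threading.Lock()

    def register(self, node: Node) -> None:
        with self._guard:
            self._nodes[node.name] = node

    def live_nodes(self) -> List[str]:
        live = []
        with self._guard:
            for name, node in list(self._nodes.items()):
                # A non-blocking acquire succeeds only if the node let go.
                if node.alive_lock.acquire(blocking=False):
                    node.alive_lock.release()
                    del self._nodes[name]
                else:
                    live.append(name)
        return live


if __name__ == "__main__":
    registry = Registry()
    a, b = Node("a"), Node("b")
    for node in (a, b):
        node.start()
        registry.register(node)
    print(registry.live_nodes())
    b.stop()  # simulate a node leaving without unregistering
    print(registry.live_nodes())
    a.stop()
```

The appeal of the approach is that the node never has to announce its own death. Whoever wants to submit a job does the check, and a dead node is pruned as a side effect.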

In practice this works pretty well. It’s much simpler than the JMX solution, although it doesn’t bother trying to resubmit jobs or some of the other neat features of tim-messaging. The tryLock() below was plenty fast enough for my purposes. Of course, it’s easy to envision adding a thread to clean up the queues instead of forcing the caller to wait for the tests. Really, that would depend on the size of your cluster.

Depending on its use, you might tweak the thread pool used for this. This code also depends on the tim-annotations project but you can easily configure tc-config.xml.