Alright, so I’m deploying some new code at PageLever to move our Facebook sync code to node.js. I’ve slaved over it for a few weeks, building a queue/worker system (to be publicly released soon) that keeps track of lots of stats and, more importantly, errors. It is very, very important that we can see our errors. Is the problem with us or with Facebook? Have we hit our API limits again? Did I fat-finger yet another FQL statement? You know, the vital stuff.

So now I’ve pushed it out there and started up a bunch of workers and I’m monitoring all of them. I start getting a few errors trickling in. No problem, this happens. Let’s check out what they say.

Error: socket hang up
    at Object.<anonymous> (http.js:1124:15)
    at CleartextStream.<anonymous> (http.js:1173:23)
    at CleartextStream.emit (events.js:64:17)
    at Array.<anonymous> (tls.js:792:22)
    at EventEmitter._tickCallback (node.js:190:38)

Read More

"If I had asked people what they wanted, they would have said faster horses."

- Henry Ford

Henry Ford’s famous line is often used as a guideline for innovation, but I think that the majority of tech start-ups misinterpret the quote. By and large I’m not talking about “cool” products like games or social networks (before they become useful), but websites and applications that serve a direct business need. When you understand what users want, you can examine why they want it, and then figure out how to give them what they need. Too many start-ups fall into the trap of giving users what they want without really examining why they want it, completely ignoring what they need.

Read More

Here at Signpost, we believe in tracking everything. We keep every request, every event, every log, every exception. We use this data for user tracking, analytics and business intelligence, and code health / bug triage.

Original System

We started out with MySQL. Requests were logged into a request table, with the one-to-many relationship of request parameters handled in a single serialized column. Events were tracked in two tables, event_tracking and event_tracking_property, with many properties per event. Logs and exceptions were sent to log files, where they were rotated and hardly looked at for eternity. We ran with this setup in production for the first few months of Signpost’s lifetime and it worked out just fine.

Read More

I entered my post on Why (and How) I Replaced Amazon SQS with MongoDB into the MongoDB Blog Contest and won!

As part of the grand prize, I went out to OSCON in Portland, OR and had a great time! Kristina Chodorow's new book on MongoDB was on display and I spoke to lots of people who are interested in what MongoDB is and what it can do for them. I also went to a lot of talks, the most interesting of which (in my opinion) were about node.js and the analytics done at Twitter using Hadoop and Pig.

Read More

What is Amazon SQS?

Amazon SQS (Simple Queue Service) is a reliable message queuing service hosted in the Amazon cloud. It is ideal for sending messages between servers that need to acknowledge that processing has been completed. When a message is popped from the queue, it is not deleted; instead it becomes invisible to other clients for a visibility timeout. The client that received it is responsible for telling SQS to delete the message from the queue. If the client does not delete a message it has popped within that time frame, the client loses ownership of the message and it is made available to other clients again.
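Those semantics can be sketched in a few dozen lines of plain Java (the names below are mine, not the SQS API): pop() hides a message for a visibility timeout instead of deleting it, delete() removes it for good, and a message whose timeout lapses is offered to consumers again.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Minimal in-memory sketch of SQS-style delivery semantics.
class VisibilityQueue {
    private static class Hidden {
        final String body;
        final long reappearAt; // when the visibility timeout lapses
        Hidden(String body, long reappearAt) { this.body = body; this.reappearAt = reappearAt; }
    }

    private final Queue<String> visible = new ArrayDeque<>();
    private final Map<Long, Hidden> inFlight = new HashMap<>();
    private final long visibilityMillis;
    private long nextReceipt = 0;

    VisibilityQueue(long visibilityMillis) { this.visibilityMillis = visibilityMillis; }

    void push(String body) { visible.add(body); }

    // Returns a receipt handle, or null if nothing is available. The
    // message is hidden, not deleted.
    Long pop() {
        requeueExpired();
        String body = visible.poll();
        if (body == null) return null;
        long receipt = nextReceipt++;
        inFlight.put(receipt, new Hidden(body, System.currentTimeMillis() + visibilityMillis));
        return receipt;
    }

    String body(long receipt) {
        Hidden h = inFlight.get(receipt);
        return h == null ? null : h.body;
    }

    // The consumer acknowledges completion; only now is the message gone.
    boolean delete(long receipt) { return inFlight.remove(receipt) != null; }

    // Messages whose visibility timeout lapsed become visible again.
    private void requeueExpired() {
        long now = System.currentTimeMillis();
        inFlight.entrySet().removeIf(e -> {
            if (e.getValue().reappearAt <= now) { visible.add(e.getValue().body); return true; }
            return false;
        });
    }
}
```

A client that crashes after pop() simply never calls delete(), and the message reappears once the timeout lapses, which is exactly the at-least-once guarantee described above.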

How am I using it?

One of the systems I am using SQS for is a distributed email delivery service (using SMTP). Since there is no asynchronous SMTP client for Java (that I know of), I am using JavaMail to deliver messages. Sending messages with JavaMail is pretty slow and can take a number of seconds per message, with a thread being consumed for each message sent. In order to send many, many messages in parallel, I decided to queue up the outgoing messages and spin up many instances of the SMTP application. This approach is dead simple and scales wonderfully without needing to implement an asynchronous SMTP client of my own.

Read More


One of the best things about MongoDB is the lack of an enforced schema for collections. This flexibility gives developers a lot of power in how they work with their data. Embedding records and arrays inside other records allows both a complexity and a simplicity of data organization that RDBMSs can only dream of! That being said, working with these records in a language like Java, on large, diverse teams of people who don’t want to open the database and inspect records to see which values and sub-records are available, means that you will always spend time wrapping these records in strongly-typed classes. Wrapping loose data in classes that can both access and create that data sounds just like another project I’ve used recently. If you haven’t heard of Google’s Protocol Buffers, you might want to acquaint yourself.
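As a sketch of that wrapping chore, here a plain Map<String, Object> stands in for the driver’s loose record, and a hypothetical UserDoc class hides the key strings and casts behind typed accessors (the field names "name" and "visits" are invented for illustration):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of wrapping a schemaless document in a strongly-typed class.
// The Map stands in for a MongoDB driver record.
class UserDoc {
    private final Map<String, Object> doc;

    UserDoc(Map<String, Object> doc) { this.doc = doc; }

    // Factory that creates a well-formed document, the way a generated
    // Protocol Buffers builder would.
    static UserDoc create(String name, int visits) {
        Map<String, Object> m = new HashMap<>();
        m.put("name", name);
        m.put("visits", visits);
        return new UserDoc(m);
    }

    // Typed accessors hide the casts and key strings from callers.
    String name() { return (String) doc.get("name"); }
    int visits() { return (Integer) doc.get("visits"); }

    // The raw map is what you would hand back to the driver.
    Map<String, Object> raw() { return doc; }
}
```

Teammates now learn what a record contains by reading the class, not by poking at the database, which is exactly the ergonomics Protocol Buffers gives you for wire data.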

Since I’ve enjoyed working with Protocol Buffers so much, I thought I could mimic their functionality and ease of use with MongoDB. This would also integrate beautifully with the GuiceyMongo project that I released a month or two ago.

Read More

Following up on my post about Google Guice Module De-Duplication, I want to show a good use of singleton modules. The motivation for the Provider Module pattern came from writing the GuiceyMongo library. Every time a database or collection is configured, I need to install a provider for that database key or collection key. Since there can be Production, Test, QA, etc. configurations for each database or collection key, I would get an exception when binding the same provider twice for the same key. I needed a way to install each provider only once. This is where Provider Modules come into play.

Read More

There is currently no way in the Google Guice API to run a Module’s configure method only once. This might not seem like a big deal until you try writing something like a logging configuration module, or any other module that could be added or augmented in multiple places in your code. For instance, in my GuiceyMongo library, I wanted to allow users to add configurations from multiple modules. So the CalendarModule can add the calendar collection to the configuration and the StopwatchModule can add the stopwatch collection to the configuration. This is similar to the effect that the Guice Multibindings extension accomplishes.

So, how do you do this in your own code? Elite Guice 2: Binding de-duplication over at publicobject tells us how: just override hashCode and equals. It would look something like this:
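A sketch of why the trick works, shown without the Guice dependency: Guice keeps installed modules in a Set, so two module instances that compare equal are configured only once. LoggingModule below is a hypothetical stand-in for an AbstractModule subclass, and a plain HashSet plays the role of Guice’s internal module set.

```java
// Sketch of module de-duplication via equals/hashCode. Two instances of
// LoggingModule compare equal, so a Set keeps only one and its
// configure() would run only once.
class LoggingModule {
    // In a real Guice module this is where the bindings would go.
    void configure() { /* bind loggers, appenders, etc. */ }

    // Every instance of this exact class is "the same module".
    @Override
    public boolean equals(Object o) {
        return o != null && getClass() == o.getClass();
    }

    @Override
    public int hashCode() {
        return getClass().hashCode();
    }
}
```

With this in place, any number of unrelated modules can each install a fresh LoggingModule, and the injector still configures it exactly once.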

Read More