02 Feb 2012
I could wax lyrical about how programming is an art form and requires a great deal of creativity. However,
it’s easy to lose focus on this in the middle of creating project specs and servicing your
technical debt. Like many companies, we recently held a hackathon event where
we split up into teams and worked on projects suggested by the team members.
Different teams took different approaches to the challenge. One team set about integrating an open source code
review site into our development environment; others investigated how some commercial technologies could be
useful to us. My team built a collaborative
filtering system using MongoDB. I’ll post about that project in the future, but in this post I wanted to
focus on what we learnt about running a company Hackathon event.
If you’re lucky you’ll work in a company that’s focused on technology and you’ll always be creating new and
interesting things. In the majority of companies technology is a means to an end, rather than the goal. In that
case it’s easy to become so engrossed in the day-to-day work that you forget to innovate or to experiment with
new technologies. A hackathon is a great way to take a step back and try something new for a few days.
Running a hackathon event should be divided into three stages: preparation, the event and the post-event.
Before the event you need to take some time to collect ideas and do some preliminary research. The event
itself should be a whirlwind of pumping out code and building something exciting. Afterwards you need to take
some time to demonstrate what you’ve built, and share what you’ve learnt.
Read More...
23 Jan 2012
A hobby project of mine would be made much easier if I could run the same code on the server as I run in the
web browser. Projects like Node.js have made JavaScript on the server a more
realistic prospect, but I don’t want to give up on Python and
Django, my preferred web development tools.
The obvious solution to this problem is to embed JavaScript in Python and to call the key bits of JavaScript
code from Python. There are two major JavaScript interpreters,
Mozilla’s SpiderMonkey and
Google’s V8.
Unfortunately the python-spidermonkey project is
dead and there’s no way of telling if it works with later versions of SpiderMonkey. The
PyV8 project by contrast is still undergoing active
development.
Although PyV8 has a wiki page entitled How To
Build, it’s not simple to get the project built. They recommend using prebuilt packages, but there are none
for recent versions of Ubuntu. In this post I’ll describe how to build it on Ubuntu 11.10 and give a simple
example of it in action.
Read More...
20 Jan 2012
In this series of posts I’m describing how I created a CouchDB
CouchApp to display the weather data
collected by the weather station in my back garden. In the
previous post I showed you how to display a single day’s
weather data. In this post we will look at processing the data to display it by month.
The data my weather station collects consists of a record every five minutes. This means that a 31 day month
will consist of 8,928 records. Unless you have space to draw a graph almost nine thousand pixels wide,
there is no point in wasting valuable rendering time processing that much data. Reducing the data to one point
per hour gives us a much more manageable 744 data points for a month. A full year’s worth of weather data
consists of 105,120 records; even reducing it to one point per hour gives us 8,760 points. When rendering a
year’s worth of data it is clearly worth reducing the data even further, this time to one point per day.
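The arithmetic behind these figures can be checked with a few lines of Python. This is only a sketch; the constant and function names are made up for illustration.

```python
# One reading every five minutes, as described above.
RECORDS_PER_HOUR = 60 // 5
HOURS_PER_DAY = 24


def raw_records(days):
    """Number of five-minute records collected over `days` days."""
    return days * HOURS_PER_DAY * RECORDS_PER_HOUR


def hourly_points(days):
    """Number of points left after reducing to one point per hour."""
    return days * HOURS_PER_DAY


print(raw_records(31), hourly_points(31))    # 8928 744
print(raw_records(365), hourly_points(365))  # 105120 8760
```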
How do we use CouchDB to reduce the data to one point per hour? Fortunately CouchDB’s map/reduce architecture
is perfect for this type of processing. CouchDB will also cache the results of the processing automatically so
it only needs to be run once rather than requiring an expensive denormalisation process each time some new
data is uploaded.
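To make the idea concrete, here is a rough Python sketch of the same kind of reduction done outside CouchDB. It is not the CouchApp’s view code: the temp_out field, the function names and the choice of averaging the readings within each hour are all assumptions made for the example. The map step keys every record by the hour it falls in, and the reduce step averages the values that share a key.

```python
from collections import defaultdict
from datetime import datetime, timezone


def map_record(record):
    """Map step: key each record by the hour it was taken in (UTC)."""
    taken = datetime.fromtimestamp(record["timestamp"], tz=timezone.utc)
    key = (taken.year, taken.month, taken.day, taken.hour)
    return key, record["temp_out"]  # temp_out is a hypothetical field name


def reduce_hourly(records):
    """Reduce step: average every value that shares an hourly key."""
    groups = defaultdict(list)
    for record in records:
        key, value = map_record(record)
        groups[key].append(value)
    return {key: sum(values) / len(values) for key, values in sorted(groups.items())}


# Twelve five-minute readings from 2012-01-20 00:00 UTC, then one in the next hour.
start = 1327017600
readings = [{"timestamp": start + 300 * i, "temp_out": 4.0} for i in range(12)]
readings.append({"timestamp": start + 3600, "temp_out": 6.0})

hourly = reduce_hourly(readings)
```

Thirteen raw readings collapse into two hourly points here. In CouchDB itself the grouping would be done by the view rather than by a dictionary in memory.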
Read More...
12 Jan 2012
In this series I’m describing how I used a CouchDB
CouchApp to display the
weather data collected by a weather station in my back garden. In the
first post I described CouchApps and how to get
a copy of the site. In the next post we
looked at how to import the data collected by PyWWS and how to
render a basic page in a CouchApp. In this post we’ll extend the basic page to display real weather data.
Each document in the database is a record of the weather data at a particular point in time. As we want to
display the data over a whole day we need to use a list function. list functions work similarly to the show
function we saw in the previous post. Unlike show functions, list functions don’t have the document passed in;
instead they can call a getRow function which returns the next row to process. When there are no rows left it
returns null.
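The shape of that loop is easy to mimic in Python. The sketch below is an analogy rather than CouchDB code: make_get_row is a hypothetical stand-in for the getRow function that CouchDB provides, returning None where the real one returns null, and the row contents are invented.

```python
def make_get_row(rows):
    """Return a getRow-style function that hands out one row per call."""
    iterator = iter(rows)

    def get_row():
        # Mirrors getRow: the next row, or None once the rows run out.
        return next(iterator, None)

    return get_row


def render_day(get_row):
    """Pull rows until get_row signals the end, building one line per row."""
    lines = []
    row = get_row()
    while row is not None:
        lines.append("{}: {}".format(row["key"], row["value"]))
        row = get_row()
    return "\n".join(lines)


rows = [{"key": "00:00", "value": 4.1}, {"key": "00:05", "value": 4.0}]
page = render_day(make_get_row(rows))
```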
Read More...
05 Jan 2012
In my last post I described the new
CouchDB-based website I have built to display the weather data
collected from the weather station in my back garden. In this post I’ll describe how to import the data into
CouchDB and the basics of rendering a page with a CouchApp.
PyWWS writes out the raw data it collects into a series of CSV
files, one per day. These are stored in two nested directories, the first being the year, the second being
year-month. To collect the data I use PyWWS’s live logging mode, which consists of a process
constantly running, talking to the data collector. Every five minutes it writes a new row into today’s CSV
file. Another process then runs every five minutes to read the new row, and import it into the database.
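As a small illustration of that directory layout, the sketch below builds the path for one day’s file. The root directory name, the exact formatting of the year-month directory and the file name pattern are assumptions made for this example, so check them against your own PyWWS data directory.

```python
from datetime import date
from pathlib import PurePosixPath


def raw_file_path(root, day):
    """Build the path of the CSV file holding `day`'s readings."""
    year_dir = day.strftime("%Y")
    month_dir = day.strftime("%Y-%m")
    # The file name pattern is an assumption for this sketch.
    file_name = day.strftime("%Y-%m-%d.txt")
    return PurePosixPath(root) / year_dir / month_dir / file_name


path = raw_file_path("raw", date(2012, 1, 5))
print(path)  # raw/2012/2012-01/2012-01-05.txt
```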
Because CouchDB stores its data using an append-only format you should aim to avoid unnecessary updates. The
simplest way to write the import script would be to import each day’s data every five minutes. This would
cause the database to balloon in size, so instead we query the database to find the last update time and
import everything after that. Each update is stored as a separate document in the database, with the
timestamp attribute containing the Unix timestamp of the update.
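The incremental import can be sketched as follows. This is a simplified illustration, not the import script itself: a plain Python list stands in for the CouchDB database, the function names are hypothetical, and a real script would query the database for the latest timestamp instead of scanning every document.

```python
def rows_to_import(csv_rows, last_update):
    """Keep only the rows newer than the last update already stored."""
    return [row for row in csv_rows if row["timestamp"] > last_update]


def import_new_rows(database, csv_rows):
    """Append one document per new row; `database` is a plain list here."""
    last_update = max((doc["timestamp"] for doc in database), default=0)
    new_rows = rows_to_import(csv_rows, last_update)
    database.extend(new_rows)
    return len(new_rows)


# Two rows are already stored; today's CSV file now holds a third.
database = [{"timestamp": 1325721600}, {"timestamp": 1325721900}]
todays_rows = [
    {"timestamp": 1325721600},
    {"timestamp": 1325721900},
    {"timestamp": 1325722200},
]
added = import_new_rows(database, todays_rows)
```

Running the import again with the same rows adds nothing, which is what keeps the append-only database from growing unnecessarily.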
Read More...