28 Mar 2012
Many websites have some form of recommendation system. While it’s simple to create a recommendation system for
small amounts of data, how do you create a system that scales to huge amounts of data?
How to actually calculate the similarity of two items is a complicated topic with many possible solutions.
Which one is appropriate depends on your particular application. If you want to find out more I suggest
reading the excellent Programming Collective Intelligence
(Amazon affiliate link) by Toby Segaran.
We’ll take the simplest method for calculating similarity and just calculate the percentage of users who have
visited both pages compared to the total number who have visited either. If we have Page 1 that was visited by
users A, B and C and Page 2 that was visited by A, C and D then A and C visited both, but A, B, C and D
visited either one, so the similarity is 2 out of 4, or 50%.
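As a minimal sketch of that calculation (this ratio is commonly known as the Jaccard index), assuming each page's visitors are available as a plain Python set; the function name page_similarity is hypothetical and is not taken from the full post:

```python
def page_similarity(visitors_a, visitors_b):
    """Return the share of users who visited both pages out of those who visited either."""
    a, b = set(visitors_a), set(visitors_b)
    either = a | b  # users who visited at least one of the pages
    if not either:
        # Neither page has any visitors; treat the similarity as zero (an assumption).
        return 0.0
    both = a & b  # users who visited both pages
    return len(both) / len(either)


page_1 = {"A", "B", "C"}
page_2 = {"A", "C", "D"}
print(page_similarity(page_1, page_2))  # 0.5
```

This holds every visitor set in memory, so it only illustrates the small-data case; it says nothing about how to scale the calculation.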
Read More...
20 Mar 2012
On my 25 minute train journey to work each morning I like to pass the time by
reading. The two most recent books I’ve read are
The Lean Startup: How Constant Innovation Creates Radically Successful Businesses by Eric Ries and Steve Jobs by Walter Isaacson (both links contain an affiliate id). Although one is a
biography and the other is a book on project management they actually cover similar ground, and both are books
that people working in technology should read.
Walter Isaacson’s book has been extensively reviewed and dissected so I’m not going to go into detail on it.
The book is roughly divided into two halves. The first section is on the founding of Apple, Pixar and NeXT.
This section serves as an inspirational guide to setting up your own company. The joy of building a great product
and defying the odds against a company succeeding comes across very strongly. The later section, following
Jobs’s return to Apple, is much more about the nuts and bolts of running a huge corporation. While it’s an
interesting guide to how Apple got to where it is today, it lacks the excitement of the earlier chapters.
Read More...
07 Mar 2012
A little while ago I was asked what my biggest gripe with Django was. At the time I couldn’t think of a good
answer because since I started using Django in the pre-1.0 days most of the rough edges have been smoothed.
Yesterday though, I encountered an error that made me wish I had thought of it at the time.
The code that produced the error looked like this:
from django.db import models

class MyModel(models.Model):
    ...
    def save(self):
        # Call the base class implementation via the module-level name `models`
        models.Model.save(self)
        ...
    ...
The error that was raised was AttributeError: 'NoneType' object has no attribute 'Model'. This means
that rather than containing a module object, models was None. Clearly this is impossible as the class
could not have been created if that was the case. Impossible or not, it was clearly happening.
Read More...
10 Feb 2012
After a two week gap the recent snow in the UK has
inspired me to get back to my series of posts on my weather station website,
WelwynWeather.co.uk. In this post I’ll discuss the
records page, which shows details such as the highest and
lowest temperatures, and the heaviest periods of rain.
From a previous post in this series you’ll remember that
the website is implemented as a CouchApp. These are JavaScript functions that run
inside the CouchDB database, and while they provide quite a lot of flexibility you do need to tailor your
code to them.
On previous pages we have used CouchDB’s map/reduce framework to summarise data, then used a list function to
display the results. The records page could take a similar approach, but there are some drawbacks to that.
Unlike the rest of the pages the information on the records page consists of a number of unrelated numbers.
We could create a single map/reduce function to process all of them at once, but that function would quickly
grow and become unmanageable. Instead we’ll calculate the statistics individually and use AJAX to load them
dynamically into the page.
Read More...
02 Feb 2012
I could wax lyrical about how programming is an art form and requires a great deal of creativity. However,
it’s easy to lose focus on this in the middle of creating project specs and servicing your
technical debt. Like
many companies we recently held a hackathon event where
we split up into teams and worked on projects suggested by the team members.
Different teams took different approaches to the challenge. One team set about integrating an open source code
review site in our development environment, others investigated how some commercial technologies could be
useful to us. My team built a collaborative
filtering system using MongoDB. I’ll post about that project in the future, but in this post I wanted to
focus on what we learnt about running a company hackathon event.
If you’re lucky you’ll work in a company that’s focused on technology and you’ll always be creating new and
interesting things. In the majority of companies technology is a means to an end, rather than the goal. In that
case it’s easy to become so engrossed in the day to day work that you forget to innovate or to experiment with
new technologies. A hackathon is a great way to take a step back and try something new for a few days.
Running a hackathon event should be divided into three stages: preparation, the event itself, and the follow-up.
Before the event you need to take some time to collect ideas and do some preliminary research. The event
itself should be a whirlwind of pumping out code and building something exciting. Afterwards you need to take
some time to demonstrate what you’ve built, and share what you’ve learnt.
Read More...