Jay Caines-Gooby

Back in 5 mins

Wikileaks stats update

Wednesday December 08, 2010 @ 12:22 AM (GMT)

hourglass with wings, by lwr

I’ve been mirroring wikileaks.ch since around 8am GMT Sunday (5th December). I was curious about the traffic.

As of tonight (Tues 7th December, 23:50 GMT), some 64 hours later, http://leaks.gooby.org has served:

  • 2.7 GB of data
  • 13,804 unique visitors
  • 78,695 page views
  • 1,584 unique referrers

I think I might need to upgrade its hosting plan…

Why I'm mirroring wikileaks

Monday December 06, 2010 @ 04:39 PM (GMT)

I’m mirroring wikileaks as well as providing DNS for it, because I believe that the heavy-handed attempts to censor it are the thin end of a very thick wedge.

Let’s take a brief look at what’s happened recently, prior to the whole wikileaks cablegate releases:

If net neutrality ends, then broadcasters and news and entertainment corporations with deep pockets can pay to put their content in front of ISP users. There’s no incentive for the ISPs to prioritise or perhaps even carry packets to the second-tier domains. Thus we end up with lowest-common-denominator content being the norm. Not that different from what we currently receive from our broadcasters. Obviously, this is a comfortable place for both the media industry and government to be in. It’s the status quo. Channel 5, ITV 2, 3 & 4, 24/7. No Wikipedia, no Vimeo, no blogs, and so on.

If a government can both lean on hosting and other internet companies to stop providing their services to sites that the government would rather not see, and yet at the same time exercise the ability to remove a company from the internet by confiscating their domain, at what point will one threat follow another? Is it feasible that a US ISP who provides hosting to wikileaks, and who refuses to stop serving them, will then have their own domain confiscated?

So, to me it seems that some heavy-handed governmental censorship is not far in the offing. Wikileaks is just the start and so I’m siding with the Internet on this one.

Paul Carvill expresses this much more eloquently than I can:

“The free distribution of data, and resistance to top-down evaluation of the merit of that data, is what the web excels at. It is more important now than ever before that individuals are allowed to publish and consume information as they see fit, within the bounds of the law. The world wide web must be allowed to operate neutrally and independently of governments and corporations, including domain name registrars, ISPs, data carriers and other infrastructure providers. Everyone who uses the web benefits from such independence, and should promote and support it wherever possible.”

The Tech Herald has a detailed roundup of the wikileaks events this week.

Twitter has played a big part in my involvement. On Friday I saw Nic Ferrier tweet:

what about a campaign to get absolutely everyone to add wikileaks to their domain?

Tom Morris responded:

@nicferrier Zone file edited for tommorris.org. That’s the sort of activism I can actually do. ;-)

And then by Friday night, there were a ton of hosts and subdomains pointing at the changing wikileaks IP:

WIKILEAKS: Free speech has a number:

Saturday brought the mass-mirroring project:

#Wikileaks mass-mirroring begins! Give server space to fight #censorship! #StreisandEffect

Twitter has stalwartly been letting wikileaks news through. At one point it seemed to be the only place to get the current IP of the official site.

There was an Etherpad being collaboratively edited on Friday, but that’s no longer available.

#imwikileaks and #cablegate didn’t trend in the US or the UK across the weekend, but then again, #xfactor was on. There’s been a noticeable uplift in traffic this morning (Monday 6th), so we’ll see…

What a difference a weekend makes

In just three days, 3rd – 5th December, wikileaks went from a couple of official domains along with some unofficial mirrors that were being DDOS-ed off the web, to having hundreds of working official mirrors and a full-text search of all the currently released documents. Score one for the Internet.

Still in doubt? http://sowhyiswikileaksagoodthingagain.com/

Monit managing mongrels

Tuesday September 21, 2010 @ 11:14 AM (BST)

Like everybody else, I was getting the dreaded “Execution failed” from monit when it couldn’t restart a mongrel which had gone out-of-bounds on its monitoring settings.

The solution that worked for me was the env and $PATH line:

start program = "/usr/bin/env PATH=/bin:/usr/local/bin:$PATH \
                   ruby mongrel_rails cluster::start \
                                      -C /etc/mongrel_cluster/myapp.yml \
                                      --clean --only 8000"
stop program = "/usr/bin/env PATH=/bin:/usr/local/bin:$PATH \
                   ruby mongrel_rails cluster::stop \
                                      -C /etc/mongrel_cluster/myapp.yml \
                                      --only 8000"

But what scuppered me, and will hopefully help you, is that whilst trying the various options suggested by Google, I wasn’t reloading the monit config. Doh!

So if you’re trying to debug your failing mongrels under monit here are a couple of tips…

Start a new shell, and unset your PATH to mimic how monit behaves:

unset PATH

Now you can test your monit stop/start lines (you’ll need to sudo as monit normally runs as root)…

/usr/bin/sudo /usr/bin/env PATH=/bin:/usr/local/bin:$PATH \
    mongrel_rails cluster::stop \
      -C /etc/mongrel_cluster/myapp.yml \
      --only 8000

/usr/bin/sudo /usr/bin/env PATH=/bin:/usr/local/bin:$PATH \
    mongrel_rails cluster::start \
      -C /etc/mongrel_cluster/myapp.yml \
      --clean \
      --only 8000

If that works for you, then you can be pretty sure that it’ll work for monit, as long as you remember to reload your monit config after making the changes.
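Finally, the reason the /usr/bin/env PATH=… prefix works can be sanity-checked in isolation. A minimal sketch, assuming the standard /bin and /usr/bin locations:

```shell
# env -i strips the environment (mimicking monit's bare execution context);
# the inner /usr/bin/env then supplies an explicit PATH so `ls` resolves
env -i /usr/bin/env PATH=/bin:/usr/bin ls /tmp > /dev/null && echo "resolved ok"
```

If the prefixed command resolves under an empty environment here, the same prefix in your monit start/stop lines should resolve too.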

Avoiding HTTPS mixed content warnings

Tuesday September 07, 2010 @ 11:19 PM (BST)

Skull and cross bones icon

Working on an HTTPS site today I was getting mixed content warnings from Chrome due to a few external resources still coming over HTTP.

Google Analytics already has the HTTP/HTTPS switcher built in, but jQuery was being pulled in with an explicit http call to:

 <script src="http://code.jquery.com/jquery-1.4.2.min.js" type="text/javascript"></script>

I could serve the code directly myself, but it seemed neater to use the Google CDN. It was a relatively quick fix thanks to stackoverflow. I used the simplified version:

  <script src="//ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js" type="text/javascript"></script>

which uses a // protocol-relative URL to download the external jQuery file. Because it’s served over HTTPS, you won’t benefit from the browser’s cached version, but it is one less file for you to serve.

Don’t forget your favicon

So now I was at the point of being certain that every resource on the page was being served over HTTPS, yet I was still getting the dreaded mixed content warning. Then I realised I hadn’t explicitly put a favicon link in the html. A quick check in the logs seemed to confirm this: the implicit favicon.ico request was being made by Chrome, but over HTTP.

Adding the icon link seemed to do the trick.

  <link href='/images/favicon.ico' rel='shortcut icon' />

Proxying Google maps

One final problem was that, as part of the registration process, I was showing a Google Map iframe.

I didn’t want this page served over HTTP just to avoid the mixed content warning, especially as the map page contains personal details.

As I’m using nginx to serve the site, and it’s relatively easy to proxy content served from local application servers like mongrel and unicorn, I wondered if we could do something similar with the requests to http://maps.google.com.

The nginx config is really easy:

location /maps {
  proxy_pass        http://maps.google.com;
  proxy_set_header  X-Real-IP  $remote_addr;
}

Just make sure this takes precedence over any catch-all location rule that matches /, so the /maps requests hit the proxy.
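For context, a sketch of how that might sit inside the full server block; the server name and upstream port are placeholders, not the site's real config:

```nginx
server {
  listen      443 ssl;
  server_name example.com;              # placeholder

  # proxied map requests; the more specific prefix, so it matches
  # ahead of the catch-all below
  location /maps {
    proxy_pass        http://maps.google.com;
    proxy_set_header  X-Real-IP  $remote_addr;
  }

  # everything else falls through to the app servers
  location / {
    proxy_pass http://127.0.0.1:8000;   # e.g. a mongrel/unicorn instance
  }
}
```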

And then in my code, I just invoke the iframe without the http://maps.google.com host when I know I’m running in production, and hence running the proxy:

- host = (RAILS_ENV != "production" ? "http://maps.google.co.uk" : "") 

%iframe#map{ :scrolling => "no", 
             :marginheight => "0", 
             :marginwidth => "0", 
             :src => "#{host}/maps?hl=en&amp;gl=GB&amp;q=#{@address}&amp;mrt=loc&amp;t=m&amp;z=15&amp;iwloc=near;&amp;output=embed", 
             :frameborder => "0", 
             :height => "250", 
             :width => "250" }  

and that, currently, is doing the job for me.

Max was writing some rake tasks today, and it reminded me to finish off this post, which has sat unfinished for months.

Bob’s been porting Charanga’s music-teaching desktop software from PC to Mac. The port is based on the work we’ve done over the past couple of years for our online products, and means that we can now have an online offering as well as PC and Mac desktop products, all built from the same codebase.

We’ve initially released three products with a further eight to follow.

Each of these 11 products will come as either a hybrid DVD or CDROM, with both the PC and Mac versions on it, but each only visible to the relevant platform. Lots of Mac CD-burning products out there make it easy to burn these kinds of ROMs, but the big problem is that they all have to be driven manually. And with 11 different products, that’s 11 different manual processes, where a single mistake could ruin the master that we’re sending off to the publisher.

It occurred to me that we deploy our web apps with a single invocation:

	cap deploy STAGE=production

So why not do the same with the burning of the CDROMs? Entirely automate the process so there’s no room for manual error…


Rake – Ruby Make – operates on a rakefile which defines lists of tasks, with optional prerequisite tasks that must first be completed. Given that building and burning the ROMs consists of a bunch of identical steps, differentiated only by the files that need to go on the relevant product’s CDROM or DVD, it sounds like an ideal tool, so let’s go ahead and build a skeleton rakefile…

There are a bunch of files that are common to all the products, plus product-specific files. These get pulled out of subversion (yes, yes, we’re only just migrating to git) and copied into the product file structure; the PC content gets added, the hybrid ISO image gets created, and then we use this to physically burn the ROM.

If we make each preceding step a prerequisite of the parent task, we can break the steps down into nice self-contained pieces and have a single task invoke all the others below it.

Ultimately, I wanted to be able to stick a DVD or CDROM into the drive and then call:

	rake burn_electric_guitar_coach_dvd

And have a finished hybrid DVD pop out, fresh off the press.

Break it down

The tasks in the rakefile are roughly as follows:

# Define various constants

# The source repository

# Where we'll do all this stuff
BUILD_DIR = File.expand_path "/Users/jay/Work/music coach"

# The copy of the remote repository we'll use locally
CACHED_COPY = "#{BUILD_DIR}/svn-cached-copy"

# shared by all products

PRODUCTS = {
  # specific properties for each product
  :guitar_deluxe => {
    # Mac content
    :volume_name => "Play Electric Guitar",
    :app_folder_name => "Play Electric Guitar v3.0",
    :logo => "guitar deluxe.jpg",
    :modules => ["First Lessons For Guitar", "Guitar Improver", "Guitar Songs And Styles", "Solo Guitar Performance Pieces", "Master Rock Power Chords", "Chord miner"],
    # PC content
    :pc_iso => "guitar_deluxe.cdr",
    :pc_iso_volume_name => "GuitarDeluxe",
    # details of which files to hide from a PC on a Mac and vice versa
    :hide_hfs => "{Common,Player,program files,Redist,System32,*.exe,*.inf,*.msi,*.ini}",
    :hide_joliet => "{.background,.DS_Store,.Trashes,.com.apple.timemachine.supported,.fseventsd,Play Piano v3.0,Applications}"
  },
  :electric_guitar_deluxe => {
    # ...
  },
  :piano_deluxe => {
    # ...
  },
  :play_piano => {
    # ...
  }
  # and so on for 7 other products
}

# A couple of helpers...

# Input helper - gets input from the user
def ask message
  puts message
  STDIN.gets.chomp
end

# Symbol helper - converts a string to a symbol
# "Blah blah foo".symbolize # => :blah_blah_foo
class String
  def symbolize
    self.downcase.split(" ").join("_").to_sym
  end
end

# Now the tasks themselves

# The default task (runs when rake is called without arguments)
task :default => :create_repository

# The create_repository task - builds a local copy of the repository for us to work from 
desc "Create a cached copy folder where the repository will reside which we can then svn export the installer files from"
task :create_repository do 
  # the production files
  unless File.exists?("#{CACHED_COPY}")
    puts "Creating initial cached copy of the repository"
    svn_user ||= ask("Enter your svn username: ")
    svn_password ||= ask("Enter your svn password: ")
    sh "svn checkout --username #{svn_user} --password #{svn_password} '#{REPOSITORY_URL}' '#{CACHED_COPY}'"

desc "Update the cached copy of the respository to get latest versions of files"
task  :update_repository => [:create_repository] do

  puts "Updating #{topics_product} production files"
  sh "cd '#{CACHED_COPY}'; #{SVN_PATH}/svn update"
# Be DRY about the task creation and use some string-to-symbol magic to dynamically create the tasks
# This makes three tasks per product (11 products = 33 tasks :)
# 1. build_#{topics_product}_dmg (with a prerequisite on update_repository)
# 2. build_#{topics_product}_dvd (with a prerequisite on 1.)
# 3. burn_#{topics_product}_dvd (with a prerequisite on 2.)

PRODUCTS.each do |topics_product, data|
  desc "Build #{topics_product} for Mac .dmg"
  task "build_#{topics_product}_dmg".symbolize => [:update_repository] do

    # We need to clean up .dmg and any old build folders
    # and make sure no other dmg of the same name is mounted
    sh "sudo umount -f '/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}'" if File.exists?("/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}")
    sh "sudo rm -rf '/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}'" if File.exists?("/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}")
    sh "rm '/tmp/#{PRODUCTS[topics_product][:dmg]}'" if File.exists?("/tmp/#{PRODUCTS[topics_product][:dmg]}")
    sh "rm '/tmp/#{PRODUCTS[topics_product][:app_folder_name]}.dmg'" if File.exists?("/tmp/#{PRODUCTS[topics_product][:app_folder_name]}.dmg")    
    # Take the read-only master .dmg that has the backgrounds, .DS_Store and folder stubs
    # and make a copy of it to /tmp, then resize the copy so we can add our content, then mount it
    sh "hdiutil convert '#{CACHED_COPY}/development/mac installer/#{PRODUCTS[topics_product][:dmg]}' -format UDRW -o '/tmp/#{PRODUCTS[topics_product][:dmg]}'"
    sh "hdiutil resize -size 4g '/tmp/#{PRODUCTS[topics_product][:dmg]}'; hdiutil attach '/tmp/#{PRODUCTS[topics_product][:dmg]}'; sleep 5"
    # The new, writable dmg is now mounted at '/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}'
    # and it's where we'll assemble the rest of the dmg
    tmp_product_dir = "/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}/#{PRODUCTS[topics_product][:app_folder_name]}"
    # Sort the permissions out (99 is the magic OS X user and group that appears to be owned by the current user when viewed; i.e. my uid is 6, but when I ls -l a file owned 99:99 it appears as 6:6)
    sh "sudo chown -R 99:99 '#{tmp_product_dir}'"
    sh "sudo chmod -R 777 '#{tmp_product_dir}'"
    # Export the product modules to make this specific .dmg
    PRODUCTS[topics_product][:modules].each do |product_module|
      sh "cd '#{CACHED_COPY}'; #{SVN_PATH}/svn export --force '#{PRODUCTION_FOLDER}/#{product_module}' '#{tmp_product_dir}/modules/'"
    end
    # export the help
    sh "cd '#{CACHED_COPY}'; #{SVN_PATH}/svn export --force '#{PRODUCTION_FOLDER}/help' '#{tmp_product_dir}/Production File System/help'"
    # Right, we're done making the .dmg, so unmount it. The file will remain in /tmp
    sh "hdiutil detach -force '/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}'"
  end

  desc "Create the hybrid PC & Mac .iso file for #{topics_product}"
  task "build_#{topics_product}_dvd".symbolize => "build_#{topics_product}".symbolize do
    # Create a Hybrid DVD image containing the contents of the PC .iso and the contents of the Mac .dmg
    # get rid of any old tmp files
    sh "sudo rm -rf '/tmp/#{PRODUCTS[topics_product][:pc_iso_volume_name]}'"

    # mount the PC .iso
    sh "hdiutil attach '#{PRODUCTS[topics_product][:pc_iso]}'; sleep 5"

    # mount the Mac .dmg - this is where we'll copy the PC content to
    sh "hdiutil attach '/tmp/#{PRODUCTS[topics_product][:dmg]}'; sleep 5"

    tmp_product_dir = "/Volumes/#{PRODUCTS[topics_product][:app_folder_name]}"
    # Copy the contents of the PC .iso to the mounted Mac .dmg (which is why we resized it to 4Gb earlier)
    # hdiutil needs to operate on a mounted volume to successfully create a hybrid iso
    sh "ditto '/Volumes/#{PRODUCTS[topics_product][:pc_iso_volume_name]}' '#{tmp_product_dir}'"    

    # - unmount the PC .iso
    sh "sudo umount -f /Volumes/'#{PRODUCTS[topics_product][:pc_iso_volume_name]}'"

    # Remove any previous hybrid .iso prior to making this one
    sh "rm -f '/tmp/hybrid.iso'"
    # Make the hybrid iso
    # exclude PC files from the Mac, and exclude Mac files from the PC
    sh "hdiutil makehybrid -o /tmp/hybrid.iso '#{tmp_product_dir}' \
                           -hfs -iso -joliet \
                           -hide-hfs    '#{tmp_product_dir}/#{PRODUCTS[topics_product][:hide_hfs]}' \
                           -hide-joliet '#{tmp_product_dir}/#{PRODUCTS[topics_product][:hide_joliet]}' \
                           -hide-iso    '#{tmp_product_dir}/#{PRODUCTS[topics_product][:hide_joliet]}'"
  desc "Burn the .iso file for #{topics_product} to DVD"
  task "burn_#{topics_product}_dvd".symbolize => "build_#{topics_product}_dvd".symbolize do
    # Burn the DVD                          
    # That really long device is the external burner's firmware address (we're using an external writer)
    # I can never remember how to get this. Use 
    #   hdiutil burn -list | grep IOService
    # to list your device
    sh "hdiutil burn -device IOService:/AppleACPIPlatformExpert/PCI0@0/AppleACPIPCI/PCIB@1E/IOPCI2PCIBridge/FRWR@3/AppleFWOHCI/IOFireWireController/IOFireWireDevice@d04b990c04c4d1/IOFireWireUnit/IOFireWireSBP2Target/IOFireWireSBP2LUN/com_apple_driver_Oxford_Semi_FW934_DSA/IOSCSIPeripheralDeviceNub/IOSCSIPeripheralDeviceType05/IODVDServices /tmp/hybrid.iso"    


The end result of this is that we end up with 11 burn_<a product name>_dvd tasks, each of which invokes in turn:

  1. update_repository
  2. build_<a product name>_dmg
  3. build_<a product name>_dvd

and leaves you with a burnt, hybrid DVD.

The critical part is where the Mac .dmg, mounted writeable under /Volumes, has the PC content written to it. It seems that hdiutil only likes mounted images when creating hybrid images. I experimented with various other options (using directories in /tmp, etc.) but for the hybrid image to be built correctly, this seems to be your only choice. The -hide switches list, via globs, which files to hide from each filesystem.

Our staging server needed to be able to run multiple unicorns, each responsible for a different rails app, e.g. QA and staging.

I wanted a simple /etc/init.d script that will start/stop/reload all my unicorns or just a specific one:

# starts all unicorns listed in /etc/unicorn/*.conf
/etc/init.d/unicorn start 

# stops the QA unicorn
/etc/init.d/unicorn stop /etc/unicorn/qa.conf 

The /etc/unicorn files are just simple variable setters. Here’s a sample /etc/unicorn/staging.conf
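(The gist no longer embeds here, so treat the following as a sketch only; the variable names are hypothetical stand-ins for whatever the init script actually reads.)

```shell
# /etc/unicorn/staging.conf - plain shell variable setters sourced by the
# init script; the names and paths below are hypothetical examples
APP_ROOT=/var/www/staging/current
RAILS_ENV=staging
UNICORN_OPTS="-D -E $RAILS_ENV -c $APP_ROOT/config/unicorn.rb"
```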


Here’s the script to save as /etc/init.d/unicorn (in case the gist doesn’t embed below) – don’t forget to run sudo /usr/sbin/update-rc.d -f unicorn defaults to link it up to your rc.d scripts for running at boot time.


Realtime Election Tweets

Tuesday May 04, 2010 @ 03:58 AM (BST)

Real-time UK General Election Tweets with node.js & websockets from jaygooby on Vimeo.

Inspired by Makato’s X-Factor real-time Twitter experiments and with just a couple of days to go until the country goes to the polls, I wondered if we could glean an outcome from the Twittersphere, especially as we get nearer to the actual count and results.


My real-time election site monitors the Twitter streaming API for mentions of the three main British political parties, plus the Greens (hey, I live in Brighton Pavilion and they look like they might get their first seat), and their leaders, and tallies a total score and current ratio of tweets.

I’ve since updated it: it now monitors all the parties, plus various independents and niche (read: 0 current seats in parliament) parties, and also tallies phrases like “I voted for”, “I’m voting”, etc. to count actual votes as well as mentions.

Using node.js and client-side HTML5 websockets (with a fallback websocket implementation in Flash for older browsers), the site is entirely implemented in Javascript. The server-side JS listens for Tweets and emits the scores as json via the client’s websocket. Some client-side JS handles the screen updates.

Server set-up

The static portions of the site are served by nginx (which is itself an asynchronous, evented server like node.js). Initially I’d tried to serve the whole thing via node.js using the paperboy.js static file serving module, but I’d need both paperboy and the node.ws.js websocket server to share port 80, which would mean some re-engineering, and given that the election is in two days, I wanted to get something up quickly!

So my architecture is static files served out of docroot handled by nginx, and the websocket running on a high port for use by the browser clients.

This high-port usage is itself a problem, as I’d imagine that many corporate firewalls block all bar 80 and maybe 8080, which is why the combined server running entirely out of node would be a good eventual goal.

To try and solve this, I thought I’d try and proxy the high port via nginx. You can turn the proxy buffer off in nginx, making it ideal for this, but I hit another hurdle with the Flash implementation of the websocket that will be used by most browsers (only Chrome is currently able to use native HTML5 websockets).

Flash and the crossdomain policy file

Flash requests its cross-domain policy file in an HTTP call to us (aka our Socket Policy Server) thus:

  <policy-file-request/>\0

Note there’s no GET verb, just the XML invocation and a NULL byte. Nice. Quite rightly, nginx chokes on it and issues an HTTP 400.

There’s an nginx hack I could have used to get the socket policy served and perhaps correctly tunnel the websocket stream itself, but I just wanted to launch, so I’ve left this as a to-do for now.

Versioning your Amazon S3 buckets

Wednesday February 10, 2010 @ 01:36 PM (GMT)

I’m sold, hook, line and sinker, on the AWS platform. I’m especially impressed by the product innovation and ever-reducing prices.

A few days ago Amazon announced versioning for S3. This means that with the versioning flag for a bucket switched on, you can retrieve earlier versions of your files. Sweet.

Now, because I’m lazy, I tend to use S3Fox or Cyberduck for setting ACLs, creating European buckets and so on.

Neither of these has been updated yet to support the versioning flag, and the AWS Console doesn’t have an S3 interface, so I thought I’d get my hands dirty and find out how to do it with the REST interface.

You issue a PUT to your bucket with the versioning querystring and the relevant XML:

<VersioningConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Status>Enabled</Status>
</VersioningConfiguration>

You can do it all with cURL, but the biggest pain is generating the correct Authorization: header to sign your API call.

Enter Tim Kay’s aws command-line tool – it’s a Swiss Army knife for S3 and EC2 calls. Enabling versioning was as simple as:

aws put my.bucket.com?versioning versioning.xml

Where versioning.xml contains the VersioningConfiguration xml snippet listed above.

To check the versioning status of a bucket, you do:

aws get my.bucket.com?versioning

Before you run the aws command-line tool, you’ll need to create a ~/.awssecret file with your AWS key and secret key. Don’t forget to chmod it to 600.
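Setting that file up might look like the following sketch (the keys are AWS’s documented example credentials, i.e. placeholders for your own):

```shell
# ~/.awssecret: access key ID on the first line, secret key on the second
printf '%s\n%s\n' 'AKIAIOSFODNN7EXAMPLE' 'wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY' > ~/.awssecret
# keep it readable only by you
chmod 600 ~/.awssecret
```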

Charanga are participating in the 2010 Sussex Internship programme. We’re looking for a usability and user-experience intern to help us make our music e-learning system as easy to use as possible.

The March 2010 internship positions should go live on Monday 7th December. Currently the placements page is only showing 40 positions, but the additional 60 places will be there from Monday. Then you’ll be able to search for our role.

Our instrumental, vocal and curriculum music e-learning system is used by 55 local authorities and thousands of teachers and students. You’ll be undertaking user testing, making recommendations and working with our developers to implement some of these, gaining invaluable experience in the process.

Thing 1 goes to the Brighton & Hove Montessori School, courtesy of the government’s Early Years Funding Scheme, which we top up so she can attend for four mornings a week and one full day.

We love the self-directed, independent, mixed age-group learning environment it provides, but once she’s five, we’ll have to dig deep to fund her education.

A group of Brighton & Hove Montessori parents are campaigning for a state-funded Montessori primary school to be opened in the city. This builds on the precedent of five other state-funded primaries that have opened in the UK.

At the moment, I’m only peripherally involved, but I did make this poster. The last time I did any print work was with a prehistoric version of QuarkXPress, so this was an interesting challenge.

If you’ve got a child aged 3-11 and you’re concerned about or dissatisfied with the increasingly restricted range of schooling choices for your child in Brighton & Hove, then we’d welcome your support.

Copyright © 2011 Jay Caines-Gooby. All rights reserved.
Powered by Thoth.