
Introducing Registrasion!

Time for me to fill you all in on some work I’ve been doing in preparation for linux.conf.au 2017. I’ve been looking into how we can better run the conference website, and reduce the workload of our volunteers in the lead-up to the conference.

linux.conf.au has, for the last 10 conferences, used a home-grown conference management system called Zookeepr. I got administration experience in Zookeepr after being involved in running PyCon Australia for a couple of years, and continued to help manage registrations for the years following. While Zookeepr is a venerable piece of software, my 4 years of experience with it has helped me become familiar with a bunch of its shortcomings. Most of these shortcomings are in the area of registration handling.

A problem with conference management software is that the people who come to develop on it are often highly transient — they’re conference organisers. They show up, they make their modifications, and then they get as far away from developing it as possible. Zookeepr’s been affected by this, and it’s meant that difficulties with workarounds are usually overlooked when fixing things.

So I decided to look elsewhere.

Back in 2012, the Python Software Foundation funded a conference management suite called Symposion.

Symposion solves a bunch of problems that Zookeepr solves, and more importantly, it doesn’t suffer from the lack of continuous contributions that Zookeepr has: It’s an actively-maintained app, built on Django, and it has a community of developers supporting it in the form of the Pinax project. In the Python world, it’s used for a very large number of conferences, from PyCon US (a big conference, getting some 1000 talk submissions yearly), down to local regional conferences like PyOhio. It’s well known, and improvements to the system aren’t contingent on conference organisers maintaining interest in the system after they stop running conferences.

Unfortunately, for various reasons, Symposion doesn’t handle conference registration.

So after OSDC2015 in Hobart successfully ran their conference website with Symposion, I decided to plug the gap. In January this year, I jotted down all of the things that I thought were good about Zookeepr’s registration system, and thought about how I could go about objectively improving upon it.

I threw together a data model, and wrote a test suite, and liked what I saw. I asked the Python Software Foundation for a grant to let me do some concerted work on the project for a month or so, and they accepted.

The result is Registrasion (that’s Registration for Symposion (sorry)). I’ve just put out a 0.1 release, which I believe is suitable for running a conference if you’re happy to pull data out of the system with SQL queries, and take payments with bank transfers.

Registrasion was designed with a few key goals in mind, all of which came from observing how Zookeepr struggled around some frequent edge cases that, incidentally, come up late in the process of running a conference. Those late-occurring edge cases are invariably the ones that don’t get fixed, because volunteer conference staff all need to go and run their conference.

In particular, I focused on:

  • Eliminating manual work for registration managers (Zookeepr has a lot of that)
  • More flexibility in how we automatically offer certain items to end-users (selling flexible accommodation dates was a difficulty one conference year had)
  • Handling money properly, so that it’s possible to easily reconcile inventory and what’s in the invoicing system

Many of these goals solidified after talking to past conference organisers, who’d all used Zookeepr.

I’m quite proud of a few things in Registrasion. The first is that Registrasion makes it really easy for attendees to add extra things to their registration as they decide they need to. We also take care of automatically giving out freebies that attendees forgot to select during initial registration. In PyCon AU’s case, that’s a lot of manual work we can avert.

Another is a really flexible way of managing what parts of the inventory are available to our users, and at what time. We can show and hide items based on voucher codes, or based on whether they have other products selected. This averts a whole pile of manual work that a past linux.conf.au reported, and I’m glad that our year won’t have to.
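To make the show-and-hide idea concrete, here is a minimal sketch of that kind of rule. It is not Registrasion’s actual data model or API: the `Product` class, its `voucher_code` and `enabled_by` fields, and the `available_products` function are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Set


@dataclass(frozen=True)
class Product:
    """A hypothetical inventory item; not Registrasion's real model."""
    name: str
    # If set, the product is only shown to users holding this voucher code.
    voucher_code: Optional[str] = None
    # If non-empty, the product is only shown when the cart already
    # contains at least one of these product names.
    enabled_by: FrozenSet[str] = frozenset()


def available_products(products, cart: Set[str], vouchers: Set[str]):
    """Return the names of the products this user is allowed to see."""
    visible = []
    for product in products:
        if product.voucher_code is not None and product.voucher_code not in vouchers:
            continue  # hidden: the user lacks the required voucher
        if product.enabled_by and not (product.enabled_by & cart):
            continue  # hidden: none of the enabling products are selected
        visible.append(product.name)
    return visible


# Hypothetical catalogue, purely for illustration.
catalogue = [
    Product("Ticket"),
    Product("Speaker ticket", voucher_code="SPEAKER"),
    Product("Extra dinner ticket", enabled_by=frozenset({"Ticket"})),
]
```

With this catalogue, an empty cart and no vouchers shows only the plain ticket; adding the ticket to the cart reveals the extra dinner ticket, and holding the `SPEAKER` voucher reveals the speaker ticket.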

Finally, I’ve made sure that Registrasion has a lot of documentation. It was a key goal to make sure that new conference organisers can understand, at least roughly, how the system fits together. Python’s tools, and Read The Docs, have made this very easy.

There’s a pile more work to be done, but there’s also plenty of time before lca2017 opens its registration (in October, probably?). But so far, it’s been super-fun to dive back into Django development, given it’s something I haven’t played with in a few years, and to solve a problem that I’ve been dwelling on for a couple of years now.

Hopefully, in Registrasion, we’ve got a piece of software that can serve the community well, will find use outside of LCA, but will still serve LCA’s needs well for years to come.

If you’re interested, please come along and contribute! I’d love to have you on board!

Booting out the Warmest 100

(Beware – this article includes a link to some probable spoilers for tomorrow’s Hottest 100 count. You can read this article without reading those spoilers.)
 

You’re probably familiar with Triple J’s Hottest 100. It’s the world’s largest write-in music poll. Last year, Triple J made an easy, shareable link for people to post their votes out on Twitter and Facebook. Alas, these links were easily scraped from the web, and the Warmest 100 (link to 2012 count) was born. The top 10 (but not its order) was revealed, and the top three was guessed perfectly.

This year, voters weren’t given a shareable link, but a few thousand people took photos of their confirmation e-mails and posted them to Instagram. With a tiny bit of OCR work, the Warmest 100 guys posted their predictions for this year. They found about half the number of votes that they did last year through the scraping method, which is no mean feat, given the lack of indexing.

So the question is — how useful are these votes in predicting the Hottest 100? What songs can we be sure will be in the Hottest 100? How certain is that top 10?

Both years, Justin Warren independently replicated their countdown (spoilers), and has written up his methodology for collecting the votes this year. I asked him for his data to do some analysis, and happily, he obliged.

He’s since updated his method, and his counts, and written those up, too (spoilers).

Update: he’s updated his method *again* based on some feedback I offered, and has also written that up (spoilers). This is the data my final visualisation runs off.

So, what have I done with the data?

Bootstrap Analysis

When you have a sample — a small number of votes — from the entire count, you can’t really be certain where each song will actually appear in the count.

In this case, Justin’s data collected 17,000 votes from an estimated 1.5 million votes. That’s a sample of about 1% of the total estimated vote. It’s a sample, but we have no idea how that compares to the actual count.

If we think that the votes that we have are a representative sample of all of the votes, then what we’d like to know is what would happen if we scale this sample up to the entire count. Where will songs end up if there’s a slight inaccuracy in our sample?

The good news is that computers give us a way to figure that out!

Bootstrap analysis (due to Efron) is a statistical technique that looks at a sample of votes from the whole set of votes, and randomly generates a new set of votes, with about as many votes as the original sample. The trick is that you weight each song by the number of votes it received in the sample. This means that songs are picked in roughly the same proportion as they appear in the sample. The random sampling based on this weighted data adds noise.

You can think of this sample as a “noisy” version of the original sample. That is, it will be a version of the original sample, but with slight variation.

If you repeat this sampling process several thousand times, and rank the songs each time, you can get a feel for where each song could appear in the rankings.
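The resampling loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code I actually ran: the function name `bootstrap_ranks`, the song names, and the vote counts are all made up, and ties are broken by the order the songs happen to be listed in.

```python
import random
from collections import Counter


def bootstrap_ranks(sample_counts, trials=1000, seed=0):
    """Resample the votes `trials` times and record each song's rank per trial."""
    rng = random.Random(seed)
    songs = list(sample_counts)
    weights = [sample_counts[song] for song in songs]
    total_votes = sum(weights)
    ranks = {song: [] for song in songs}

    for _ in range(trials):
        # Draw as many votes as the original sample, with replacement,
        # picking each song in proportion to its votes in the sample.
        resampled = Counter(rng.choices(songs, weights=weights, k=total_votes))
        # Rank by resampled votes, most votes first (rank 1 is the top).
        # sorted() is stable, so ties keep the original listing order.
        ordered = sorted(songs, key=lambda song: resampled[song], reverse=True)
        for position, song in enumerate(ordered, start=1):
            ranks[song].append(position)
    return ranks


# Hypothetical vote counts, purely for illustration.
counts = {"Song A": 500, "Song B": 480, "Song C": 120, "Song D": 110}
ranks = bootstrap_ranks(counts, trials=200)
```

Each entry in `ranks` is then the list of positions that song reached across the noisy versions of the sample.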

How do you do that? Well, you can look at all of the rankings a given song gets for each randomised set. Sort this list, and pick the middle 98% of them. Based on that middle 98% of rankings, you can be 98% certain that the song will be at one of those positions. In statistics, this middle 98% is called the 98% confidence interval by bootstrap.

You can repeat this for different confidence levels, by picking a different number of rankings around the middle.
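Picking out the middle of the sorted rankings might look like the sketch below. The function name `rank_interval` and the example rankings are assumptions made for illustration; the index arithmetic is one simple way to trim an equal share from each end, and other percentile conventions would give slightly different endpoints.

```python
def rank_interval(rankings, confidence=0.98):
    """Return (best, median, worst) rank covering the middle `confidence` share."""
    ordered = sorted(rankings)
    n = len(ordered)
    # Share of rankings to discard from each end of the sorted list.
    tail = (1.0 - confidence) / 2.0
    # The small epsilon guards against floating-point error such as 0.9999999.
    low_index = int(tail * n + 1e-9)
    high_index = n - 1 - low_index
    return ordered[low_index], ordered[n // 2], ordered[high_index]


# Hypothetical rankings for one song across 100 bootstrap trials.
rankings = [3] * 10 + [4] * 70 + [5] * 15 + [9] * 5

wide = rank_interval(rankings, confidence=0.98)
narrow = rank_interval(rankings, confidence=0.70)
```

Lowering the confidence level trims more rankings from each end, so the interval can only stay the same or get narrower.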

I’ve used Google Spreadsheets to visualise these confidence intervals. The lightest blues are the 99% confidence intervals. The darkest blue intervals are the 70% confidence interval. The darkest single cell is the median — i.e. the middle of all of the rankings that we collected for that song in the bootstrap process.

The visualisation is up on Google Docs. (spoilers, etc).

I’ve run the same visualisation on Justin’s 2012 data. It’s less of a spoiler than the 2013 version, if you care about that, and it can inform the rest of the article for you.

Notes

First up, a bit on my methodology: Justin’s data didn’t separate votes into their original ballots. So, I had to pick songs individually. To improve accuracy, I selected songs in blocks of 10, where each song in the block of 10 is different — this vaguely resembles the actual voting process.

In my experiments, I ran the sampling and ranking process 10,000 times.
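Selecting songs in blocks of 10 distinct songs could be done along the following lines. This is a sketch of one possible reading of that step rather than my exact code: `draw_ballot` and `resample_in_ballots` are hypothetical names, duplicates within a block are simply redrawn, and the vote counts are invented.

```python
import random
from collections import Counter


def draw_ballot(rng, songs, weights, size=10):
    """Pick `size` distinct songs, each drawn in proportion to its sample votes."""
    if sum(1 for w in weights if w > 0) < size:
        raise ValueError("not enough songs with votes to fill a ballot")
    ballot = set()
    while len(ballot) < size:
        # Redraw until the block holds `size` different songs.
        ballot.add(rng.choices(songs, weights=weights, k=1)[0])
    return ballot


def resample_in_ballots(sample_counts, seed=0, ballot_size=10):
    """Build one noisy version of the sample out of simulated ballots."""
    rng = random.Random(seed)
    songs = list(sample_counts)
    weights = [sample_counts[song] for song in songs]
    # Roughly as many votes as the original sample, grouped into ballots.
    n_ballots = sum(weights) // ballot_size
    resampled = Counter()
    for _ in range(n_ballots):
        resampled.update(draw_ballot(rng, songs, weights, ballot_size))
    return resampled


# Hypothetical vote counts for 30 songs, purely for illustration.
counts = {f"Song {i}": 200 - 5 * i for i in range(30)}
resampled = resample_in_ballots(counts)
```

Because each simulated ballot holds a song at most once, no song can collect more votes than there are ballots, which mirrors the real voting process a little more closely than drawing every vote independently.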

You’ll notice some interesting trends in this visualisation. The first one is that the higher the song is in the countdown, the narrower its blue interval is. Why is this so?

Well, as songs get more popular, the distance between each song in the number of votes received grows. In Justin’s sample of the votes, #100 and #73 were separated by 15 votes. So if one or two votes changed between #73 and #100, that ordering could change spectacularly. Given Justin’s sample is of 17,000 votes, 15 votes represents roughly a 0.1% change in the vote.

So at those low rankings, a tiny change in votes can make for a massive difference in ranking.

At the other end of the count, #1 and #2 are separated by 16 votes. #3 and #4 are separated by 22 votes. #4 and #5 are separated by 51 votes. Down the bottom of the list, 16 votes could move a song 33 places in our count; at the top, you’d need 16 votes to change just to swap positions 1 and 2.

What this means from a statistical perspective is that the closer to the top you are, the more work you need to do to change your position in the count.

You’ll also see this phenomenon in the right-hand side of the intervals. For a given song, the band of a given colour to the right of the median will generally be longer than the band of the same colour to the left. Once again, this is because lower ranks swap around more easily than higher ranks.

Update: Since writing this article, I ran one more test: how many of the songs in the top 100 of Justin’s raw sample of votes will make it into the actual Hottest 100? Well, bootstrapping helps us here too. For each bootstrap trial, I take the top 100 songs, and see how many of those are in the raw top 100. I reckon, with 98% confidence, that we’ll get 91 songs in the actual Hottest 100. Thanks to David Quach for the suggestion.
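The overlap test can be sketched like this. It is an illustration under assumptions, not the exact code behind the figure above: `overlap_with_raw_top` and the vote counts are hypothetical, and reading "98% confidence" as "the overlap that about 98% of trials meet or exceed" is my interpretation for the sketch.

```python
import random
from collections import Counter


def top_n(counts, n):
    """Return the set of the n songs with the most votes."""
    return set(sorted(counts, key=counts.get, reverse=True)[:n])


def overlap_with_raw_top(sample_counts, n=100, trials=1000, seed=0):
    """For each bootstrap trial, count how many of its top n are in the raw top n."""
    rng = random.Random(seed)
    songs = list(sample_counts)
    weights = [sample_counts[song] for song in songs]
    total_votes = sum(weights)
    raw_top = top_n(sample_counts, n)
    overlaps = []
    for _ in range(trials):
        resampled = Counter(rng.choices(songs, weights=weights, k=total_votes))
        overlaps.append(len(top_n(resampled, n) & raw_top))
    return sorted(overlaps)


# Hypothetical vote counts for 25 songs, with a top 5 instead of a top 100.
counts = {f"Song {i}": 300 - 10 * i for i in range(25)}
overlaps = overlap_with_raw_top(counts, n=5, trials=200)

# The overlap that roughly 98% of trials meet or exceed.
lower_bound = overlaps[int(0.02 * len(overlaps))]
```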

In summary: the Warmest 100 approach is statistically a very good indicator of the top 4 songs. The top 4 is almost certainly correct (except that 1&2 and 3&4 might swap around between themselves). Everything up to #7 will probably be in the top 10.

The sampling approach is less accurate at the bottom, but I’m pretty confident everything in the top 70 will be in the actual top 100.

I’m also pretty confident that 91 of the songs in the raw top 100 will appear in the actual top 100.

End

I’ll be making some notes on how these confidence intervals got borne out in the actual count on Monday. I’m very interested to see how this analysis gives us a better idea of how accurate the Warmest 100 actually is.

Talk: Portable Logic/Native UI

My first talk from DroidCon India 2013 (November, Bangalore). It’s an exploration into the approach that we’ve taken at AsdeqDocs in producing a properly cross-platform mobile app. We take the approach of separating our core application logic into a C++ codebase, and apply platform-specific user interfaces over that codebase.

This talk covers the software engineering principles that make that work, as well as the benefits, difficulties, and insights that we’ve learned over a few years of doing this. It’s probably the favourite of my mobile dev talks.

Announcing the LCA2014 Open Programming Miniconf

Very pleased to say that I’ll, once again, be running an Open Programming Miniconf at Linux.conf.au in January. This time around, the conference will be at the University of Western Australia in Perth.

I’m especially pleased, because after initially being rejected by the conference team, with limited time to assemble a line-up, I’ve put together what I think is the best Programming miniconf lineup in the five years I’ve been running it.

One of the goals of the Open Programming Miniconf is to be a forum for developers to share their craft: ideas for improving the way people code, and topics that are of benefit to people who develop using many open source programming languages. This year, for the first time, I think we’ve filled that remit.

This year’s talks cover everything from low-level mobile programming and driver development, to deployment of web applications, as well as talks about packaging, deployment, and development tools.

We also don’t have a single state-of-the-language talk. Everything’s about topics that can be transferred to any number of languages.

I’m excited! If you’re interested in the miniconf, check out our schedule and all of our abstracts at the conference wiki. See you in Perth!

A friendly PyCon Australia 2013 Early Birds reminder

We’re down to just over 20 early bird registrations left of our original quota of 80. That means that we’ll probably run out of Early Bird tickets before our deadline of Friday.

The big announcement to every mailing list I can think of will happen tomorrow, so today’s a great chance to get in before the tickets suddenly disappear.

Early Bird Registrations start at $165 for individuals, with discount registration available for students at $44. All the details are at the PyCon Australia 2013 web site.

Announcing the LCA2013 Open Programming Miniconf!

TL;DR — submit a proposal at http://tinyurl.com/opm2013-cfp before the first round closes on Monday 29 October 2012.

***

I’m pleased to announce that The Open Programming Miniconf, a fixture for application developers attending Linux.conf.au since 2010, is returning as part of Linux.conf.au 2013, to be held in January at the Australian National University in Canberra. The Miniconf is an opportunity for presenters of all experience levels to share their experiences in application development using free and open source development tools.

The 2013 Open Programming Miniconf invites proposals for 25-minute presentations on topics relating to the development of excellent Free and Open Source Software applications. In particular, the Miniconf invites presentations that focus on sharing techniques, best practices and values which are applicable to developers of all Open Source programming languages.

In the past, topics have included:

  • Recent developments in Open Source programming languages (“State of the language”-type talks)
  • Tools that support application development
  • Coding applications with cool new libraries, languages, and frameworks
  • Demonstrating the use of novel programming techniques

If you want an idea of what sort of presentations we have included in the past, take a look at our past programmes:

To submit a proposal, visit http://tinyurl.com/opm2013-cfp and fill out the form as required. The CFP will remain open indefinitely, but the first round of acceptances will not be sent until Monday 29 October 2012.

OPM2013 is part of Linux.conf.au 2013, being held at the Australian National University, Canberra in January 2013. Further enquiries can be directed to Christopher Neugebauer via e-mail ( chris@neugebauer.id.au ) or via twitter ( @chrisjrn ).

Linux.conf.au 2012 Open Programming Miniconf — Call for proposals now open

TL;DR: submit a proposal at http://tinyurl.com/opm2012-proposal before the first round closes on Friday 7 October.

I’m pleased to announce that The Open Programming Miniconf, a fixture for application developers attending Linux.conf.au since 2010, is returning as part of Linux.conf.au 2012, to be held in January at the University of Ballarat. The miniconf has been an opportunity for presenters of all experience levels to share their experiences in application development using free and open source development tools.

The Open Programming Miniconf for 2012 invites 25-minute presentations on topics relating to the development of excellent Free and Open Source Software applications. In particular, the Miniconf invites presentations that focus on sharing techniques, best practices and values which are applicable to developers of all Open Source programming languages.

In the past, topics have included:

  • Recent developments in Open Source programming languages (“State of the language”-type talks)
  • Tools which support application development
  • Coding applications with cool new libraries, languages and frameworks
  • Demonstrating the use of novel programming techniques

Past programmes can be found at http://lca2011.linux.org.au/programme/schedule/monday and http://www.lca2010.org.nz/wiki/Miniconfs/Open_Programming_Languages

To submit a proposal, visit http://tinyurl.com/opm2012-proposal and fill out the form as required. The CFP will remain open indefinitely, but the first round of acceptances will not be sent until Friday 7 October 2011.

OPM2012 is part of Linux.conf.au 2012, being held at the University of Ballarat on Monday, 16 January 2012. Further enquiries can be directed to Christopher Neugebauer via e-mail ( chris+opm2012@neugebauer.id.au ) or via twitter (@chrisjrn).

30 Days of Geek: 14 – Favourite Computer Conference

Oh. You probably won’t be surprised to hear this one, but the answer is Linux.conf.au, the Australasian Free and Open Source Software Conference. I’ve been attending since Melbourne 2008, and have since “been” to Hobart in 2009 and travelled to Wellington to attend in 2010.

LCA is a great conference because it gives people in the broader FOSS-using community in Australia (people like me) the opportunity to meet the people who put together the software that we use on a day-to-day basis. It turns out that they’re an entirely friendly bunch of people, who are all too willing to share their experience: in 2008, Andrew Tridgell spent 20 minutes one-on-one with me explaining how a particularly awesome piece of code he’d written worked.

In 2010 I ran one of the short single-day conference streams (known as “miniconfs”), on the topic of Open Programming Languages. This was a fantastic opportunity to give back to the LCA community, and help bring more of the topics that I was interested in to LCA — we had a fantastic lineup of presenters, and the day went awesomely. I’m glad to have the opportunity to do this again: I’m running the Open Programming miniconf at LCA2011 in Brisbane, and along with my friend Peter Lyle, will be running the Research and Student Innovation Miniconf. Both of them are shaping up to be excellent miniconfs.

So yes, LCA is in Brisbane this January, and I thoroughly recommend you get along if you can!
