Melbourne Datathon 2017, 13th April - 5th May
Data Science Melbourne presents
Datathon 2017
13th April - 5th May 2017
Sign Up
Get Involved
Come and Participate
Novice to Experts
$5,000 Cash Prizes

It's a wrap - see you in August 2018

The 2017 Melbourne Datathon is now over. We have a video below and some photos here.

The next Melbourne Datathon will be in August / September 2018. More information will be posted shortly.


The 2017 Melbourne Datathon is underway

For the latest news, please keep an eye on the posts below…



  • To learn from each other and cross pollinate skill sets
  • To provide a stage for potential employers and employees to meet
  • To create a buzz in Melbourne around Data Science and reverse the brain drain
  • To solve a real world problem that could impact the lives of all Australians
  • To have fun!

How it works

  • Read this web page and sign up at the bottom
  • Attend one or more of the 3 events in order to sign the NDA (Non-Disclosure Agreement), get the data, form teams and start your analysis:
    Thu 13th April – kick off
    Sat 15th April – hackday 1
    Sun 23rd April – hackday 2
  • Continue working with your team and submit a slide deck before 3pm, Wed 3rd May
  • Five top teams will be pre-selected to pitch their findings on the night of Fri 5th May
  • The Kaggle competition will continue for a further 4 weeks, with the winners revealed at our conference on the 2nd June
  • The datathon is free to participate in
  • Catering will be provided on the hackdays
  • You are free to use your own tools to perform the analysis


Prizes will be awarded in the subsequent award ceremony

  • Insights: (5th May)
    • 1st.  $1,500
    • 2nd. $1,000
    • 3rd.  $500
  • Predictions (2nd June): Kaggle Competition
    • 1st.  $1,000
    • 2nd. $600
    • 3rd.  $400
  • The top 5 teams in the Kaggle competition as of 12 noon on 28th May will also be offered 2 free tickets to the conference on the 2nd June
  • Internships will also be up for grabs to entrants of the Datathon (ANZ, iSelect, KPMG, EstimateOne, Billcap, Transurban, Tiberius Data Mining, Northraine and others). You must submit a presentation to be eligible to apply. The companies will also take your performance in the Kaggle part into consideration when they review your applications
  • A potential speaking spot at the WombatMeDaScIn conference during Melbourne Data Science Week

If you are a company and would like the opportunity to take on participants of the datathon as interns, or help sponsor the prizes, then please fill out the form here.


  • What is a Datathon?
    You work for an analytics consultancy that is pitching to a client for a major piece of work. The client collects data as a by-product of its operations and wants to see if any business value can be extracted from it. You have been given 3 weeks to demonstrate the potential usefulness of the data and put together findings to present to the client.
  • What is the data?
    Trust us that it is the best data set we could have hoped for. It is previously unseen and successful analytics could have a positive impact on the lives of all Australians. We’re keeping the exact content under wraps, so you’ll have to turn up to find out.
  • Do I need to be a data science rock star to enter?
    No, this is all about learning and knowledge transfer. Even if you’ve never done anything like this before, please come along. We offer tutorials and mentors on hack day to get you started.
  • What do I need to bring?
    You will need your laptop with your favourite tools installed. Bring lots of curiosity and energy. Don’t forget your power cord!
  • What software can I use?
    You can use whatever you like. We recommend you have a database set up to load the data into.
  • Do I need to already have a team?
    No, we are expecting most people will form teams on the hack day. The organisers will be around to facilitate this with a special event. Don’t worry if you don’t know anyone; lots of people won’t.
  • Can I enter as an individual?
    Yes, but the judging panel will favour teamwork for the pitching part. Each participant can only be part of one submission; you cannot be both on a team and an individual. The Kaggle part is considered separate and you do not have to be in a team – to increase your chance of getting an internship you should enter the Kaggle part as an individual.
  • Why do I need to sign a Non-Disclosure Agreement?
    This is real data from a real ‘client’. It is a condition of them releasing it to you that an NDA is first signed. It basically means you will not use the data for any other purpose, and that you will delete it at the end of the contest.
  • What if I need help?
    There will be a handful of very experienced ‘mentors’ floating around the room on hack day. The purpose of them being there is to give ‘training’ on tools and techniques to munge the data – please use them! We will also host a selection of tutorials.
  • What will be revealed about the data?
    Not much – it is your job to figure things out. On hack day 2, the data owners will be there to give a short presentation and answer any questions you have.
  • How ‘big’ is the data?
    In total it will be about 5 GB unzipped, with about 50 million rows of data. It is split into several files of bite-sized chunks and each file can be worked on individually, so you will not need to load in everything to start analysis.
  • Can we use additional data?
    Totally – but it has to be publicly available.
  • Are there set tasks?
    No, we provide very little initial guidance. As a true ‘data explorer’, you will have to come up with your own questions for the data. We want the datathon to be just like a real data science consulting task. Ask yourself what the data provider might want to learn, and how you might go about presenting that.
  • What, no guidance?
    Well, maybe this year. As the data is so awesome and vast, we will give some suggestions as to the types of problems that need to be solved. Also, don’t assume that we know anything about the data already; things like data quality and sanity checking should be addressed.
  • How will it be judged?
    The main focus of our panel will be on the team’s ability to translate their findings into meaningful, easily understandable, actionable and valuable insights. They have a hypothetical budget to allocate and you need to convince them it’s worth spending it on your analytics.
  • Is this like a Kaggle competition?
    There is a predictive component with separate prizes that will be run on Kaggle. This will run for an extra 4 weeks, with the winners being awarded the prizes at the Data Science Melbourne conference on June 2nd. You can enter as an individual or in a team. One member of the team must be at the presentation evening to be eligible for the prize.
  • How do we communicate and stay up to date?
    To ask questions, use the forum here or use the Kaggle forum if it’s about the prediction competition. Once you sign up you will be getting regular email updates closer to the event.
  • What are the rules?
    Each participant can only be part of one team in the pitching competition, and one team in the Kaggle competition. At least one team member should be present on the pitch night to be eligible for a prize. You can be in different teams for the pitching contest and the Kaggle competition, but we strongly encourage you to put in an entry for both the pitching contest and the Kaggle competition.
    You cannot pass on the data to anyone else – all participants must have signed the NDA and collected the data in person from one of the 3 events.
  • How do I apply for an internship?
    Instructions are on the read_me.pdf that is included with the data.
  • How do we submit our entries to the insights competition?
    Instructions are on the read_me.pdf that is included with the data.
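
The FAQ above recommends having a database set up to load the data into, and notes that each file can be worked on individually. A minimal sketch of that workflow might look like the following. It rests on assumptions that this page does not confirm: that the files are comma-separated text with a header row, and that SQLite is an acceptable database. The function names `batched_rows` and `load_csv_into_sqlite` are hypothetical helpers for illustration, not part of any code supplied with the data.

```python
import csv
import sqlite3
from pathlib import Path
from typing import Iterable, Iterator, List


def batched_rows(reader: Iterable[List[str]], batch_size: int) -> Iterator[List[List[str]]]:
    """Yield lists of at most batch_size rows so a file is never fully in memory."""
    batch: List[List[str]] = []
    for row in reader:
        batch.append(row)
        if len(batch) >= batch_size:
            yield batch
            batch = []
    if batch:
        yield batch


def load_csv_into_sqlite(csv_path: Path, conn: sqlite3.Connection,
                         table: str, batch_size: int = 50_000) -> int:
    """Load one delimited file into a SQLite table and return the rows inserted."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # assumption: the first line holds column names
        columns = ", ".join(f'"{name}"' for name in header)
        placeholders = ", ".join("?" for _ in header)
        # Columns are left untyped; values are stored as text until cast in queries.
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
        inserted = 0
        for batch in batched_rows(reader, batch_size):
            conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', batch)
            inserted += len(batch)
        conn.commit()
    return inserted
```

Loading one file at a time this way keeps memory use bounded by the batch size, so analysis can start on a single file before the rest are loaded. The actual file format and suggested tooling are described in the material included with the data, which takes precedence over this sketch.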


Day 1
13 Apr 2017

Evening Launch

Come along after work to sign the non-disclosure agreement, get the data and hear a short presentation about proceedings. Attending the launch event is not mandatory, but will give you an early start. For those who are away over the Easter weekend it will be a chance to get the data. Don’t forget to bring your laptop if you want to get the data!
Day 2
15 Apr 2017

First Saturday – Hack Day I

Wondering what to do on the long Easter weekend? On Saturday, we will provide everything you need to work on your data investigation: food, drinks, a co-working space, wifi – and, of course, the dataset. If you are looking to join a team, this is a great opportunity to ask around and/or attend our special team formation event. We will host a couple of (optional) ‘master classes’ to demonstrate tools, techniques and skills to get you going.
Day 3
23 Apr 2017

One Week In – Hack Day II

On the 2nd Sunday you can reconvene with your team, with some experts on hand to help you out.
Day 4
05 May 2017

Pitch Time

On the final night, we will decide which team takes home the honour of Melbourne Datathon champions! Five pre-selected teams will give their pitches before our professional panel. This session is also part of Melbourne Knowledge Week, and anyone can attend, whether you are a participant or not. After the pitches, join us across the road at Platform 28 for a post-datathon drink.


Click on the links below to see the venue locations

Zendesk Basement, 395 Collins St (Queen & Collins)

Gurrowa Innovation Lab, Telstra, Level 2, 242 Exhibition Street, Melbourne. Note: the entrance is on Lonsdale St.




SAB, RMIT Building 80, 445 Swanston St.

nab Arena – 700 Bourke St

Platform28 – 82 Village St. Docklands


The Panel

This is our board of directors who you need to sell your story to!

Catherine Lopes

Lead Data Scientist at ANZ

Gregory Hill

Global Head of Analytics at Brightstar

Mike Da Gama

Director at NostraData Pty Ltd

Sarah Pizzey

General Manager at Sigma Pharmaceuticals


There will be a few experienced people floating around and available to help you out with technical things. Please use them; it’s a good opportunity to get a one-on-one tutorial.

If anyone else wants to help, just turn up on the hack day.

Hackdays Detailed Schedule

Saturday 15th April - Telstra, 242 Exhibition Street
Welcome to the 2017 Melbourne Datathon hackday number 1! If you have not yet signed an NDA, please do so upon signing in. If you are looking for a team, grab a name sticker and follow the instructions. After signing in, make your way to the data station to load up the dataset.
10:00-10:30 Forming Teams
Attend this event if you are looking for a team. We will have muffins and instructions waiting for you.
11:15-11:45 Getting Started - Phil Brierley
In this presentation we will give a short demo of loading the data in a couple of tools (example code for this will also be included with the data).
12:30 Lunch - sitting 1
A pizza lunch will be served in the kitchen area. It will be busy, so grab a pizza and take it to share with your team. Gluten-free pizzas will be available in this first batch only.
1:15 Lunch - sitting 2
More pizza will arrive.
Optional tutorials in the presentation area for those who want to join us.
2:00-2:30 An Initial Analysis - Shane Butler
We'll be challenging Shane, a data scientist at Telstra, to see what he has been able to come up with in the first morning.
3:30-4:00 Data Visualisation with Yellowfin - Edgar Kautzner
Edgar will show what he has discovered in the data using Yellowfin.
6:00pm End of our time at Gurrowa. I'm sure we can find a local pub to continue.


Sunday 23rd April - RMIT, 445 Swanston St
Welcome to the 2017 Melbourne Datathon hackday number 2!
If you have not yet signed the NDA and loaded the data, there is still an opportunity to participate by attending this event. For those who have already started, it is a chance to get back together with your team.
10:00-10:30 Forming Teams
For those who have not yet formed a team, there will be an opportunity to meet others at this session.
11:00-11:30 Data walkthrough, Q&A
So far we have told you little about the data.
In this presentation, our data sponsor will give a quick overview of the data and answer any questions you may have.
Snacks will be available - for anything more substantial please bring your own.
Optional tutorials in the presentation area for those who want to join us.
1:30-2:00 Data Visualisation - Paul Hodge
A quick introduction to storytelling, narrative development and design tips.
2:00-2:30 Delivering a Presentation - Mark Alexander
4:45pm End of our 2nd hackday. We're looking forward to seeing your findings.

Sign Up!

Registrations for the Melbourne Datathon 2017 are closed as of April 7. We are at absolute maximum capacity and we want to make sure we can put on a great event. If you missed out, join our Meetup to stay connected and keep an eye out for next year’s datathon!

The NDA that has to be agreed to can be found here.


Extreme Gradient Boosters





Logistic Regressors (Our Hosts)

Zendesk, nab



Deep Learners (Data Provider & USB Deliverer)



Bayesian Believers

We appreciate all those who continue to support Data Science Melbourne throughout the year.

La Trobe University, Yellowfin, Data Science Solutions, AGL, iSelect, Teradata, Rubix Consulting, SAS, Monash University, KPMG, Zendesk, Northraine, Sportsbet, Tiberius Data Mining, nab


All Help Appreciated

If you are a company and would like the opportunity to take on participants of the datathon as interns, or help sponsor the prizes, then please fill out the form here.



Joost van der Linden, Phil Brierley, Linden Jensen-Page, Kate van der Linden, Nick Makasis & Alexia Pogiatzi


The Melbourne Datathon is part of the Melbourne Data Science Initiative, MeDaScIn 2017 – 29th May – 2nd June

The datathon is also part of the Melbourne Knowledge Week, 1st – 7th May 2017, proudly presented by the City of Melbourne