Public policy students explain how they held their own in a data-crunching contest against statisticians and computer scientists.
“Let the data take you places” was the statement chosen to inaugurate DataFest Germany, a competition held at the University of Mannheim in which a small team of Hertie students participated in April 2017. In the hope that more Hertie teams will take part in the future, team members Carolina Matamoros Ferro, Byunggyu Park and Álvaro López Guiresse outline some of the lessons that they took away from their data (ad)venture.
Lesson 1: Lo barato sale caro (what’s cheap turns out to be expensive)
Our data venture began well before the event, when our team tried to save a few euros by booking early train tickets (6 a.m. departure and 5 a.m. return). This turned out to be a very bad idea, especially for those who overslept and had to pay twice as much for a 10-hour bus ride to Mannheim.
Lesson 2: Be prepared
We arrived early at the University of Mannheim, only to find that nobody there actually knew about the DataFest; most of our questions were met with a blank expression. All was well by the evening, when the event began and the data were revealed with all the suspense of the unboxing of a new iPhone. Unfortunately for you, dear readers, we are not at liberty to talk about the sponsor or the contents of the dataset. What we can tell you, however, is that a dataset of over 10 million observations was thrown at us! This meant that, of the five laptops we had brought with us, only two were really of any use. We were surprised to see that one team had even brought their own desktop computer, which was all set up and ready by the time we arrived.
That was worrying, but we proceeded to our rooms to settle in. All the other participants seemed extremely skilled, dedicated and friendly, despite the competitive environment.
Lesson 3: Reputations carry a lot of weight
Over dinner we had the chance to speak to the other groups. We revealed our background as Public Policy students, and the reactions ranged from pleasant surprise to clear underestimation. Everyone else, it seemed, had a background in the “hard sciences”, with technical-sounding titles such as Statistician, Computer Scientist and Bioinformatician. The host, Dr. Frauke Kreuter, a Data Science lecturer from Mannheim, made special mention of our school. Although she mistakenly referred to us as the team from “Hertie Government School”, Dr. Kreuter said we “always make such good performances, and always [have] a lot to show”. So, it turns out we have a reputation to live up to.
Once the teams were left to “crunch” the data, our energy and focus were sustained by an incredibly large (and, thankfully, supplied free of charge) intake of sugar and caffeine. The University of Mannheim kindly provided us with free breakfasts, lunches and dinners. Perhaps it’s not just our reputation that’s now carrying a little extra weight…
Lesson 4: Bring very powerful computers
As we started, our computers crashed, over and over and over again. Nonetheless, our background as Public Policy students proved hugely advantageous for the outputs and insights that we could produce. We were using global data for the whole of 2015 on digital users’ online behaviour and purchasing activities. Our team proposed to focus on a policy issue that could affect consumer choices and would be of interest to both governments and the sponsor. Our chosen policy area was security, and we used the open datasets published by cities in the United States, specifically Chicago’s. The crime dataset was huge as well, with around 6 million observations taking up 1.5 GB of space.
Lesson 5: When offered it, always accept help
We were surprised by and grateful for the amount of assistance we received from the staff members. It was outstanding, and made fighting the dataset a lot more bearable. We learnt how to represent data geographically, and coded an algorithm to locate observations on a grid of latitudes and longitudes, given the awkward shapes of Chicago’s wards. We are especially grateful for the help we received from Christopher, a 2016 Hertie alumnus. He was one of the most involved members of staff, and provided a tutorial for ggplot (data visualisation). Since participating in DataFest 2016, he has landed a role with Blackwood 7, an advertisement and media market research company, where he now focuses on data science.
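For readers curious about what locating observations on a grid of latitudes and longitudes can look like, here is a minimal sketch in Python. It is not the code we wrote at DataFest: the function names, the bounding box, the cell size and the sample coordinates are all made up for illustration. It also stops at square grid cells; assigning points to Chicago’s irregular wards would additionally need a point-in-polygon test, which is not shown.

```python
import math
from collections import Counter


def grid_cell(lat, lon, lat_min, lon_min, cell_deg):
    """Return the (row, col) index of the square grid cell containing a point.

    lat_min and lon_min are the south-west corner of the grid and cell_deg
    is the cell size in degrees (all illustrative assumptions).
    """
    row = math.floor((lat - lat_min) / cell_deg)
    col = math.floor((lon - lon_min) / cell_deg)
    return row, col


def count_per_cell(points, lat_min, lon_min, cell_deg):
    """Count how many (lat, lon) points fall into each grid cell."""
    return Counter(
        grid_cell(lat, lon, lat_min, lon_min, cell_deg) for lat, lon in points
    )


if __name__ == "__main__":
    # Made-up coordinates roughly in the Chicago area, not real records.
    sample_points = [(41.88, -87.63), (41.89, -87.62), (41.75, -87.70)]
    counts = count_per_cell(sample_points, lat_min=41.6, lon_min=-87.95, cell_deg=0.1)
    for cell, n in sorted(counts.items()):
        print(cell, n)
```

Counts per cell like these are the kind of input a heat map needs: each cell becomes a tile and its count sets the colour.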
Chicago’s Wards – Crime Density
Our main idea was that security, or a lack thereof, influenced consumer choices. We therefore wanted to overlay the crime records and the observations from the sponsor’s dataset in a single heat map. The following is the end product of this process. (Perhaps we took Christopher’s tutorial a little too seriously.)
Crime and Consumer Hesitancy
We were (and still are) very proud of this map – so much so that three of our team members went to Mannheim’s party district to celebrate our accomplishment at around 3 a.m., even though we needed to present it that same morning.
Lesson 6: Think outside of the box – be bold, be creative
On the final day, we had to deliver a presentation at 11:30 a.m. covering all 190 combined hours of work. The biggest surprise of that day turned out to be Raju’s previously untapped poetry skills! We included a verse from his latest poem (originally written for his wife, Anna), which made for a wonderful presentation.
While listening to the presentations, we got an opportunity to see what the other teams were working on. The diversity of the work that had been undertaken was awe-inspiring. Some teams criticised the data provided during Cyber Monday and some analysed the effect of winter blues on consumer behaviour.
We also learnt that if a presentation runs out of time, forcing a big round of applause is a great way to get the presenters to wrap it up.
Lesson 7: Never, ever, underestimate Hertie Students
Half an hour after all the presentations were over and a group photo session had taken place, it was finally time to announce the winners. The three award categories were as follows: Best Insight, Best Use of External Data and Best Data Visualisation. We were equally amazed and honoured to be awarded second place in Best Data Visualisation, all thanks to our Christopher-inspired graph!
We would like to thank Benjamin Gaiser and Professor Mark Kayser for their guidance and support in preparing for DataFest Germany 2017. We are deeply grateful for their #hertielove.
This article was originally published by the Hertie School student magazine The Governance Post.