On hackathons: lessons learned, experience, advice
Although our daily job is to help companies set up their analytics processes, build machine learning solutions, and hire data scientists, we never say no to a hackathon invitation. Recently, we participated in an exciting hackathon in Slovakia, so we decided to share our experience while it's still fresh in our memories.
Corporate-startup cooperation
With the speed of data growth in the last couple of years, there is a constant need to analyze and leverage the data that are collected. Companies have started to analyze the data, build machine learning models, and make data-driven business decisions. However, many are still lagging behind in this area and are trying to improve. If you are among them, you have several options:
Cooperation with a consultancy firm. These are companies that will help you to set up the processes, build first machine learning solutions, and train your current employees. Such cooperation is a great starting point – the consultants will advise you on which data you can leverage, how to collect them, and will know from their previous experience which techniques are most suitable for you.
Hiring the first (=lead) Data Scientist. At some point, you might realize that cooperation with a consultancy firm does not make sense for you anymore. This might be because leveraging the data is at the core of your business, because maintaining the existing infrastructure built by an external company suddenly takes enough time to keep a full-time employee busy, or simply because you want more control over the process. Hiring can be tricky, since hiring your first 'data guru' is not the same as hiring for any other position in the company. If you need some advice, you might find one of our past articles on this topic useful.
Looking for inspiration and/or potential partners by organizing hackathons. Although you cannot expect a ready-to-deploy solution from a hackathon, you will certainly get many fresh ideas, various approaches to a given problem (or a definition of a new challenge), and a look at it from several different angles.
Motivation
Nowadays, there are many smart, data-oriented people and small startups (like us ;)) that are constantly looking for ways to open doors into bigger companies and show them what they can do. This is one of the reasons we love hackathons. They are not only fun and a great learning experience; they also give small companies a chance to get to know the competition and/or potential future partners and, above all, to meet the big, well-established players (usually the organizer of the event), who can otherwise be really hard to reach. We see hackathons as a win-win for both sides: the organizers get a number of solutions and good ideas, while the participants gain experience and knowledge about the specific industry.
Our first hackathon experience
The first time we participated in a hackathon was 4 years ago. It was organized by Daimler in Germany. The task was to build a model that would check for outliers and data drift in different datasets. We decided to approach the problem in a robust way: we built an ensemble of different statistical tests whose weights would adjust based on user feedback, so that the ensemble relied most heavily on the tests best suited to the specific dataset. This way, even people without math skills would be able to run these tests.
Before the final presentations, our hopes were high because our solution could identify outliers and data drift in different types of data, numerical or categorical. But reality struck during the final presentations: while our solution returned a JSON with the row IDs of outliers, other teams had functional interactive dashboards and a lot of other fancy features. The fact that we didn't win came as no surprise. Beginner's luck didn't apply this time; rather, luck favored the prepared.
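To make the idea of a feedback-weighted ensemble more concrete, here is a minimal sketch in Python. It is not our original hackathon code: the three tests (z-score, IQR, and median absolute deviation), the class name `WeightedOutlierEnsemble`, the thresholds, and the multiplicative weight update are all assumptions chosen for illustration, and the sketch covers only numerical outliers, not data drift or categorical data.

```python
import json
import statistics


def zscore_votes(values, threshold=3.0):
    """Vote 1.0 for values more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return [0.0] * len(values)
    return [1.0 if abs(v - mean) / stdev > threshold else 0.0 for v in values]


def iqr_votes(values, k=1.5):
    """Vote 1.0 for values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [1.0 if v < low or v > high else 0.0 for v in values]


def mad_votes(values, threshold=3.5):
    """Vote 1.0 for values with a large modified z-score (median absolute deviation)."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return [0.0] * len(values)
    return [1.0 if 0.6745 * abs(v - median) / mad > threshold else 0.0 for v in values]


class WeightedOutlierEnsemble:
    """Hypothetical ensemble: a weighted vote of tests, re-weighted by user feedback."""

    def __init__(self, tests, learning_rate=0.5):
        self.tests = dict(tests)
        self.weights = {name: 1.0 / len(self.tests) for name in self.tests}
        self.learning_rate = learning_rate

    def scores(self, values):
        votes = {name: test(values) for name, test in self.tests.items()}
        return [
            sum(self.weights[name] * votes[name][i] for name in votes)
            for i in range(len(values))
        ]

    def detect(self, values, cutoff=0.5):
        """Return the row ids whose weighted vote reaches the cutoff."""
        return [i for i, score in enumerate(self.scores(values)) if score >= cutoff]

    def feedback(self, values, row_id, is_outlier):
        """Shrink the weight of every test that disagreed with the user, then renormalize."""
        target = 1.0 if is_outlier else 0.0
        for name, test in self.tests.items():
            if test(values)[row_id] != target:
                self.weights[name] *= 1.0 - self.learning_rate
        total = sum(self.weights.values())
        for name in self.weights:
            self.weights[name] /= total


# Toy data (made up for this example): one obvious outlier in the last row.
values = [10, 11, 9, 10, 12, 11, 10, 9, 30]
ensemble = WeightedOutlierEnsemble(
    {"zscore": zscore_votes, "iqr": iqr_votes, "mad": mad_votes}
)
outliers = ensemble.detect(values)
report = json.dumps({"outlier_row_ids": outliers})

# The user confirms that row 8 really is an outlier.
ensemble.feedback(values, row_id=8, is_outlier=True)
```

In this toy sample, the z-score test misses the outlier because the outlier itself inflates the standard deviation, while the other two tests catch it. After the user's confirmation, the z-score test loses weight, so the ensemble leans more on the tests that suit this dataset.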
Takeaways
Our most important takeaway from this hackathon was that it was an excellent learning opportunity. I had a lot of theoretical knowledge from university. However, since I studied Mathematics, I lacked coding skills. We had some statistics courses where we used R, but at that time, I wasn't smart enough to realize R was something I would use in the future :). During the hackathon, we used Python, and for me personally, it was 40 hours of constant googling and searching Stack Overflow for Python-specific errors. There is hardly a way to learn more in two days than by participating in a hackathon, although you will need to invest some extra time in learning coding best practices afterward, because you will almost certainly end up with an ugly piece of code :).
Four years and four hackathons later
Since we liked the experience, we participated in other hackathons later on, such as a travel hackathon in St Moritz and a hackathon organized by Andritz in Graz. Each time we arrived a bit smarter and better prepared, and left even smarter and even better prepared for the next one.
Our team at the Andritz hackathon in Graz. Full focus mode on.
Recently, we participated in another great hackathon organized by ZSE, a Slovak electric utility company based in Bratislava, in cooperation with ImpactHUB. ZSE gave us a chance to see real data from the energy sector and how they use it. In return, we got a chance to show them what we could do with this data to innovate their business.
The task
There were two tasks given to the teams. The first was to build a dashboard showing how much energy is consumed and produced at each grid supply point. The second was to come up with an interesting model for trading electricity among different grid supply points to minimize costs.
Our approach
Each hackathon is a bit different. This one was specific because we knew the tasks and had access to the data 2 weeks beforehand. However, we didn't use this much to our advantage, since we were quite busy with our clients and, on top of everything, we traveled to the Web Summit conference one week before the competition. We expected the other teams to come with a ready solution. Fortunately, this didn't happen, so they gave us a chance to compete with them :). Because of a delayed flight, one of our team members arrived 5 hours late and we had work to do to catch up with the other teams, so we decided to approach the challenge in real 'hackathon style' (=no sleep and a lot of coke). We ended up being the only ones who stayed up all night.
While building the solution, we mostly focused on how to differentiate ourselves from the other teams. We knew that everyone would be able to come up with some kind of dashboard. We decided to use open source technologies and build our dashboard from scratch using Python and Dash.
This is what our dashboard looked like.
This approach allowed us to easily add insights from custom machine learning models. In the end, the users of this dashboard would not only be able to see the current consumption, but also future predictions, and they could compare each grid supply point with those showing similar consumer behavior using our own clustering model.
Income vs Spending for one specific grid supply point.
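As a rough illustration of how grid supply points could be grouped by consumer behavior, here is a minimal k-means sketch in plain Python. It is not our actual clustering model: the grid supply point names, the four time-of-day consumption buckets, the numbers, and the helper functions `normalize`, `kmeans`, and `similar_points` are all made up for this example.

```python
import math


def normalize(profile):
    """Scale a profile so its values sum to 1, comparing shape rather than volume."""
    total = sum(profile)
    return [v / total for v in profile] if total else list(profile)


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points seed the centroids to keep the run deterministic."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Move every centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical consumption per grid supply point: night, morning, afternoon, evening.
profiles = {
    "GSP-A": [10, 30, 35, 25],   # daytime-heavy
    "GSP-B": [40, 20, 10, 30],   # night-heavy
    "GSP-C": [20, 62, 68, 50],   # daytime-heavy, higher volume
    "GSP-D": [85, 38, 22, 55],   # night-heavy, higher volume
}

names = list(profiles)
labels = kmeans([normalize(profiles[name]) for name in names], k=2)
cluster_of = dict(zip(names, labels))


def similar_points(name):
    """Return the other grid supply points that share a cluster with `name`."""
    return [n for n in names if n != name and cluster_of[n] == cluster_of[name]]
```

Normalizing each profile first means two points with the same daily pattern end up together even if one consumes twice as much, which is what makes the comparison between "similar" points meaningful in a dashboard. In practice you would likely use a library implementation and many more features than four buckets.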
And the winner is…
Unlike at the final presentations of the hackathon 4 years ago, this time we could see that our solution stood out from the others, with a working demo, an interactive dashboard, and more. And it was us who walked off with the first prize!
It feels good to be the one walking away with the big check :).
Conclusion
Usually, it is very hard for a small company to get access to the real data of a company such as ZSE. Hackathons are therefore also a great way to get access to datasets from different industries. We encourage all data enthusiasts, whether beginners or advanced, to take part in these competitions from time to time to learn new things, meet new companies and people, and have some fun!
If you would like to learn more about our winning solution, feel free to reach out to me at juraj@knoyd.com.