Blog

Accessing and analyzing media content is a fascinating part of data analytics. It lets you follow trends of public interest over time or see how stories evolve (e.g. newslens). While many media outlets offer APIs, it is cumbersome to query each of them individually. News API closes that gap and lets you search and retrieve live articles from all over the web...
A random forest is a combination of tree predictors in which each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. It can be applied to different machine learning tasks, in particular classification and regression. Random forest uses an ensemble of decision trees as its basis and therefore has all the advantages of decision trees, such as high accuracy,...
Since the dawn of the digital age, the amount of data stored on servers has risen dramatically. More and more firms are looking for talent that can handle their datasets and generate insights for business decisions. Data scientists are among the most sought-after candidates for this task. Google Trends shows that the global volume of the search term “Data Scientist” has tripled over the last 5 years, but how does the increasing demand translate...
The EU Open Data Portal gives access to open data published by EU institutions, agencies and other bodies. Around 70 EU institutions, bodies or departments use the platform to make over 12,500 datasets available. In this Jupyter Notebook we will retrieve data from the open data portal at http://data.europa.eu/euodp/en/home. The portal is based on the open source project CKAN. CKAN stands for Comprehensive...
In this Jupyter Notebook we will retrieve data from the European Central Bank (ECB). The ECB publishes through the European Open Data Portal, which we discussed in the previous tutorial. Before diving into the code, please take a quick look at the following websites to get a feel for what we will be dealing with. EU portal: https://data.europa.eu/euodp/en/data/publisher/ecb ECB SDMX 2.1 RESTful web...
Google has become the main starting point for our online activities. Processing more than 40,000 search queries every second, Google captures a lot of what we’re thinking and worrying about all the time. Hidden racism, sexual orientation or ad returns: check out the work by Seth Stephens-Davidowitz for inspiration on the huge potential of Google Trends data. While the Google Trends cockpit...
Since the dawn of the digital age, the amount of data stored on servers has risen dramatically. With this increase, more and more firms are looking for talent that can handle their datasets and generate insights for business decisions. Google Trends shows that the global volume of the search term “Data Analyst” nearly tripled over the last 5 years. How does the increasing demand translate into earnings of data analysts in...
Learning new programming languages is an investment in human capital. Determining the return on investment can therefore be very informative. The requirements of each industry and each particular job are highly specific, so a generalizable answer to this question is hard to find. One approach, however, could involve the required software skills in...
As the amount of data stored on servers has grown, so has the demand for data engineers to help manage the vast amounts of data now available to us. Data engineers are in high demand, and Google Trends shows that the global volume of the search term “Data Engineer” has tripled since 2014. More and more people are seeking skilled data engineers to help manage the vast amounts of data stored across the globe, and we...
Companies use machine learning to improve their business decisions. Algorithms select ads, predict consumers’ interest or optimize the use of storage. However, few stories of machine learning applications for public policy are out there, even though public employees often make comparable decisions. Similar to the business examples, decisions by public employees often try to optimize the use of limited resources. Algorithms may assist...
Are you looking for real-world data science problems to sharpen your skills? In this post, we introduce you to four platforms hosting data science competitions. Data science competitions can be a great way to gain practical experience with real-world data and to boost your motivation through the competitive environment they provide. Check them out; competitions are a lot of fun! Kaggle is the best-known platform...
Curious about neural networks and deep learning? This post will inspire you to get started in deep learning. Why are we witnessing this kind of buildup around neural networks? It is because of their amazing applications, which include image classification, face recognition, pattern recognition and automatic machine translation. So, let’s get started. Machine Learning is a field of computer science that...