Your Career Platform for Big Data

Be part of the digital revolution in Germany

 

Latest job opportunities

EyeEm Berlin, Germany
21/08/2018
Full time
DESCRIPTION
EyeEm is a photography company on a mission to discover and showcase new talent through technology. As one of the world’s fastest-growing photo communities, EyeEm connects over 20 million photographers with brands and agencies around the world. Since EyeEm’s beginnings in 2011, we have become the largest and leading source for authentic, royalty-free images. Read more about our story! Thanks to a unique combination of advanced search technology, Market and Missions, EyeEm photographers have the chance to showcase their original work and license their photography to a global audience.

What makes you passionate about data? EyeEm is searching for a Data Analyst to join our team. As our data analyst, you’ll be responsible for delivering insights to internal consumers such as product, supply and sales/marketing. You’ll work closely with key partners from these areas to gain in-depth knowledge of business requirements, helping them to better understand performance and uncover new opportunities. In addition, you’ll have the opportunity to work alongside our engineering team and gain insight into high-volume data processing and integration.

Key responsibilities include (but are not limited to):
- Delivering analysis projects and ad hoc reports to key partners
- Building and maintaining corporate dashboards
- Responsibility for mobile and web tracking
- Participating in rotating on-call duty

REQUIREMENTS
- A degree in Computer Science, Statistics, Mathematics or a related field, or relevant industry experience
- You are a self-driven fast learner, eager to work with new technologies
- You are a great teammate and motivate yourself and the people around you to improve every single day
- You have good knowledge of SQL and Excel
- You have experience with Tableau or similar dashboarding tools

Preferred:
- Extensive Python skills, especially Pandas
- Knowledge of both relational and non-relational databases and their respective models
- Experience with distributed databases like Redshift and query languages like Hive/HQL
- Knowledge of common data structures and the ability to write efficient code in at least one language (preferably Python, Java or Scala)
- Knowledge of Apache Spark or other distributed computing engines is a plus

BENEFITS
At EyeEm we work with groundbreaking technology in a visionary organisation. It is a work environment that truly values diversity, where you can develop your skills and learn from the best. We are an international team that is highly motivated and fun to be with. You’ll have a significant impact on our product and community. We also offer free onsite German lessons, focusing on using German beyond the workplace.
coliquio GmbH Munich, Germany
21/08/2018
Full time
Your Challenge
As a Senior Data Scientist (m/f), you will be a driving force behind our data products, continuously mine our data pools for new and valuable insights, and implement features that will delight our users. As part of the data team, your goal will be to integrate internal and external data sources and to expand products such as our recommendation and search engines using AI technologies. As the team’s hands-on expert, you are the go-to contact for stakeholders and team members, always able to contribute competent answers.

Your focus areas:
- You design and develop our data applications using current AI technologies such as machine learning, text mining, pattern recognition and comparable methods
- You continuously develop and optimize the algorithms behind our recommendation and search engines
- You evaluate new technologies for use in the data science environment
- Together with the team, you ensure the continuous evolution of the data architecture for providing relevant data
- You drive the further development of our analysis and prediction methods for our physician network
- You support product development in introducing and optimizing data-driven features

Your Potential
- A successfully completed degree in mathematics, physics, computer science, statistics or a similar field
- Several years of professional experience in data science in an internet setting, with a focus on content distribution, social networks or e-commerce
- Confident command of tools and frameworks such as TensorFlow or Amazon Machine Learning
- Ideally, experience in developing recommendation engines for conversion optimization and traffic growth
- A talent for solving complex problems
- An independent, self-reliant way of working and strong team skills
- Strong communication skills, conceptual strength and an agile mindset

Keep it simple: apply with just your CV via the button. By submitting your application, you automatically agree to our privacy policy (Datenschutzrichtlinien).

Apply now

Questions? I am happy to help: Larisa Leonteva (HR Manager), phone: +49 (0) 7531 / 363 939-113
coliquio GmbH Constance, Germany
21/08/2018
Full time
Your Challenge
You see yourself as an essential part of our data team, in which you provide an optimal data foundation. Your goal is to connect a wide variety of internal and external data sources and aggregate them in a data lake and data warehouse (DWH). You support your colleagues in product development in defining meaningful, high-quality data sources from tracking and logging systems. You use your strong sense of quality to guarantee a data foundation that is valid at all times. As a Data Engineer, you sit at the interface between operations, development and data science, and you work in an agile team together with data scientists and developers.

Your focus areas:
- You are responsible for the one-off and ongoing integration of a wide variety of data sources and their structured storage in our data lake and data warehouse
- You support us in identifying, collecting, storing, processing, documenting and analyzing data
- You are a central knowledge carrier for our data and a competent contact for our data scientists and business analysts
- You take responsibility for the design and high-quality implementation of interfaces, for analyzing and optimizing data quality, and for transforming data for optimal storage in our data lake

Your Potential
- A successfully completed degree in business informatics, or comparable professional experience in big data & analytics and BI
- Ideally, experience with distributed systems for data analysis such as Hadoop, Spark, etc., and a DevOps approach to operating such technologies
- Experience with data warehouses and BI tools
- Command of scripting and programming languages and database systems, and familiarity with several technologies for connecting different systems (e.g. Java, Python, Scala, Kotlin)
- Experience in developing and operating production software and familiarity with agile operating processes
- Familiarity with, and interest in, various data processing pipelines and tools for both real-time and batch workloads (e.g. Kafka, Sqoop, Spark, Airflow)
- "Out of the box" thinking and a high degree of creativity
- You enjoy working in interdisciplinary teams using lean and agile methods

Keep it simple: apply with just your CV via the button. By submitting your application, you automatically agree to our privacy policy (Datenschutzrichtlinien).

Apply now

Questions? I am happy to help: Larisa Leonteva (HR Manager), phone: +49 (0) 7531 / 363 939-113
momox Berlin, Germany
21/08/2018
Full time
momox is Germany’s leading online buying-and-selling service, where everyone can turn their books, films, CDs, games and clothes into money. Via www.momox.de and the momox app (iOS and Android) we buy used products at fixed prices. True to the slogan “We give products a second life”, we offer the purchased items for sale in our online shops www.medimops.de (media) and www.ubup.com (fashion) and on other marketplaces such as Amazon and eBay. Since starting, we have grown into a company of 1,300 dedicated people at four locations in Berlin, Leipzig, Neuenhagen and Stettin. We are active in Germany, Austria, France and Great Britain. In 2016 our employees generated revenue of 150 million euros, and since our launch we have bought more than 125 million items for resale. The people working for us are as diverse as our products: logistics experts, team players, communication geniuses, creative minds, IT specialists and many more give their best at our four locations. We are still looking for new enthusiastic colleagues. If you want to be part of this success story, work in an ambitious and loyal team and are seeking a performance challenge, please continue reading.

YOUR MISSION
- You are responsible for the design and implementation of BI solutions (DWH, ETL, monitoring, etc.)
- You implement work packages within projects
- You ensure the operation of the BI application and its continuous improvement (monitoring, alerting)
- You optimize the execution of technical and content-related tests, and you create use cases in cooperation with the BI analysts
- You process ad-hoc requests for data, reporting or further analysis questions
- You are involved in creating and defining regular reports, dashboards and KPIs

YOUR PROFILE
- You have a successfully completed degree in (business) computer science, business economics with a focus on computer science, natural sciences, or a comparable education
- You have expertise in the use of cloud storage solutions (BigQuery, S3, Redshift) as well as relational databases (MySQL, MSSQL, PostgreSQL)
- As a plus, you are already familiar with non-relational databases and big data tooling (MongoDB, Hadoop, MapReduce)
- You have gathered deep experience in ETL processes, API integration and the automated processing of structured as well as unstructured data
- You are familiar with connecting and modeling data in SQL and feeding it into the associated reporting and visualization tools (Tableau, Looker, Qlikview)
- Knowledge of an analytically oriented programming language (such as Python, R or Java) is a plus
- You have a strong passion for data analytics and an understanding of business processes

WHAT MAKES MOMOX ATTRACTIVE
We like to keep it practical.
- You will be working in an agile environment with everything that comes with it, but without endless meetings
- We are open to new tools and technologies and use them where it makes sense
- We love challenges
- You will have the opportunity to develop both your professional and your personal skills by attending special events and conferences

WHAT YOU CAN EXPECT
- Exciting tasks and challenges in the center of Berlin
- An agile work environment with flat hierarchies
- Motivated, open-minded teams that welcome new ideas
- Support for open source projects
- A lounge with table football and a PlayStation
- Work-life balance with flexible working hours
- Staying fit on the job through running groups, a cooperation with the Urban Sports Club and healthy lunches from smunch.co
- Employee benefits such as a discounted public transport ticket, day care allowances, shopping vouchers for our medimops shop, free drinks, fruit and much more

Is this the right challenge for you? Apply now! You can find more information about us at momox.biz/en/career/. We are looking forward to receiving your application!

DataCareer Blog

Learning a new programming language is an investment in human capital, so estimating its return on investment can be very informative. The requirements of every industry and every specific job are highly specific, which makes a generally valid answer hard to find. One approach, however, is to analyze the software skills required in job postings: they reflect current demand and can therefore indicate a general return on investment. We downloaded all German job postings with a data focus from Indeed to get a rough idea of the popularity of each software on the German job market.

2807 data science job postings in Germany
In June 2017, we searched all job postings in Germany for data science keywords (Data Scientist, Data Analyst, Big Data, Machine Learning). This only covers postings that were live at the time, but since postings typically stay online for several weeks, we can assume this gives a fairly accurate picture of current market demand. The search returned 2807 job postings, but only 70% of them name a specific software or programming language. Why do so many postings contain no specific requirements? Employers often state only general requirements (e.g. skills in machine learning or data analytics), and some of the postings contain data science keywords but are not primarily aimed at data scientists.

SQL and Python are the most popular
We searched the 2807 job postings for the 25 most popular data science software packages. 1971 postings mention at least one of them, many of them several. The figure below shows the number of job postings mentioning a given software. With around 1000 postings, SQL is the most popular, followed by Python with around 900 mentions and Java with 670.

A similar picture in the US
Is the distribution above characteristic of Germany, or does it reflect global trends? Robert Muenchen has run the same analysis for the US market. The top three are identical: SQL (18,000 jobs), Python (13,000 jobs) and Java (13,000 jobs) dominate the market. Some differences appear further down: SAP, for example, is in higher demand on the German market (6th) than on the US market (12th). Overall, however, the two charts are very similar, confirming that software trends are global and that requirements are shaped by the technological frontier.

Investing in programming skills
If you are new to data science or considering moving into the field, this analysis gives you a good idea of which programming skills are likely to be particularly valuable in the near future. The high demand for SQL may be a sign that many companies expect not only data analysis skills but also smooth interaction with databases. Python appears to be on the rise: Robert Muenchen shows that its popularity has grown strongly over the past three years, outpacing the other big open-source player, R.

Overall, a combination of strong analytical skills in Python and R with solid SQL knowledge should be a good foundation for a career in the growing data science job market. Interested in analyzing jobs on Indeed? You can access the jobs via their API. The jobbR package for R is helpful; similar tools exist for Python.
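The counting step itself is simple once the postings are downloaded. Below is a minimal Python sketch of that step, assuming the ad texts have already been fetched (e.g. via the Indeed API); the sample ads and the keyword list are invented for illustration:

import re
from collections import Counter

# Hypothetical sample of downloaded job descriptions.
job_ads = [
    "We are looking for a Data Analyst with strong SQL and Python skills.",
    "Experience with Java, Spark and SQL is required.",
    "Knowledge of R or Python is a plus.",
]

# A subset of the software keywords used in the analysis.
keywords = ["SQL", "Python", "Java", "R", "Spark", "SAS", "Tableau"]

def mentions(ad: str, keyword: str) -> bool:
    # Word-boundary match so that e.g. "R" does not match inside "Redshift".
    return re.search(rf"\b{re.escape(keyword)}\b", ad, flags=re.IGNORECASE) is not None

counts = Counter()
for ad in job_ads:
    for kw in keywords:
        if mentions(ad, kw):
            counts[kw] += 1  # count each posting at most once per keyword

for kw, n in counts.most_common():
    print(f"{kw}: {n} postings")

Counting each posting at most once per keyword matters here: the statistic of interest is the number of postings mentioning a software, not the total number of mentions.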
Companies use machine learning to improve their business decisions. Algorithms select ads, predict consumers’ interest or optimize the use of storage. However, few stories of machine learning applications in public policy are out there, even though public employees often make comparable decisions. Like the business examples, decisions by public employees often aim to optimize the use of limited resources. Algorithms may assist tax authorities in improving the allocation of available working hours, help bankers make lending decisions, or guide decisions taken by social workers or judges.

This blog post lists three research papers that analyze and discuss the use of machine learning for very specific problems in public policy. While the potential seems huge, we do not want to neglect some of the many potential pitfalls of machine learning in public policy. Business applications usually maximize profits; for policy decisions, however, the outcome to maximize may be harder to define or multidimensional. In many cases, not all relevant outcome dimensions are directly observable and measurable, which makes it more difficult to evaluate the impact of an algorithm. Tech companies can usually obtain training datasets through experimentation, while datasets for public policy often contain outcomes only for a selected group of people: if tax authorities never scrutinize restaurants, how can we build a predictive model for that industry? Predictions for public policy problems often face this so-called selected labels problem, and getting around it requires innovative approaches and the willingness to run randomized experiments (see the sketch after this post for a simulated illustration). This is just a brief list; Susan Athey’s paper provides more food for thought on the potential, and the potential pitfalls, of using prediction in public policy.

Research on Machine Learning Applications in Public Policy

Improving refugee integration through data-driven algorithmic assignment
Developed democracies are settling an increased number of refugees, many of whom face challenges integrating into host societies. We developed a flexible data-driven algorithm that assigns refugees across resettlement locations to improve integration outcomes. The algorithm uses a combination of supervised machine learning and optimal matching to discover and leverage synergies between refugee characteristics and resettlement sites. The algorithm was tested on historical registry data from two countries with different assignment regimes and refugee populations, the United States and Switzerland. Our approach led to gains of roughly 40 to 70%, on average, in refugees’ employment outcomes relative to current assignment practices. This approach can provide governments with a practical and cost-efficient policy tool that can be immediately implemented within existing institutional structures.
Bansak, K., Ferwerda, J., Hainmueller, J., Dillon, A., Hangartner, D., Lawrence, D., & Weinstein, J.; Science, 2018
Switzerland is currently implementing an algorithm-based allocation of refugees. We are excited to see the first results!

Human Decisions and Machine Predictions
Can machine learning improve human decision making? Bail decisions provide a good test case. Millions of times each year, judges make jail-or-release decisions that hinge on a prediction of what a defendant would do if released. The concreteness of the prediction task combined with the volume of data available makes this a promising machine-learning application. Yet comparing the algorithm to judges proves complicated. First, the available data are generated by prior judge decisions: we only observe crime outcomes for released defendants, not for those the judges detained. This makes it hard to evaluate counterfactual decision rules based on algorithmic predictions. Second, judges may have a broader set of preferences than the variable the algorithm predicts; for instance, judges may care specifically about violent crimes or about racial inequities. We deal with these problems using different econometric strategies, such as quasi-random assignment of cases to judges. Even accounting for these concerns, our results suggest potentially large welfare gains: one policy simulation shows crime reductions of up to 24.7% with no change in jailing rates, or jailing rate reductions of up to 41.9% with no increase in crime rates. Moreover, all categories of crime, including violent crimes, show reductions; these gains can be achieved while simultaneously reducing racial disparities. These results suggest that while machine learning can be valuable, realizing this value requires integrating these tools into an economic framework: being clear about the link between predictions and decisions; specifying the scope of payoff functions; and constructing unbiased decision counterfactuals.
Kleinberg, J., Lakkaraju, H., Leskovec, J., Ludwig, J., & Mullainathan, S.; Quarterly Journal of Economics, 2018

Using Text Analysis to Target Government Inspections: Evidence from Restaurant Hygiene Inspections and Online Reviews
Restaurant hygiene inspections are often cited as a success story of public disclosure. Hygiene grades influence customer decisions and serve as an accountability system for restaurants. However, cities (which are responsible for inspections) have limited resources to dispatch inspectors, which in turn limits the number of inspections that can be performed. We argue that NLP can be used to improve the effectiveness of inspections by allowing cities to target restaurants that are most likely to have a hygiene violation. In this work, we report the first empirical study demonstrating the utility of review analysis for predicting restaurant inspection results.
Kang, J. S., Kuznetsova, P., Choi, Y., & Luca, M.; 2013, Technical Report
A related paper on the same topic suggests how governments can obtain the required expertise: Crowdsourcing City Government: Using Tournaments to Improve Inspection Accuracy.

Further reading: two papers offering an excellent overview of the topic, Machine Learning: An Applied Econometric Approach and Prediction Policy Problems; and The Economist on the same theme: Of prediction and policy, The Economist, 2016.
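To make the selected labels problem discussed above concrete, here is a minimal, purely illustrative Python simulation (all numbers and the inspection rule are invented): when inspectors historically visit only high-risk restaurants, violation labels exist only for that selected subset, so a model trained on the labeled data sees a distorted picture of the full population it will later score.

# Illustrative simulation of the selected labels problem (all data invented).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Each restaurant has a feature (e.g. a review-based risk score) that
# partly predicts hygiene violations.
risk_score = rng.normal(size=n)
violation = (risk_score + rng.normal(size=n)) > 1.0  # true outcome

# Historical policy: inspectors only ever visit high-risk restaurants,
# so violation labels exist only for this selected subset.
inspected = risk_score > 0.5

# Naive estimate of the violation rate from labeled (inspected) data only:
labeled_rate = violation[inspected].mean()
true_rate = violation.mean()

print(f"Violation rate among inspected restaurants: {labeled_rate:.1%}")
print(f"True violation rate in the full population: {true_rate:.1%}")
# The labeled sample is heavily selected; a small share of randomized
# inspections would provide the unbiased labels the model is missing.

The gap between the two printed rates is exactly what makes naive training on administrative data misleading, and why the papers above lean on randomization or quasi-random assignment.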
Are you looking for real-world data science problems to sharpen your skills? In this post, we introduce four platforms hosting data science competitions. Competitions can be a great way to gain practical experience with real-world data and to boost your motivation through the competitive environment they provide. Check them out, competitions are a lot of fun!

Kaggle
Kaggle is the best-known platform for data science competitions. Data scientists and statisticians compete to create the best models for describing and predicting the data sets uploaded by companies or NGOs. From predicting house prices in the US to the demographics of mobile phone users in China or the properties of soil in Africa, Kaggle offers many interesting challenges built around real-world problems. Check out their No Free Hunch blog featuring the winners of each competition. The platform was recently acquired by Alphabet, Google’s parent company, and also offers a wide range of datasets to train your algorithms, along with other useful resources to improve your data science skill set.

DrivenData
As on other platforms, the dataset is available online and participants submit their best predictive models. The great thing about DrivenData competitions is that the competition questions and datasets are related to the work of non-profits, which can be especially interesting to those who want to contribute to a good cause. The data problems are no less diverse, ranging from predicting dengue fever cases to estimating the penguin population in the Antarctic and forecasting energy consumption levels. For some challenges the best model wins a prize; for others you get the glory and the knowledge that you applied your skill set to make the world a better place. DrivenData offers great opportunities to tackle real-world problems with real-world impact.

Numerai
Numerai is a data science competition platform focusing on finance applications. What makes its competitions particularly interesting is that the participants’ predictions are used in the underlying hedge fund. Data scientists entering Numerai’s tournaments currently receive an encrypted data set every week. The data set is an abstract representation of stock market information that preserves its structure without revealing details. The data scientists then create machine-learning algorithms to find patterns in the data, and they test their models by uploading their predictions to the website. Numerai then creates a meta-model from all submissions to make its investments. The models are ranked, with the top 100 earning Numeraire coins, a cryptocurrency launched by Numerai. Numerai’s mix of data science, cryptography, artificial intelligence, crowdsourcing and bitcoin has given the fledgling business an exciting flair.

Tianchi
Tianchi is a data competition platform by Alibaba Cloud, the cloud computing arm of Alibaba Group, and has strong similarities with Kaggle. The platform focuses on Chinese data scientists, but most pages are also available in English. Tianchi boasts a community of over 150,000 data scientists and 3,000 institutes and business groups from over 80 countries. Besides the competitions, the platform also offers datasets and a notebook environment for running Python 3 scripts.
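Whichever platform you pick, the basic workflow is similar: train a model on the provided training data, predict on the held-out test set, and upload your predictions in the required format. Below is a generic Python sketch of that loop; the file names, the id and target columns, and the submission layout are invented for illustration, as every platform specifies its own format.

# Generic competition workflow sketch (file/column names are hypothetical).
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

train = pd.read_csv("train.csv")  # features plus the known target
test = pd.read_csv("test.csv")    # features only; the target is withheld

features = [c for c in train.columns if c not in ("id", "target")]

# A simple baseline model; assumes a binary classification target.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(train[features], train["target"])

# Write predictions for the unseen test set in an id/prediction layout
# ready for upload to the platform's leaderboard.
submission = pd.DataFrame({
    "id": test["id"],
    "prediction": model.predict_proba(test[features])[:, 1],
})
submission.to_csv("submission.csv", index=False)

Starting from a simple baseline like this and iterating against the public leaderboard is the usual way to learn what a competition's metric actually rewards.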
View all blog posts


Looking for data professionals?

 

Post a Job