Can you describe the projects you work on and your role in more detail?
I am usually responsible for multiple ongoing projects in parallel. The scope and objective of a Labs project are mostly defined by the Thomson Reuters business unit that is investing in it. The projects can vary greatly in terms of length and scope. Sometimes we just test a hypothesis and develop a methodology to solve a certain problem. On the other end of the spectrum, we develop completely new features or add-ons for existing products. Those kinds of projects require work in multiple areas such as data science, research, software engineering and user experience. I am not an expert in all of these areas, but I need to have at least a high-level understanding of all the bits and pieces in order to lead and manage projects. Some of the most challenging aspects of my role are communicating with stakeholders and keeping a large group of people from many different backgrounds aligned on the objectives of the projects.
Which programming languages and tools are you primarily using?
For data science, we mostly use Python along with tools like Jupyter notebooks. We sometimes use other languages such as R or Scala. We also use a broad range of big data and public cloud tools as well as a variety of packages and frameworks for NLP, machine learning, deep learning and so forth. For engineering, we primarily use Java for the backend and JavaScript for the frontend in public cloud environments.
Which skills would you regard as vital for a career in data science?
I would say the most important skills are mathematics, programming and an understanding of commonly used algorithms, methodologies and tools. In addition to that, a data scientist needs to have a curious mind and the ability to solve complex problems. It is also important to be able to visualize results and to present and communicate complex subjects to non-specialist audiences in an understandable way.
How do you see the development of data science/analytics over the next few years?
I think there are a couple of major trends that touch on different aspects of data science. On the technical side, we see deep learning being used more and more often, which drives a whole new generation of GPU-focused hardware. We also have a lot of development being done on the public cloud, which fundamentally changes the way systems are designed. Furthermore, there is an increasing awareness of privacy and ethics when working with data. In the past couple of years, many companies did not pay attention to these topics, which led to a variety of problems. Privacy and ethics will likely have more and more impact on the work of a data scientist in the future. Finally, the field of data science is maturing and there is an ever-increasing number of universities which offer degrees and certificates in this field. The number of data scientists will increase further and we will probably see more specialization within the role.
Which three pieces of advice would you give to aspiring data scientists?
- Keep an eye out for upcoming new technologies and try them out on a regular basis to keep your finger on the pulse.
- Don’t let job ads with long lists of required skills/qualifications scare you. Nobody ticks all the boxes.
- Be curious about topics which are adjacent to data science and get yourself up to speed on things like user-centric design or agile development. These types of skills are always useful.
Thank you for your time!