Have you ever wondered about questions like these: “What does a data scientist do?” “Is data science a new thing?” “Is data scientist not just a sexed-up label for a statistician?”
With this article, I hope to clear up some of the myths and some of the confusion. It’s not all black and white here in the world of data, though. Companies don’t share a definition of what a particular job does and does not consist of. Even university courses in this realm sometimes share the same label but have different contents.
What is data science about?
Data science is a set of activities that spans more than one field. If you’re a data scientist, you engage in data collection (or data mining), analyzing and modeling data, and preparing data so that it turns into information that can be used to drive decisions. This last bit is also known as “data-driven decision-making” in the field of entrepreneurship. Nowadays, data science is also considered to be a part of computer science, or at least closely related to it.
Using scientific methods of processing data with algorithms and other systems, a data scientist works to transform data into information and possibly gain knowledge of something that was previously unclear or entirely unknown. To do this, a data scientist can leverage both structured and unstructured data. The exact requirements depend heavily on the methodology used and the objective of the task.
History of data science
The term “data science” was allegedly coined in 1960 by Peter Naur, a Danish computer science pioneer. He later also introduced the term “datalogy,” in 1974, which could be considered to share activities with what we know as data science today, but the word did not last. “Data science” first appeared in a conference title in 1996 in Kobe, Japan, after Chikio Hayashi introduced the term during a roundtable discussion.
When Thomas H. Davenport and D.J. Patil wrote the article “Data Scientist: The Sexiest Job of the 21st Century” for the Harvard Business Review in October 2012, the terms “data science” and “data scientist” started to become buzzwords. Since then, this “new” label has been replacing job titles such as business analyst, statistician and similar. Depending on where you look, you might also find a chief data scientist or even data wizards.
What does a data scientist do?
If you look through the job descriptions of companies that want to hire data scientists, you might find everything and nothing adequately defined. The descriptions are often vague because the HR department of an enterprise usually has little connection to this type of work. So if you get hired as a data scientist, what are they most likely going to expect you to do?
Basically, the output of your work needs to point towards opportunities, help to make decisions and increase the profitability of the company in some way. In order to achieve that, you need to gather data, structure it, analyze it, and determine the best way to organize it into sets and variables. If you’re facing “big data” issues, you need to arrange the results so that they make sense both right now and in the future.
You’re likely to encounter different databases that you will need to source data from, and you may have to manage a “golden source” by connecting all these historic sources through a kind of middleware. Data scientists also need to find ways of structuring unstructured data and slicing continuous data into bits so that it can be used. That includes getting rid of useless data and making sure that useful data is leveraged in the best possible way.
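To make that a bit more tangible, here is a minimal Python sketch of those steps: joining two sources into one “golden source,” dropping records that can’t be used, and slicing a continuous value into bands. All the data, field names and function names (`crm_rows`, `billing_rows`, `build_golden_source`, `age_band`) are made up for illustration; real projects would typically use a database or a data-frame library instead.

```python
# Hypothetical records from two legacy systems; all names and values are made up.
crm_rows = [
    {"customer_id": 1, "name": "Ada", "age": 34},
    {"customer_id": 2, "name": "Ben", "age": None},   # incomplete record
    {"customer_id": 3, "name": "Cy", "age": 61},
]
billing_rows = [
    {"customer_id": 1, "monthly_spend": 120.0},
    {"customer_id": 3, "monthly_spend": 45.5},
    {"customer_id": 4, "monthly_spend": 80.0},        # unknown to the CRM
]


def build_golden_source(crm, billing):
    """Join both sources on customer_id and keep only complete records."""
    spend_by_id = {row["customer_id"]: row["monthly_spend"] for row in billing}
    golden = []
    for row in crm:
        spend = spend_by_id.get(row["customer_id"])
        if row["age"] is None or spend is None:
            continue  # drop records that cannot be used downstream
        golden.append({**row, "monthly_spend": spend})
    return golden


def age_band(age):
    """Slice a continuous age into a coarse, reusable category."""
    if age < 30:
        return "under 30"
    if age < 50:
        return "30-49"
    return "50+"


golden = build_golden_source(crm_rows, billing_rows)
for record in golden:
    record["age_band"] = age_band(record["age"])
    print(record)
```

In this toy example only customers 1 and 3 survive the clean-up, because they are the only ones with complete data in both sources.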
After that is done, the data scientist will process the data, possibly to spot patterns and trends. By doing that, he or she will be able to “warn” the business of recurring issues or other events that might be harmful if ignored, or that would allow the company to generate more revenue if it knows about the event in advance and can prepare. The data scientist will also build cases and explain the results to stakeholders and management. This could happen, for instance, in the form of a visual presentation.
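As a rough illustration of spotting a trend and “warning” the business, the following Python sketch smooths a series with a moving average and flags values that sit unusually far from the mean. The weekly ticket counts are invented, the helper names (`moving_average`, `flag_outliers`) are hypothetical, and the two-standard-deviations rule is just one simple assumption among many possible approaches.

```python
from statistics import mean, stdev

# Hypothetical weekly support-ticket counts; the numbers are invented.
weekly_tickets = [52, 48, 50, 51, 49, 53, 50, 95, 52, 49]


def moving_average(values, window):
    """Return the mean of each consecutive window of values."""
    return [mean(values[i:i + window]) for i in range(len(values) - window + 1)]


def flag_outliers(values, threshold=2.0):
    """Return indexes whose value lies more than `threshold`
    sample standard deviations away from the overall mean."""
    centre = mean(values)
    spread = stdev(values)
    return [i for i, v in enumerate(values) if abs(v - centre) > threshold * spread]


smoothed = moving_average(weekly_tickets, window=3)
suspicious_weeks = flag_outliers(weekly_tickets)
print("smoothed trend:", smoothed)
print("weeks worth a closer look:", suspicious_weeks)
```

With these made-up numbers, the spike of 95 tickets (index 7) is the only week that gets flagged. Whether that is a real issue or just noise is exactly the kind of case a data scientist would then have to explain to stakeholders.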
Skills that can help you land a data scientist job
First of all, it helps a lot if you have an interest in working with data, numbers, graphs, formulas and the like. This might not be the right job for you if you’re not really comfortable with this kind of thing.
If you think about becoming a data scientist, you should look into increasing your knowledge in subjects such as math, computer science, programming, statistics, machine learning, software engineering, and data visualization to communicate the results of your work.
Data science vs. statistics
I don’t think that data science is merely the new word for statistics. Some of the activities might be similar, but I do believe that there are differences worth noting between these two branches. If you look at industries, you see more use of data science in tech, finance, and manufacturing companies, while statistics often finds a home in commerce, trade, population studies, and in supporting other fields of science.
Example of an evolutionary algorithm created within Jupyter Notebook, using Python and Matplotlib (by Shahin Rostami)
The real-world practices of a data scientist and a statistician often differ. For instance, it is likely that a data scientist will need to create and look after data warehouses, which could be middleware solutions. From what I know, statisticians rarely need to deal with administrative IT and software engineering tasks. They might make use of such systems but not prepare or maintain them.
If you see this differently, I’d love to hear your thoughts. Make sure to share your remarks below in the comments section.
Summary
A few years back, terms like Business Intelligence (BI) or Management Information (MI) covered many of the activities found in the current job descriptions of a data scientist. There appears to be a buzzword effect and hype around the term data science, especially in combination with all things Artificial Intelligence (AI) and Machine Learning (ML), but still, the work of data scientists is highly sought after in innovative/disruptive companies.
Terms and job titles change based on trends, but what really matters is that you check the activities and responsibilities in any job description you find. Names are just names, but if the activities match what you’re good at and what you want to do, go for it!
Further reading
- About Reason, Correlation, Causation and Smart Fools
- 3 Tips to Improve Your Strategic Decision Making
- Antifragility: When Systems Benefit from Chaos and Volatility
- What’s the Difference between Data, Information, Knowledge and Wisdom?
- What is Collaborative Intelligence?
Photo credit: The photographs (1, 2, 3) were all taken by Sebastiaan ter Burg. The animation was created by Shahin Rostami.
Source: Thomas H. Davenport, D.J. Patil (HBR) / Wikipedia article on data science