Stylised as the sexiest job of the 21st century, data science has emerged as one of the most in-demand professions of recent years — taking hold with a hype that normally only surrounds celebrities. Companies worldwide put lucrative salaries, prestige and the privilege of wielding influence up for grabs to attract analytical talent. Behind all the hype is a growing importance of digital data that’s currently transforming the way we live and work.
It’s no wonder that more and more enthusiasts want to break into this new field. But before venturing into data science and analytics with their eyes closed, aspirants are well advised to inform themselves about the available routes. Interested candidates are encouraged to begin their journey by identifying entry points and requirements, by finding out how the various data subfields differ from one another, and by working out how their CV needs refining before they submit job applications.
Pursuing this train of thought further, one question naturally arises: what exactly distinguishes the titles of data scientist and data analyst? After all, both professions seem to be tasked with extracting business value from data. The logic-savvy reader perhaps already knows that sharing similarities doesn’t imply interchangeability. Many phenomena out in the wild resemble each other in some areas and differ starkly in others. One role can also complement the other, as both work in unison towards a common goal.
The objective of this article is to answer that question of difference, not only from the perspective of theory but also through the lens of a seasoned professional who’s seen how data teams function in the real world. After all, it’s no secret that job titles and their fancy ring are one thing, while the reality of the day-to-day workload is an entirely different thing.
Having said that, let’s look into the common features of the two roles before we explore the aspects in which they differ.
Similarities – Data Analysts vs. Data Scientists
In fact, by affirming the few aspects they share and outlining where they diverge, we come closer to grasping both roles. A Data Science Stack Exchange user registered as Stephan Kolassa attempted to demarcate the differences visually using a Venn diagram (Entry 2403).
A number of noteworthy points can be inferred from this diagram. Among the more obvious is that the data scientist and data analyst roles are closely related, occupying regions adjacent to one another. In practice, that means they are usually found working in the same business units, unless the analysts are bound to specific project teams as part of squads in agile frameworks. Quite logical, right? Both use data in the service of business goals, and both need expertise in traditional statistics.
They also share the area of communication: conveying useful insights to business leaders by means of data stories, or creating intuitive tools that bring about ‘data-driven’ decision making. Without much contemplation, it’s thus easy to see that data scientists and data analysts are only worth having inasmuch as they can prove their work useful. Precisely for that reason, you’ll find visual as well as verbal communication skills demanded in almost every advertisement for both jobs. Be that as it may, can we find more similarities by interrogating the ever-wise World Wide Web?
Using a Python script to load Google search-term data from a freely available source, we can see that the two job roles have yet another commonality. The kindred professions have witnessed a similar popularity trend in recent years, and an explosive one at that. But that’s about it with regard to common ground, which is neatly summed up in three points: (1) data insight provision for commercial advantage, (2) superior communication, and (3) popularity in the public eye.
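The original script isn’t reproduced here, so the following is only a minimal sketch of what such a loader could look like. It assumes the search data has already been exported to CSV with a `Month` column and one column per search term. The inline sample, the function names `load_interest` and `growth_factor`, and all the numbers are hypothetical placeholders, not real search data.

```python
import csv
import io

# Hypothetical sample in the shape of a search-interest CSV export.
# The numbers are placeholders for illustration, not real search data.
SAMPLE_CSV = """Month,data scientist,data analyst
2012-01,10,20
2014-01,25,30
2016-01,50,45
2018-01,80,70
"""


def load_interest(text):
    """Parse CSV text into {term: [(month, interest), ...]}."""
    reader = csv.DictReader(io.StringIO(text))
    series = {name: [] for name in reader.fieldnames if name != "Month"}
    for row in reader:
        for term in series:
            series[term].append((row["Month"], int(row[term])))
    return series


def growth_factor(points):
    """Ratio of the last observed interest value to the first.

    Assumes the first value is non-zero.
    """
    first, last = points[0][1], points[-1][1]
    return last / first


series = load_interest(SAMPLE_CSV)
for term, points in series.items():
    print(f"{term}: interest grew {growth_factor(points):.1f}x")
```

A real export may carry extra header lines or zero values at the start of a series, so both would need handling before a comparison like this means anything.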
By the way, if you’ve ever wondered, Google Trends data comes from unbiased samples of individual search engine queries, which are anonymised, categorised and grouped geographically to measure public interest in particular topics. A major enhancement of this data was released in 2016, making search interest across all subject categories available in real time. I happen to have years of experience leveraging that data.
Data Analyst – Job Descriptions
At any rate, we shall now turn to the details that separate the two roles. This is best done through examples, by unearthing representative data scientist and data analyst job adverts from the internet. An entry-level one reads:
The typical data analyst role is consulting-centric, as can be seen from the Indeed job spec example. For the most part, analysts are preoccupied with wrangling data from Excel spreadsheets and SQL databases, drawing insightful conclusions from retrospective analyses and A/B tests, and generally providing evidence-based business advice. The last point illustrates why reporting routines with visualisation tools such as Tableau are as pivotal as pivot tables. Data modelling, on the other hand, is often limited to basic supervised learning or its statistical equivalent: regression analysis.
From experience, I can also say that novice practitioners sometimes forget that the stage of supplying recommendations is invariably the most important one. They can get side-tracked by buzzwords and trendy techniques far removed from the business context. That is why it’s so important that analysts learn to excavate insights that can be acted upon and to present them in formats that are both visually compelling and digestible. Analysts are tech-savvy investigative reporters who make insights accessible.
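To give a flavour of the A/B-test side of that toolkit, here is a minimal sketch of a two-proportion z-test, one common way of checking whether two variants convert at different rates. The function name and the visitor and conversion counts are hypothetical and chosen purely for illustration. A real analysis would also need to consider sample-size planning and whether the test’s assumptions hold.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 5,000 visitors per variant.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

The statistic itself is the easy part. Turning the result into a recommendation the business can act on is what the role is really about.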
Data Scientist – Job Descriptions
Turning now to a similar example of a typical data scientist role from Indeed, we’ll explore some of the key differences. The first noticeable one is the length of the ‘must have’ and role responsibility sections. Certainly, much more is demanded of the average data scientist than of a data analyst, which explains in part why the former commands a better salary than the latter. But is there substance behind the hype, or is data science merely a modern myth?
To be fair, data scientists are expected to be more than analytical wizards. They are supposed to be builders who employ advanced programming to create pipelines that predict and recommend in production environments with near-perfect accuracy. Compared with analysts, who are like investigative reporters, they are oriented a lot more towards product development than consulting, although a data scientist is also required to provide data-led commercial advice. Some say the title was coined to signal that the role is a confluence of three fields: maths/statistics, computer science and domain expertise. The following quote is said to encapsulate that best: “A data scientist is someone who is better at statistics than any software engineer, and better at software engineering than any statistician.”
Differences – Data Analysts vs. Data Scientists
Greater volumes of data mean higher stakes, and higher expectations too. Unlike analysts, who would on average be given spreadsheets with 500 thousand rows and 50 columns to make sense of on their first day, data scientists will likely be handed the keys to terabytes of data with tens of thousands of columns on day one. Everyone then expects them to magically summon gems of insight and wisdom out of those volumes. Left to their own devices, they will be expected to ingest, transform, explore and model enormous volumes of messy and unstructured data. As some witty writers on Medium have said: “Data scientist is a title that conjures up almost mystical abilities of a person garnering insights from deep data lakes with ease, someone who has supernatural hands for data like a 21st century Houdini!”
Data science is a lot more coding-intensive. Even though data scientists and data analysts obtain data with the same objective in mind, their approaches and tools differ substantially. While data analysts mainly use SQL dialects to pull manageable chunks of data into spreadsheets and programming interfaces such as RStudio and Jupyter Notebooks, data scientists are expected to be comfortable working in cloud and distributed computing settings (AWS, Databricks, Hadoop, etc.).
There they ingest, process and model volumes of data whose magnitude is often referred to as big data. In view of that, it’s easy to see why data science job adverts carry those ridiculously long lists of tech-stack requirements. New hires in larger organisations inevitably inherit heaps of sometimes undocumented legacy scripts and custom algorithms that they need to either replace or maintain. With that in mind, it’s no wonder that advanced programming skills are a must-have there, whereas they are only a nice-to-have in most entry-level data analyst positions.
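To make the analyst-side workflow concrete, here is a minimal sketch of pulling a manageable, pre-aggregated chunk of data with SQL before it lands in a notebook or spreadsheet. It uses Python’s built-in `sqlite3` module with an in-memory database. The `orders` table, its columns and its rows are all invented for illustration, and a production warehouse would use its own SQL dialect and connection library.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table; all names and
# rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, order_date TEXT, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("North", "2023-01-15", 120.0),
        ("North", "2023-02-03", 80.0),
        ("South", "2023-01-20", 200.0),
        ("South", "2022-12-30", 50.0),
    ],
)

# Aggregate inside the database so only a small summary leaves it.
query = """
    SELECT region, COUNT(*) AS n_orders, SUM(revenue) AS total_revenue
    FROM orders
    WHERE order_date >= ?
    GROUP BY region
    ORDER BY region
"""
summary = conn.execute(query, ("2023-01-01",)).fetchall()
conn.close()

for region, n_orders, total_revenue in summary:
    print(region, n_orders, total_revenue)
```

Filtering and grouping in the query keeps the extract small enough for a spreadsheet. At big-data scale that same step typically moves to a distributed engine, which is where the longer tech-stack lists come from.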
Data analysts are a lot more connected to business stakeholders. As we’ve seen in our long exposition of the differences, data analyst jobs are in fact less coding-intensive, which reveals a rather more subtle point. The careful observer of the tech world would confirm that technical complexity almost always comes with barriers. These barriers create voids between decision-making stakeholders and hands-on engineers and scientists. That, in turn, is the space product managers fill to bridge the communication gap. And since data science work is commonly surrounded by a fog of mystery, ordinary employees of a firm tend to prefer reaching out to analysts for help.
It’s a phenomenon I’ve frequently noticed in the world of business: data scientists tend to be more siloed. Data analysts, on the other hand, tend to be more involved and engaged with other business units, readily helping with issues such as fixing Excel spreadsheets, aiding client pitches with analytical teasers and contributing to overall business performance with dashboards. So, if you are more of a consultant who likes to make a difference in the micro context, an analyst position would be far more fulfilling. In short, the grass is not always greener on the other side of the fence.
Data Analysts vs. Data Scientists – Conclusion
All things considered, we’ve explored how the professions of data scientist and data analyst compare and contrast. By looking at sample job advertisements, we’ve reached an understanding of how they differ in programming intensity, the data volumes used in modelling, sophistication with regard to automation, and the educational backgrounds required. Sure, we anticipated differences. But surprisingly, we’ve also come to appreciate how similar the two kindred professions actually are. In essence, both seek to retrieve insights from datasets.
References