<!DOCTYPE html>
|
|
<html xmlns="http://www.w3.org/1999/xhtml" lang="en">
|
|
<head>
|
|
<meta charset="utf-8" />
|
|
<title>Data Scientist</title>
|
|
</head>
|
|
<body>
|
|
|
|
|
|
|
|
<article>
|
|
<hgroup>
|
|
<h1>Data Scientist</h1>
|
|
<p><small>(<a href="../">Artificial Intelligence</a>)</small></p>
|
|
</hgroup>
|
|
<section>
|
|
<address class="h-card">
|
|
by
|
|
<a rel="author" class="u-url" href="http://changelog.ca/"><span class="p-given-name">Charles</span> <span class="p-additional-name">Iliya</span> <span class="p-family-name">Krempeaux</span></a>
|
|
</address>
|
|
</section>
|
|
<section>
|
|
<p>
|
|
A <strong>data scientist</strong> is someone who does <ziba-link>data science</ziba-link>.
|
|
And typically they do it while working at a company.
|
|
</p>
|
|
</section>
|
|
<section>
|
|
<h2>Data Scientist Role and Title</h2>
|
|
<p>
|
|
The title of "<strong>data scientist</strong>" does not exist in-a-vacuum — does <em>not</em> stand alone, cut off from other influences, without links to the outside world.
|
|
"<strong>Data Scientist</strong>" is a title associated with a type of role (or roles) that companies hire for.
|
|
</p>
|
|
<p>
|
|
Who is a <strong>data scientist</strong>‽
|
|
Although an unsatisfying answer to some — in practice a <strong>data scientist</strong> is anyone a company gives the title of "<strong>data scientist</strong>" to.
|
|
</p>
|
|
<p>
|
|
What types of skills and specializations are more likely to get someone hired as a <strong>data scientist</strong>‽
|
|
It depends on the company, but —
|
|
</p>
|
|
<p>
|
|
When working at a company, many <strong>data scientists</strong>:
|
|
</p>
|
|
<ul>
|
|
<li>do <ziba-link transform="lowercase">A-B testing</ziba-link>,</li>
|
|
<li>do <ziba-link>fraud detection</ziba-link>,</li>
|
|
<li>work on <ziba-link>online advertising</ziba-link>, or</li>
|
|
<li>create <ziba-link>visualization</ziba-link>s for others at the company they are working at.</li>
|
|
</ul>
|
|
<p>
|
|
Not all <strong>data scientists</strong> do this type of work — but many do.
|
|
</p>
|
|
</section>
|
|
<section>
|
|
<h2>Data Analysts</h2>
|
|
<p>
|
|
In practice, the name "<strong>data scientist</strong>" at many companies was just a new name for "<ziba-link>data analyst</ziba-link>".
|
|
Many <strong>data analysts</strong> became <strong>data scientists</strong> after the title of <strong>data scientist</strong> became popular, continuing to do the exact same job at the exact same company, just with a newly popular title.
|
|
</p>
|
|
<p>
|
|
This type of changing of titles is common in the tech-industry.
|
|
It is similar to how <strong>system administrators</strong> (<abbr title="system administrators">sysadmins</abbr>) became <strong>operations</strong> (<strong><abbr title="operations">ops</abbr></strong>) specialists, and then became <strong>DevOps</strong> people, and then became <strong>site reliability engineers</strong> (<strong><abbr title="site reliability engineer">SRE</abbr></strong>).
|
|
</p>
|
|
<p>
|
|
At many companies the same people doing the same work for the same job simply had their title changed to whatever became popular at the time.
|
|
</p>
|
|
</section>
|
|
<section>
|
|
<h2>Software Developers</h2>
|
|
<p>
|
|
Although <strong>"data science"</strong> and the title of <strong>"data scientist"</strong> hadn't been coined yet — in the 2000s, it was more common for some of the more technical things that <strong>data scientists</strong> do to just be part of <strong>software development</strong>.
|
|
Especially for the things that would have been called "<strong>artificial intelligence</strong>" at the time.
|
|
</p>
|
|
<p>
|
|
However, this started to change around 2012.
|
|
Some of these types of activities started to be done by people with <em>little</em> to <em>no</em> (often <em>no</em>) <strong>software development</strong> background.
|
|
</p>
|
|
<p>
|
|
From a <strong>software developer</strong> point-of-view, most <strong>data scientists</strong> are '<strong>users</strong>'.
|
|
The vast majority of <strong>data scientists</strong> will <em>never</em>, for example, implement an <ziba-link>artificial neural network</ziba-link>, or implement a <ziba-link>training</ziba-link> algorithm, or implement a visualization rendering algorithm, etc.
|
|
And the vast majority of <strong>data scientists</strong> will <em>never</em> do these things because they do <em>not</em> have the skills to do it.
|
|
The vast majority of <strong>data scientists</strong> use tools created by <strong>software developers</strong>.
|
|
</p>
|
|
<p>
|
|
Most <strong>data scientists</strong> cannot program.
|
|
But some can.
|
|
</p>
|
|
<p>
|
|
But of the minority of <strong>data scientists</strong> who can program, the vast majority do <em>not</em> program like a <strong>software developer</strong>.
|
|
These <strong>data scientists</strong> use programming languages (such as the <ziba-link>r programming language</ziba-link> or the <ziba-link>python programming language</ziba-link>) in a way that a <strong>software developer</strong> would use <ziba-link>bash</ziba-link> from a <ziba-link>computer-terminal</ziba-link>.
|
|
I.e., most of the <strong>data scientists</strong> who can program use programming languages to just issue commands.
|
|
</p>
|
|
<p>
|
|
For an analogy that may help make this clearer —
|
|
When a <strong>software developer</strong> writes a program, they are (metaphorically speaking) creating a <em>robot</em> that is expected to run independently 24 hours a day, 7 days a week, and be able to function independently in the world.
|
|
However, when a <strong>data scientist</strong> writes a program, it is (from a <strong>software developer</strong>'s point-of-view) a very, very 'hacky' tool that the <strong>data scientist</strong> would have to manually use themselves, and perhaps modify each time they use it.
|
|
</p>
|
|
</section>
|
|
<section>
|
|
<h2>Physicists</h2>
|
|
<p>
|
|
There has been a type of scam that has been going on in universities with the study of <strong>physics</strong>.
|
|
</p>
|
|
<p>
|
|
Many (probably most) people who <em>go to</em> and <em>pay money for</em> university do so for career and work related reasons.
|
|
They either want to get a higher paying job, after they graduate from university, than they would have been able to get without having gone to university.
|
|
Or they want to avoid doing physical labor, and want to be able to do a different type of job (that doesn't involve physical labor).
|
|
And they expect to do so by working in whatever field they studied.
|
|
</p>
|
|
<p>
|
|
The vast majority of people who study <strong>physics</strong> at university will never get a job as a <strong>physicist</strong>.
|
|
Never.
|
|
And the people working at the universities know this!
|
|
</p>
|
|
<p>
|
|
The people working at the universities know there are zero job prospects for the vast majority of these <strong>physicists</strong> they are graduating.
|
|
But not only do the people working at the universities not warn these students about this, they happily take their money while effectively letting them (and even encouraging them to) waste 4 to 10+ years of their lives getting a BSc, MSc, and PhD in <strong>physics</strong>.
|
|
</p>
|
|
<p>
|
|
What happens to these <strong>physicists</strong>‽
|
|
In the past, some became <em>software developers</em>.
|
|
But since the title of <strong>data scientist</strong> got coined, many of them have become <strong>data scientists</strong>.
|
|
</p>
|
|
<p>
|
|
In fact, so many <strong>physicists</strong> have become <strong>data scientists</strong> that they caused a culture change.
|
|
Some ways <strong>data science</strong> culture seemed to have changed as a result of all these <strong>physicists</strong> flooding <strong>data science</strong> are:
|
|
</p>
|
|
<ul>
|
|
<li>
|
|
the <strong>physicists</strong> who flooded <strong>data science</strong> brought <strong>credentialism</strong> to the hiring of <strong>data scientists</strong>, even making it the norm, where before it wasn't common and there was more focus on whether a candidate <strong>data scientist</strong> did or did not have the skills the company was looking for,
|
|
</li>
|
|
<li>
|
|
the <strong>physicists</strong> who flooded <strong>data science</strong> made it so that the skills associated with installing and operating a <strong>database</strong> became uncommon among <strong>data scientists</strong>, where before that wasn't the case — this resulted in the creation of a new role to compensate for this — the <ziba-link>data engineer</ziba-link>,
|
|
</li>
|
|
<li>
|
|
the <strong>physicists</strong> who flooded <strong>data science</strong> made it so that software development skills became uncommon.
|
|
</li>
|
|
</ul>
|
|
</section>
|
|
<section>
|
|
<h2>Universities</h2>
|
|
<p>
|
|
Once the title of <strong>data scientist</strong> became popular, many universities rushed to exploit this for financial gain.
|
|
</p>
|
|
<p>
|
|
Many universities quickly created <strong>data science</strong> programs.
|
|
Many of these university <strong>data science</strong> programs were created by people who never actually worked as a <strong>data scientist</strong>.
|
|
And thus it was difficult for them to know what skills companies actually wanted the <strong>data scientists</strong> they hired to have.
|
|
</p>
|
|
</section>
|
|
</article>
|
|
|
|
|
|
|
|
</body>
|
|
</html>
|