Data Scientists

And other made up titles...

The title ‘Data Scientist’ lit up my bullshit detector, but I took the job anyways.

I’d get to work with Python, and d3.js, even if much of what I was destined to work on was getting press to write about the company.

The LinkedIn recruiter mails that came in during the year I had 'Data Scientist' as my title were full of lunchtime anecdote gold. One company thought I could help them with their "big data problems," noting the size of their databases, promising "a whole lot of data to sink [my] teeth into." Recruiters contacted me about companies that desperately needed me to “tell the stories in their data in a beautiful and witty way.” Did I have interest in making a promising career move into data engineering or exploratory data science? And when could they pencil me in for a quick call to chat about helping to "design meaningful metrics and visualizations to track the health of their features and platforms?" I should note that their ideal candidate "must eat, sleep and love data."

Fancying up a basic knowledge of statistics and the ability to manipulate matrices - which is what many "data scientists" spend a lot of time doing - by elevating it to a science seems like a bit of a stretch. The joke by Svetlana Sicular gets passed around on Twitter and at conferences so much because it hits home: "data scientist is 1) a data analyst in California or 2) a statistician under 35."

But, judging from the flood of InMails, emails, and intros I received, it seems many companies - from big telephone companies to small startups - want a data scientist, but each company's idea of what they want is vastly different.

Data Scientist as Oracle

The other day I was talking with Kate Losse, an early Facebook employee and author of The Boy Kings, about how the title Data Scientist came into use there. In her memory, it seemed to happen suddenly: “in 2007 there was no data science and in 2008 there was.”

Indeed, the Harvard Business Review writes that the title Data Scientist was coined in 2008 by “D.J. Patil and Jeff Hammerbacher, then the respective leads of data and analytics efforts at LinkedIn and Facebook.” A flood of articles, reports, and surveys was published about the people with the skill set to put all this new, “big” data to good use, to find insights. “Insights,” of course, served as code for revenue for the business. The early 2012 story about how Target knew from analyzing shopping patterns that a teen was pregnant before her parents did was widely admired, and mentioned by some of the men in tech that I interacted with at the time in a way that struck me as disturbingly covetous. Where I read the story as a breach of individual privacy by a corporation, they saw a business opportunity.

[Image: The Oracle character in The Matrix sits at a kitchen table while the Smiths, a group of men in suits, stand around her, appearing to question her. Caption: The Oracle character in The Matrix gives predictions and insights.]

Though many large companies had for years had people whose job was to be aware of what users were up to, now you could hire someone to look into the "deep trends" -- to be an oracle that could say what "the user" was doing, and what they were about to do. Data Scientist as Oracle rang false for me - especially at the job I held at the time, where much of the data analysis and visualization I did was done in the name of getting press. These stories and links could then be leveraged for credibility and SEO, and endlessly reused by funneling traffic to them through paid search. Having an engineering and product design background, I found myself putting “working on product” on a pedestal in my mind, ranking marketing work lower -- yet marketing is much of what Data Scientists are drooled over for.

The Oracle nature was often reflected in the hopeful wording of the job postings and recruiter emails I received. Who else but an oracle could build an "algorithm to weight for new customers, Contribution Margin and other signals of intent," transforming the act of marketing into science? What else does "Signals of Intent" mean other than predicting and forecasting, from mouse clicks and metadata, which customers are worth cultivating and then automatically turning the dials on the recommendation algorithms, the ad optimization algorithms?

A Crack in the System

The HBR story, published in late 2012 as the hype cycle continued to rev up, called data science "the sexiest job of the 21st century." The sexualization of the Data Scientist, both overt and implied, made having one even more desirable to companies. Merely working near a Data Scientist seemed to convey sexy benefits -- at one point last year I noticed that the company I worked for at the time advertised working with me as a perk that potential new hires would enjoy, along with snacks and a budget for housecleaning.

While I found that a bit disturbing, I often just inwardly giggled at being introduced at events where I was speaking as "Data Scientist" or having the title on my business card. Many other women I know who have the title also put air quotes around it, feeling compelled to in some way acknowledge the absurdity in the statement. Many of the data scientists I’ve run into have academic backgrounds, MS degrees, have done "actual science" and analyzed numbers for some purpose not directly in the service of increasing traffic numbers or predicting and optimizing customer flow. Or maybe it’s that so much time is often spent doing not-so-glamorous tasks, like scraping and cleaning data, getting it into usable shape so we can even begin.

There's something interesting going on here with the invention of the title "Data Scientist" - specifically, there's something about the fact that it's technical yet not engineering that seems to make it easier for people from an academic background, often women, to get access to high tech salaries. By being branded as "technical," the position of data scientist generally commands salaries at about the level of developers/engineers. A story in Tech Republic notes that the median salary of a data scientist is about twice that of someone with the title "business analyst" or "data analyst" - potentially opening up access to higher salaries for women from academic, statistical, or modeling backgrounds. And women make up 46% of data analysts, by some studies.

So at first, I mentally filed "Data Scientist" in the same place as "Growth Hacker" - just another one of those words men made up to masculinize and make important the work traditionally done by women.

But now I've come to start to see the invention and naming of these new types of technical roles as a crack in the system.

In conversation, there's this sense among many of the Data Scientists who are women that the position is more appealing because it's often NOT part of the development team - providing a place for doing technical work outside of one of the most male-dominated teams in tech. It is also significant in providing an alternative stereotype that hiring managers can latch onto (“Aha! academic backgrounds, statisticians!”). The alternative conception of the Data Scientist, combined with the overwhelming desire to have one, created access points to highly paid work in tech for people who might have otherwise been passed over.

Not a crack made on purpose, and not very wide. Still, many people will be left out of this and remain relegated to lesser-paying and gendered work. Though there are some Data Scientist focused classes and “dev-school” type trainings popping up, the hiring criteria still generally involve having a strong academic background, which isn’t accessible to many and has pipeline problems of its own; whereas in programming the “self-taught” thing is more acceptable. But it’s good to see new types of technical work emerging and being branded that have better representation of women, and might provide new types of role models for future tech workers and new images of what technical work means.

We will need to watch attrition in this field and see if it proves to be a healthier career path, but still, this trend of creating new types of technical roles is something to watch.