C is Manly, Python is for “n00bs”: How False Stereotypes Turn Into Technical “Truths”

We need to question our “objective” and “technical” opinions about programming languages.

by Jean Yang & Ari Rabkin on January 20th, 2015

Language-based snobbery litters the software industry. As far back as 1991, the derisive term “toy language” was being applied to the Pascal language despite its substantial industrial use. We’ve heard computer science PhDs explain they were embarrassed to know Python “because it’s a language for idiots.” Linux creator Linus Torvalds once commented, “C++ is a horrible language… made more horrible by the fact that a lot of substandard programmers use it, to the point where it’s much much easier to generate total and utter crap with it.” Languages and tools divide programmers into more cliques than a high school cafeteria.

Google search results autocomplete for "Python is for", including "Python is for string comparison", "Pythone is for aspergers geeks" and "Python is for noobs".

Google searchers have strong feelings about Python.

More than just popularity is at stake: these language-based cliques determine who calls the shots. Subjective perceptions about languages impact more than just debates between idling programmers, affecting important decisions like hiring and funding. As an intern at companies like Google, Microsoft, and Facebook, Jean has heard industry experts state things like, “I don’t hire people who know .NET.” This comment is laden with the same kind of implicit bias as “I don’t trust a guy who comes to pitch in a suit”–full of assumptions and emphasizing cultural affiliations over talent.

Judgments about language use, despite being far from “objective” or “technical,” set up a hierarchy among programmers that systematically privileges certain groups. Software engineers sometimes deride statistical analysis languages like R or SAS as “not real programming.” R and SAS programmers, in turn, look down at spreadsheet developers. Software engineers also distinguish between front-end (client-facing) and back-end (server) code, perceiving writing server code to be more “real.”

What is considered “real programming” becomes just another bias to overcome for those already marginalized. One Reddit commenter describes her struggle to prove herself as a “real” programmer: “churn out elegant front-end code, submit to the repository, and wait for the ‘wow, you wrote this?’… eventually receive grudging acknowledgement that, ‘hey, you’re pretty good for a girl’; churn out elegant back-end code and submit to demonstrate that the qualifier ‘for a girl’ was unwarranted and insulting, that I was good, period…” While those on the outside are still struggling to prove themselves, the technically privileged have gone ahead to determine what the software that runs our lives should look like.

Life is easy if we can believe that knowing a “good” language signals a strong programmer while knowing a “bad” one warns of incompetence. Many of our notions about who uses what language are not based as much on “objective” or “technical” properties as we would like to believe. In reality, these biases are less about the languages than their communities, and much more social constructs than technical ones.

Our Stereotypes Are Often Wrong

The main reason we should stop having so much faith in our language-based stereotypes is that they’re often wrong.

One assumption people make is about age and languages: that older developers are outmoded–and that they know older, outmoded languages. Developers even say that they are embarrassed to know older, outmoded languages for fear that people will think they are old. As one commentator told the New York Times, “Baby Boomers and Gen Xers tend to know C# and SQL,” while “Gen Y knows Python, social media, and Hadoop.” But in sociological surveys of programmers that Ari and his UC Berkeley colleague Leo Meyerovich conducted in 2011, there was no substantial difference between the languages that older versus younger developers knew.

Slashdot comment that reads: Bah, Python is for girls anyways. Everybody knows that PERL is the language of true men.

There is also a gendered perception of language hierarchy with the most “manly” at the top. One Slashdot commenter writes, “Bah, Python is for girls anyways. Everybody knows that PERL is the language of true men.” Someone else responds, “Actually, C is the language of true men…” Such views suggest that women might disproportionately use certain languages, but Ari and Leo found in their programmer surveys that knowledge of programming languages is largely equivalent between genders. Women are slightly more likely to know Excel and men are slightly more likely to know C, C#, and Ruby, but not enough to establish any gendered hierarchy.

A major reason to eradicate these false stereotypes is that they perpetuate biases against women. Evidence shows that a hostile culture contributes to the “leaky pipeline,” the phenomenon of women leaving tech despite having the interest, skills, and education. (Despite higher numbers of women earning technical degrees, women make up 25% of the tech workforce and less than 15% of the technical positions.) In addition to making women feel underappreciated, viewing “feminine” skills as inferior makes people feel justified in rejecting female candidates or passing them over for promotions. Women seem to get a raw deal even though these “feminine” languages are not underappreciated in reality: while programmers using “girly” languages like Ruby and Python are actually among the most highly paid, there is still evidence that the gender wage gap in tech skews against women.

It is also important to note that women pioneered many forms of programming now viewed as “masculine” or “manly.” Ada Lovelace wrote the first computer program for Babbage’s Analytical Engine. It was six female mathematicians who programmed the ENIAC, the first fully electronic general-purpose computer. Despite perceptions that assembly “hacking” is masculine, it was actually a woman–Kathleen Booth–who created the first assembly language. For decades, the number of women studying computer science was growing faster than the number of men–until 1984, roughly the same time personal computers became popular in US homes.

Lego figure of Ada Lovelace.

Photo by Maia Weinstock, published here with permission.

The “Language Wars” Are Not About the Languages

These stereotypes come about because programmers confuse their strong views about languages with their views about the users of the languages. Consider, for instance, this anonymous Reddit comment: “Node.js is not popular because of its non-blocking features. it is popular because now dumb javascript devs can write server code. Earlier they had to learn real languages like Perl/Java/Python etc. to do that.”

Our commenter assumes that using JavaScript, as opposed to a “real” language, is a sign of incapacity. But in technical terms, JavaScript, Perl, and Python are fairly similar (they are all interpreted, dynamically typed, multi-paradigm languages). The only difference is that our commenter has ideas about who uses each language–and about the languages that “real programmers” use.

Despite the extremes that the term “language wars” may suggest, mainstream programming languages are often technically similar. Sure, there are different paradigms and “pure” languages for each, for instance Smalltalk for object-oriented and Haskell for functional programming. There are also theoretical frameworks for comparing languages. Popular mainstream languages, however, are such a mix of features that theory provides little guidance on which is best. For instance, Python and JavaScript both support objects, functional idioms, and imperative styles. Both are usable for a range of programming, from front-end web scripting to backend high-performance computing. The reality is that arguments like “Language X is good/bad because it has paradigm Y” have little technical basis.
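To make the point about mixed paradigms concrete, here is a minimal sketch in Python of one small task (summing the squares of the even numbers in a list) written three ways. The function and class names are ours, invented purely for illustration; a JavaScript version would look much the same.

```python
from functools import reduce


def sum_even_squares_imperative(numbers):
    """Imperative style: an explicit loop that mutates an accumulator."""
    total = 0
    for n in numbers:
        if n % 2 == 0:
            total += n * n
    return total


def sum_even_squares_functional(numbers):
    """Functional style: filter and reduce, with no mutation."""
    evens = filter(lambda n: n % 2 == 0, numbers)
    return reduce(lambda acc, n: acc + n * n, evens, 0)


class EvenSquareSummer:
    """Object-oriented style: the data and the behavior live on one object."""

    def __init__(self, numbers):
        self.numbers = list(numbers)

    def total(self):
        return sum(n * n for n in self.numbers if n % 2 == 0)


data = [1, 2, 3, 4]
results = (
    sum_even_squares_imperative(data),
    sum_even_squares_functional(data),
    EvenSquareSummer(data).total(),
)
print(results)
```

All three versions compute the same answer in the same language, which is why labeling a mainstream language by a single paradigm tells us so little about it.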

These preconceived biases arise because programming languages are as much social constructs as they are technical ones. A programming language, like a spoken language, is defined not just by syntax and semantics, but also by the people who use it and what they have written. Research shows that the community and libraries, rather than the technical features, are most important in determining the languages people choose. Scientists, for instance, use Python for its good scientific computing libraries.

Languages often spread in disjoint real-world communities, making it easy for false perceptions to arise about a language’s user base. In addition to factions within computer science (machine learning people use Matlab; systems programmers use C; programming language researchers love Haskell), there are also factions across programmers in general. There are specialized astronomer languages (IDL), systems administration languages (Perl), economist languages (Stata), and statistician languages (R). Using an unfamiliar language becomes a proxy for belonging to an unfamiliar community–and becomes associated with all of the relevant stereotypes and biases.

Especially as more languages come about, socioeconomic factors can play a significant role in determining which languages programmers learn. A programmer learns a language not in a vacuum, but through working on substantive projects, typically with the help of more expert programmers of the language. Thus the path to a language depends on education and employment history rather than solely on personal choice. Students learn the languages they are taught in school and working programmers learn the languages that their employers specify. Judging a programmer for knowing a “low-status” language is, then, often based on socioeconomic factors rather than technical aptitude. For long-term employment it may be better to look for other signals of competence such as project experience.

Knowledge of certain languages can signal cultural allegiance and socioeconomic affiliation more than technical skills. Strongly statically typed languages such as Haskell and Idris have theoretical advantages in some domains, but many consider them to be more research languages than ones that are ready for industrial use. Programmers constrained on time or money will tend not to choose these first. Knowledge of such languages is often limited to the programmers who learned them in school (often elite institutions or in graduate school) or have sufficient leisure time–and access to a community–for self-teaching. The bias goes both ways: there are people who overvalue knowledge of these languages and also people who dismiss the technical value of these languages simply because they are “academic” or “elite.” To avoid perpetuating social bias, both sides should be more open-minded to the languages of other social groups.

As social constructs, programming languages are yet another channel for social bias to masquerade as “objective” and “technical” facts–and thus perpetuate existing social hierarchies. The community seems to have closed the loop on the recursive arguments “X and Y are the best languages because the smartest people use them” and “These are the smartest people because they use languages X and Y”–where “smartest” seems to have replaced “highest-status” without our noticing. It may be inevitable for the “in-group” to perpetuate the existing social hierarchies that benefit them. And based on the diversity numbers as well as evidence of a wage gap, it seems that these hierarchies are indeed keeping people out. Fortunately, programming languages give us a technical framework for challenging the dominance hierarchies associated with them. With a little work, we should be able to prevent technical hierarchies from copying the biases in existing social hierarchies.

What Now?

Lego figure of Grace Hopper.

Photo by Maia Weinstock, published here with permission.

More people are learning to program than ever before–and have access to more languages. This is exciting, because in this new technological world we have the chance to avoid reproducing the injustices of existing social hierarchies. We need to be careful about how we use languages to let some people in while we keep others out. We should not beckon people into our field only to then ostracize them for the sin of learning PHP. It is important to look beyond what the self-appointed guardians of “real programming” have decreed.

Change begins with small steps, such as giving a programmer a second look even if they don’t know a language you deem “real” or fashionable. We should remember that “knowing” a language is a poor proxy for being able to think rigorously. It is one thing to have learned the syntax of a language and another to have grasped the underlying paradigms. As Ed Post observed decades ago, “the determined Real Programmer can write FORTRAN programs in any language.” Especially given that it takes only a few months for a professional software engineer to learn most mainstream languages, we encourage employers to make hiring decisions based on better metrics than the languages that a candidate knows.
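The gap between knowing a syntax and grasping a language’s idioms is easy to see in code. Below is a hedged sketch in Python: two hypothetical functions of our own invention that count word frequencies. The first is “FORTRAN in Python,” with parallel arrays and hand-managed indices; the second leans on the standard library’s `collections.Counter`.

```python
from collections import Counter


def count_words_fortran_style(words):
    """Valid Python syntax, but written with parallel arrays and index loops."""
    keys = []
    counts = []
    i = 0
    while i < len(words):
        j = 0
        found = False
        while j < len(keys):
            if keys[j] == words[i]:
                counts[j] = counts[j] + 1
                found = True
            j = j + 1
        if not found:
            keys.append(words[i])
            counts.append(1)
        i = i + 1
    # Copy the parallel arrays into a dictionary at the end.
    result = {}
    k = 0
    while k < len(keys):
        result[keys[k]] = counts[k]
        k = k + 1
    return result


def count_words_idiomatic(words):
    """The same result using a data structure built for the job."""
    return dict(Counter(words))


sample = ["c", "python", "c"]
print(count_words_fortran_style(sample) == count_words_idiomatic(sample))
```

Both authors could truthfully list Python on a résumé, which is part of why a list of known languages says so little on its own.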

As programmers, we should be more thoughtful about our language choices. As we’ve discussed, technical features are but one reason to use a language–libraries and community are other major factors. Given this knowledge, we should question our “objective” and “technical” opinions about programming languages. We should make a point of being open to languages off the beaten path, or that are less prestigious, especially if the relevant libraries and community are well-suited to the task at hand. Through making more rational language choices we can remove stereotypes in our own minds–and change the perceptions of those around us.

As a community, we need to do a more thorough analysis of the social aspects of programming. Ari’s work is one of the first to gather large-scale survey data to reveal hidden and surprising beliefs about programmers. Such work has pushed the social nature of programming to the forefront and gained acceptance of the accompanying empirical techniques.

Many of us like to think of the software industry as a meritocracy, rewarding those with the best skills who work the hardest. To truly achieve this, we need to remove the hidden biases that can cause us to exclude great programmers. And it is only through looking at technology as a social construct that we can make it a socially inclusive one.

We thank Kelly Buchanan, Cliff Chang, Tim Chevalier, Madeleine Corbett, Will Knight, Adam Marcus, Leo Meyerovich, and Frank Wang for their comments.