The 10x Engineer Is Complete BS

If you’ve spent a significant amount of time around engineering orgs, you’ve probably heard some flavor of the claim that the best developers are 10 times more productive than average. I’ve had people tell me that their hiring processes are so stringent because they only hire 10x engineers. If 10x engineers do exist, that has interesting ramifications for how you build teams – I’d gladly pay 5x the typical salary for someone who generates 10x the output, for example, and I’d also love to know if the 10x skillset can be trained. So let’s dig into the evidence around 10x engineers.

Where Did This Idea Come From?

The study that seems to have spawned this idea is described in this paper from 1968. The authors took 12 developers employed by their company, gave them two standardized programming assignments (termed “Algebra” and “Maze”), and measured how long it took them to complete the tasks, how quickly the programs executed, and a few other things summarized below. Six of the programmers performed the tasks in “offline” mode, meaning they had no access to a computer. The offline developers wrote and debugged programs in their heads, and when they wanted to execute the programs, they physically handed them to a second person who would go run them. Apparently, this is how things used to work back in the mainframe days.

The results are summarized in this table from the paper. For the offline group, debugging hours include 2 hours of waiting each time the program is run. Comparing the best performer to the worst performer showed the best folks were around 20x faster than the worst at coding and debugging tasks. There was also a second study with 9 developers showing similar results.

Oh god, where do we start…

Trying to draw conclusions about all developers from a SAMPLE SIZE OF 12 is not so great. Those 12 developers were not just random people either. They all worked for the same company, which raises issues around selection bias.

Another major problem is that the results only compared the best against the worst in each individual category. This makes absolutely no sense. First off, comparison against the median is far more meaningful, especially if we assume something resembling a normal distribution with a few extreme outliers. Second, the comparisons look at each category in isolation – maybe the fastest coder was the slowest debugger, meaning the difference in total time to fully accomplish the task is likely much less than 20x.
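To make that second point concrete, here is a minimal Python sketch. The three programmers and their hours are numbers I made up for illustration; they are not taken from the 1968 paper. It just shows how a 20x gap in each category can shrink to almost nothing once you add the categories up per person.

```python
# Hypothetical hours for three programmers.
# Invented for illustration; NOT data from the 1968 paper.
hours = {
    "A": {"coding": 2, "debugging": 40},
    "B": {"coding": 20, "debugging": 10},
    "C": {"coding": 40, "debugging": 2},
}

def worst_to_best_ratio(values):
    """Ratio of the slowest time to the fastest time."""
    return max(values) / min(values)

# Best vs. worst within each category, looked at in isolation.
coding_ratio = worst_to_best_ratio([h["coding"] for h in hours.values()])
debugging_ratio = worst_to_best_ratio([h["debugging"] for h in hours.values()])

# Best vs. worst on the total time each person needed to finish the task.
totals = [h["coding"] + h["debugging"] for h in hours.values()]
total_ratio = worst_to_best_ratio(totals)

print(f"coding: {coding_ratio:.1f}x, debugging: {debugging_ratio:.1f}x, total: {total_ratio:.1f}x")
```

In this made-up case each category shows a 20x spread, but the spread in total time is only 1.4x, because nobody is the slowest at everything.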

Oh, and there’s the part where HALF OF THE PARTICIPANTS HAD NO ACCESS TO A COMPUTER AND HAD TO WAIT TWO HOURS EVERY TIME THEY WANTED TO RUN THEIR PROGRAM. Just a hunch, but that may have had something to do with the poorest performers taking a long time to accomplish the tasks.

Are there any better studies?

Ok, the study that started the 10x idea is not particularly compelling. What else is out there? This paper is much more interesting – professors at Carnegie Mellon measured 494 students with professional programming backgrounds from 2000-2006 to assess the relative effort on ten different programming assignments, and the results are reproduced below.

A few interesting trends stick out. First, the poorest performers are sooooo much worse than the best, easily replicating the 20x performance difference reported in the 1968 study and often exceeding it. Second, the 25th, 50th, and 75th percentiles are pretty tightly bunched together, showing a pretty consistent ~2x difference between the 75th and 25th. The best performers never reach 10x the productivity of the median; on average they are ~5x more productive.
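If you want to run the same kind of comparison on your own numbers, a rough sketch in Python might look like this. The `hours` list is invented for illustration and is not data from the Carnegie Mellon study, and `spread_summary` is just a hypothetical helper name.

```python
from statistics import median, quantiles

# Hypothetical hours-to-complete for ten programmers on one assignment.
# Invented for illustration; NOT data from the Carnegie Mellon study.
hours = [2, 8, 9, 10, 11, 12, 14, 16, 20, 200]

def spread_summary(times):
    """Compare the extremes and the quartiles of a list of completion times."""
    q1, _, q3 = quantiles(times, n=4)  # 25th, 50th, 75th percentile cut points
    mid = median(times)
    return {
        "slowest_vs_fastest": max(times) / min(times),
        "median_vs_fastest": mid / min(times),
        "q3_vs_q1": q3 / q1,
    }

summary = spread_summary(hours)
for name, ratio in summary.items():
    print(f"{name}: {ratio:.1f}x")
```

With these made-up numbers, a single very slow outlier produces a 100x gap between slowest and fastest, while the middle half of the group sits within roughly 2x of each other.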

Right now you’re saying to yourself, “if someone is consistently 5x better than the median, that’s massive!” And you’d be right, so let’s go to the data to see if they are consistently better.

This chart shows the range of results for each student sorted by their median rank, so the rightmost bar represents the top performer. Interestingly, there were a few top performers that were consistently at the top of the rankings (i.e., the vertical gray bars are short on the far-right part of the chart). The vast majority of people had very little consistency though. Restated, there was a huge variation in how people performed across assignments. The average programmer had a difference of 249 places between their high and low rank for the 10 assignments (i.e., over half of the class). 482 of the 494 students finished at least one project faster than the median. This has significant implications for management, since one can get huge productivity gains by ensuring people are at their best, rather than trying to hire more effective people.
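The two consistency measures above (the gap between each person's best and worst rank, and how many people beat the median at least once) are easy to compute yourself. Here is a small Python sketch on four imaginary students and three assignments. The data is made up, it is not from the study, and it assumes there are no tied times.

```python
from statistics import median

# Hypothetical completion times in hours: one row per student, one column
# per assignment. Invented for illustration; NOT data from the study.
times = {
    "s1": [3, 4, 2],
    "s2": [5, 2, 9],
    "s3": [4, 8, 5],
    "s4": [9, 6, 3],
}

def ranks_per_assignment(times):
    """Rank students on each assignment (1 = fastest). Assumes no ties."""
    students = list(times)
    n_assignments = len(next(iter(times.values())))
    ranks = {s: [] for s in students}
    for a in range(n_assignments):
        ordered = sorted(students, key=lambda s: times[s][a])
        for position, s in enumerate(ordered, start=1):
            ranks[s].append(position)
    return ranks

ranks = ranks_per_assignment(times)

# Gap between each student's worst and best rank across assignments.
rank_spread = {s: max(r) - min(r) for s, r in ranks.items()}
average_spread = sum(rank_spread.values()) / len(rank_spread)

# Students who finished at least one assignment faster than that assignment's median.
n_assignments = len(next(iter(times.values())))
medians = [median(t[a] for t in times.values()) for a in range(n_assignments)]
beat_median_once = sum(
    any(t[a] < medians[a] for a in range(n_assignments)) for t in times.values()
)

print(rank_spread, average_spread, beat_median_once)
```

In this toy example only one student holds a steady rank, and everyone beats the median at least once, which is the same flavor of result as the real data.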

Ok, so evidence for the 10x engineer isn’t there, but maybe the 5x one exists. What other evidence do we have? Here are some papers that came up with 5x, 2x, 5x, and 3.5x differences between the median and the best. Some of these studies have flaws, but they all land in the same neighborhood which is quite interesting.

Does any of this even matter?

So 5x engineers are probably a thing that exists. But let’s take a step back and remember what those studies were mostly measuring, namely the time it takes individuals to complete a programming task. Any serious real-world project is going to involve a team of people though. There are no studies I know about that take into account how effective an individual is on the team. It’s easy to envisage scenarios where the 5x person writes code that’s incredibly difficult for others to maintain, or where a median code-contributor makes 5x contributions through mentoring other team members, fostering team cohesiveness, or bringing domain expertise. This isn’t even touching on the fact that programming is rapidly changing with the rise of generative AI, where it’s not hard to imagine a future in which the median engineer is on a much more level playing field with the people who are 5x better at individual tasks.

So to summarize, there’s no evidence that 10x engineers exist when measuring individual productivity. 5x engineers probably do, but they’re exceptionally rare. Aside from poor performers, most engineers fall within a relatively tight band of performance with very high individual variation, so your best bang-for-the-buck as a manager involves helping people be their most productive selves. I would love to see studies that measure productivity in terms of team contribution instead of skill at individual tasks – maybe 10x does exist when examined in that light.

Nate Clark's Writing