Measuring developer productivity has long been a holy grail of business and, like the Holy Grail, it has proved elusive. But based on our work with companies across industries, we think we may have found an approach that could work.
In 2020, McKinsey surveyed 440 large companies about their “developer velocity”, meaning the practices that make the most of development talent. The results were striking. Companies in the top quartile achieved sales growth four to five times faster than those in the bottom quartile. The top performers also saw 60% higher shareholder returns and 20% higher operating margins. Their customers were more satisfied, and their employees had a better experience.
This does not apply only to technology companies. In retail, for example, software is the fastest-growing job category, and about half of the world’s software engineers work outside the tech industry. There are currently approximately 27 million developers worldwide, of whom 4.4 million are in the United States. The U.S. Bureau of Labor Statistics has predicted that the number of software developers will grow by 25% between 2021 and 2031. Given the rise of generative artificial intelligence, this could well be a huge underestimate.
All this data leads to a simple conclusion: leaders need to know that they are using developer talent in the best possible way. That’s not easy. The relationship between input and output is murky, and software development is inherently collaborative and creative. Additionally, system, team, and individual productivity all need to be measured. Well-known metrics such as deployment frequency are useful when it comes to tracking teams, but not individuals. So it’s complicated. But we believe it is possible.
The developer productivity metrics that matter most
We believe this because we work with twenty technology, financial and pharmaceutical companies that are doing it. The results are not yet definitive, but they are promising. According to internal research, companies that acted on the following process saw improvements in customer-reported defects (down 20% to 30%), employee experience (up 20%) and customer satisfaction (up 60%).
This is how it works. We started with two established sets of metrics, developed by Google and Microsoft respectively: DORA (named after the DevOps Research and Assessment team), which measures outcomes; and SPACE (an acronym for satisfaction, performance, activity, communication/collaboration and efficiency), which is good at evaluating factors that affect optimization, such as interruptions. We then supplemented these with the following four ‘opportunity-based metrics’.
Time spent on inner/outer loop. The inner loop includes activities directly related to creating the software product: coding, building, and unit testing. The outer loop includes activities related to putting the code into production: integration, testing, release, and deployment. The more time developers spend in the inner loop, the more productive they are; top performers spend approximately 70% of their time there.
Developer Velocity Index benchmark. By comparing a company’s practices with those of its peers, it is possible to uncover specific areas for improvement, whether in backlog management, testing, or security and compliance. Greater maturity in development practices is associated with better business performance.
Contribution analysis. This means assessing individual contributions to a team’s backlog. With backlog management tools such as Jira, it is possible to identify trends that are harmful to optimization. The process can also reveal opportunities to solve problems that could harm performance, such as improving the work environment, increasing automation or developing individual skills. For example, one company found that its top contributing developers were spending too much time on non-coding activities. The company changed its operating model to ensure they could focus on what they did best.
Talent capability score. The idea here is to make sure the right people are in the right place. By using industry-standard capability maps, it is possible to create a score that summarizes the individual knowledge, skills and capabilities within a specific organization. This can reveal both gaps and over-concentrations. For example, one company found it had too many inexperienced developers. In response, it took action, including offering personalized learning paths, and moved 30% of its developers to the next level of expertise within six months.
Combined with DORA and SPACE, these tools create a more refined view of software productivity. The insights that emerge are intrinsically interesting, but the real value comes from using them to figure out how to keep developers motivated; whether they have the right tools and expertise; how they use their time; and whether staffing levels are correct.
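To make the first of these metrics concrete, here is a minimal sketch of how time spent on the inner and outer loop might be calculated from a developer's time log. It is an illustration, not the method used with the companies described above. The activity names, the `inner_loop_share` function and the sample week are all hypothetical, and the sketch assumes that time outside both loops (meetings, for example) is simply left out of the calculation.

```python
from collections import defaultdict

# Hypothetical activity labels, grouped according to the definitions in the text.
INNER_LOOP = {"coding", "building", "unit_testing"}
OUTER_LOOP = {"integration", "testing", "release", "deployment"}


def inner_loop_share(time_log):
    """Return the fraction of tracked hours spent on inner-loop activities.

    time_log is an iterable of (activity, hours) pairs. Activities that
    belong to neither loop are ignored (an assumption of this sketch).
    Returns None when no inner- or outer-loop time was logged.
    """
    totals = defaultdict(float)
    for activity, hours in time_log:
        if activity in INNER_LOOP:
            totals["inner"] += hours
        elif activity in OUTER_LOOP:
            totals["outer"] += hours
    tracked = totals["inner"] + totals["outer"]
    if tracked == 0:
        return None
    return totals["inner"] / tracked


# Made-up week for one developer, in hours (illustrative data only).
week = [
    ("coding", 18.0),
    ("building", 3.0),
    ("unit_testing", 7.0),
    ("integration", 4.0),
    ("testing", 3.0),
    ("release", 2.0),
    ("deployment", 3.0),
    ("meetings", 5.0),
]

share = inner_loop_share(week)
print(f"Inner-loop share: {share:.0%}")
```

In practice the hard part is collecting a reliable time log rather than doing the arithmetic, and whether to count time outside both loops in the denominator is a choice each organization would need to make for itself.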
Improving an imperfect model
As with the Holy Grail, there are those who think that measuring developer productivity is a myth and that we are wrong. But the twenty companies we work with disagree.
Furthermore, we do not accept that software engineering is so complicated or mystical that measurements are impossible. On the contrary, McKinsey has been able to estimate the improvements associated with using generative AI-based tools in several areas, including drafting new code and deploying updates.
The system we have described is undoubtedly imperfect, and we welcome criticism that can improve it. But given the ever-increasing importance of software development and the increasingly fierce competition for talent, it’s too important not to give it a try.