Are AI Coding Assistants Really Saving Developers Time? Study Suggests These Tools Don’t Increase Coding Speed.

Oct 04, 2024


A study by analytics firm Uplevel challenges claims that AI coding tools improve developer productivity. It found that using GitHub Copilot led to a 41% increase in bugs, raising concerns about code quality. Extended out-of-hours work fell in both the control and test groups, but it fell less for developers using GitHub Copilot, which suggests the tool does little to reduce burnout risk. Moreover, developers are spending more time reviewing AI-generated code, which may negate any potential time savings.

AI coding tools appear to have minimal impact on developer productivity.

The debate within the developer community continues, with differing opinions on the actual productivity benefits of AI coding assistants. This underscores the need for a careful evaluation of their real-world impact. While some companies report significant productivity gains from AI tools, others find that they introduce errors and complicate the debugging process. Junior developers, in particular, often struggle to match the efficiency of senior developers, even with assistance from AI tools.

Coding tools have been an obvious early application in the rise of generative AI, but a recent study by analytics firm Uplevel suggests that the anticipated productivity gains may be overstated, if they exist at all. Uplevel, which analyzes coding and collaboration data, reports that the use of GitHub Copilot resulted in a 41% increase in bugs.

“This suggests that Copilot may be negatively impacting code quality. Engineering leaders might need to investigate pull requests with bugs and implement safeguards for the responsible use of generative AI,” says the report, Can Generative AI Improve Developer Productivity.

The study measured pull request (PR) cycle time—the duration it takes to merge code into a repository—and PR throughput—the number of merged requests. It found no significant improvements in these metrics for developers using GitHub Copilot. These findings were part of Uplevel’s research, conducted to answer three key questions:

Does access to GitHub Copilot help developers code faster?
Does GitHub Copilot help developers produce better-quality code?
Does GitHub Copilot mitigate developer burnout?

Uplevel analyzed data from its customers, comparing the output of around 800 developers using GitHub Copilot over a three-month period to their performance during the three months prior to adoption. Beyond the rise in bugs, the firm’s two other findings were as follows:

No Significant Change in Efficiency Measures

“When comparing cycle time, throughput, and the complexity of pull requests (PRs) with and without tests, GitHub Copilot neither helped nor hindered developers in the sample, nor did it improve coding speed. While some of these measures were statistically significant, the changes had no meaningful impact on technical outcomes—for instance, cycle time decreased by just 1.7 minutes,” Uplevel’s report states.
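To make the two efficiency metrics concrete, here is a minimal Python sketch of how they could be computed from pull request timestamps. It is not Uplevel's method: the record layout, the sample data, and the choice to measure cycle time from PR creation to merge are all assumptions made for illustration.

```python
from datetime import datetime
from statistics import median

# Hypothetical PR records as (opened_at, merged_at).
# merged_at is None when the PR has not been merged.
PULL_REQUESTS = [
    (datetime(2024, 9, 2, 9, 0), datetime(2024, 9, 2, 15, 0)),
    (datetime(2024, 9, 3, 10, 0), datetime(2024, 9, 5, 10, 0)),
    (datetime(2024, 9, 4, 11, 0), None),
    (datetime(2024, 9, 6, 8, 0), datetime(2024, 9, 6, 20, 0)),
]


def cycle_times_hours(prs):
    """Open-to-merge duration in hours for every merged PR (assumed definition)."""
    return [
        (merged - opened).total_seconds() / 3600
        for opened, merged in prs
        if merged is not None
    ]


def median_cycle_time_hours(prs):
    """Median cycle time; the median is less sensitive to one very slow PR."""
    return median(cycle_times_hours(prs))


def throughput(prs, start, end):
    """Number of PRs merged in the half-open window [start, end)."""
    return sum(
        1 for _, merged in prs
        if merged is not None and start <= merged < end
    )


week_start = datetime(2024, 9, 2)
week_end = datetime(2024, 9, 9)
print("median cycle time (h):", median_cycle_time_hours(PULL_REQUESTS))
print("merged this week:", throughput(PULL_REQUESTS, week_start, week_end))
```

Seen this way, a 1.7-minute change in cycle time is tiny next to PRs that take hours or days to merge, which is why a change can be statistically detectable yet have no practical effect.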

Mitigation of Burnout Risk

Uplevel’s "Sustained Always On" metric (which tracks extended work outside of regular hours and is a leading indicator of burnout) decreased in both groups. However, it dropped by 17% for developers using GitHub Copilot and nearly 28% for those not using the tool, so the Copilot group saw the smaller improvement.

A Study Published by GitHub Came to Different Conclusions

“Uplevel’s study was motivated by curiosity about claims that AI coding assistants would become ubiquitous,” said Matt Hoffman, a product manager and data analyst at Uplevel. GitHub’s own research paints a more favorable picture: a survey it published in August 2024 found that 97% of software engineers, developers, and programmers reported using AI coding assistants. Other studies have yielded similar findings.

In its study, GitHub reports that over 97% of respondents have used AI coding tools at work at some point, a finding consistent across all four countries surveyed. However, a smaller percentage said their company actively encourages or allows the use of AI tools, with this varying by region. Key findings from the survey include:

The wave of generative AI in software development continues to grow. The survey expanded to 2,000 respondents, with nearly all (over 97%) having used these tools at some point, whether on or off the job (though not all companies have officially endorsed their use).
While many respondents said their companies welcome AI, there is still room for improvement. Survey data reveals that between 59% and 88% of respondents across different markets reported that “their company actively encourages or allows the use of these tools.”
Software development teams are recognizing more benefits to AI coding tools than previously acknowledged. These include creating more secure software, improving code quality, generating better test cases, and faster adoption of new programming languages. Ultimately, this has resulted in time savings that developers are dedicating to more strategic tasks.

“Respondents to our survey said AI helps them work more productively, using the time saved to design systems, collaborate more, and better meet customer needs. AI doesn’t replace human jobs; it frees up time for human creativity. Now, let’s dive into the research,” said Kyle Daigle, GitHub’s chief operating officer, in a blog post.

However, the Uplevel study challenges this view. While GitHub’s survey respondents report productivity gains, Uplevel found that developers are spending more time reviewing AI-generated code, which may offset any time savings. Similarly, earlier this year, GitClear reported that while AI assistants can aid in programming, they don’t always improve code quality and can introduce more bugs.

GitClear researchers found that AI tools like GitHub Copilot primarily offer suggestions for adding new code, without recommending updates or code removal. This often leads to redundant code. Additionally, they observed a sharp increase in "code churn," meaning that code is being modified frequently—typically a negative sign for code quality.

"Each new iteration of AI-generated code becomes less consistent as different parts are developed using varying prompts. As a result, the code becomes increasingly difficult to understand and debug, making troubleshooting so resource-intensive that it's sometimes easier to rewrite the code from scratch," said one user, noting that AI has yet to improve productivity.

A Cautious Strategy for Adopting AI Coding Assistants

The introduction of AI tools like GitHub Copilot raises several important questions: Will AI help developers work faster? Can it improve code quality and prevent burnout? "Not yet for this population. However, innovation is moving quickly, and GitHub has found that Copilot improves developer satisfaction," Uplevel responded in its report. Engineering leaders may want to consider a cautious approach to adopting Copilot in preparation for further advancements in the tool:

Set Specific Goals: Define clear outcomes for integrating GitHub Copilot into your team’s workflow. What specific improvements are you hoping to achieve?
Provide Team Training: Offer initial training to explain when and where GitHub Copilot should or shouldn’t be used, and establish safeguards to ensure proper implementation.
Continue Experimenting with Generative AI: Identify specific use cases where Copilot excels and refine prompts that yield the best results. Share successful strategies across the organization to replicate successes.
Monitor Technical Efficiency Metrics: Conduct A/B tests to gather objective, quantitative data on whether AI is truly improving developer productivity and helping achieve your operational goals.
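
As one way to act on the last recommendation, the sketch below compares a metric between a control group and a group using an AI assistant with a simple permutation test. It is only an illustration: the function name, the sample cycle times, and the choice of a permutation test are assumptions, not something prescribed by Uplevel or GitHub.

```python
import random


def permutation_test(control, treatment, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference in means.

    Returns (observed_difference, p_value), where the difference is
    mean(treatment) - mean(control).
    """
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    n_control = len(control)

    def mean(values):
        return sum(values) / len(values)

    observed = mean(treatment) - mean(control)
    pooled = list(control) + list(treatment)
    extreme = 0
    for _ in range(n_resamples):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[n_control:]) - mean(pooled[:n_control])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Hypothetical median PR cycle times in hours, one value per developer.
control_hours = [20.0, 22.5, 19.0, 24.0, 21.5, 23.0]
assistant_hours = [19.5, 22.0, 20.0, 23.5, 21.0, 22.5]

difference, p_value = permutation_test(control_hours, assistant_hours)
print(f"difference in mean cycle time: {difference:+.2f} h")
print(f"p-value: {p_value:.3f}")
```

Reporting the size of the difference alongside the p-value matters: as Uplevel's results show, a change can be statistically significant and still too small to affect technical outcomes.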

Sources: Uplevel, GitHub

Shenisha’s Substack
