AI Coding Tools Linked to Rising Software Defects

Key Takeaways:

AI-generated code shows a higher concentration of defects than human-written code.
Logic, security, and maintainability issues are disproportionately affected.
Strong review and testing controls are critical when using AI coding tools.

As AI coding tools gain widespread adoption in software development, new research suggests they may be accelerating defects instead of reducing them. The report finds that AI-generated code introduces substantially more issues, particularly across logic, security, and long-term maintainability.

According to a new study from CodeRabbit, AI-powered coding tools have raised concerns about increased bugs and incidents. The study analyzed 470 GitHub pull requests, 320 AI co-authored and 150 human-only, to identify patterns in code issues.

AI-generated pull requests have a significantly higher issue volume than human-authored ones, averaging about 1.7 times more problems. At the 90th percentile, the gap widens further, with AI PRs containing 26 issues compared to 12.3 for human PRs, which indicates heavier tails and more complex reviews. AI-generated code also introduces 1.4 to 1.7 times more critical and major defects, which increases the overall risk to production environments.

Issue types most commonly amplified by AI

CodeRabbit’s study found that AI-generated code amplifies several categories of issues relative to human-authored code. Logic and correctness problems are the most pronounced, with a 75% higher prevalence, and often manifest as algorithmic errors and misconfigurations. Readability issues are three times more frequent, and formatting or naming inconsistencies nearly double in frequency. Error-handling gaps, particularly in exception paths, occur almost twice as often, and security vulnerabilities are about 1.5 times more common, with improper password handling nearly doubling.

High-risk problem areas in AI-written code

AI-generated code also shows notable weaknesses in specific problem areas. Algorithm and business logic mistakes occur more than twice as often in AI pull requests, and null-pointer risks are similarly elevated, at 2.27 times higher. Security flaws such as improper password handling appear 1.88 times more frequently, and performance inefficiencies stand out, with excessive I/O operations occurring nearly eight times more often than in human-authored code.

Interestingly, humans perform worse in a few areas, particularly spelling errors and testability issues. These problems remain more common in human pull requests, which tend to include more inline documentation and descriptive comments.

Why do AI coding models make these mistakes?

AI-generated code tends to make mistakes for several reasons. These models often lack project-specific context, which leads to errors in business logic and configuration. They also prioritize surface-level correctness, which produces code that looks structurally sound but omits critical safeguards like null checks and robust control-flow handling.
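To make the null-check point concrete, here is a minimal sketch in Python. The Customer type, its discount_rate field, and both pricing functions are hypothetical names invented for this illustration; they do not come from the CodeRabbit study. The first function looks structurally sound but fails on missing data, while the second adds the kind of safeguard the report describes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Customer:
    name: str
    # Hypothetical field: None means no discount has been configured.
    discount_rate: Optional[float] = None


def find_customer(customers: dict[str, Customer], customer_id: str) -> Optional[Customer]:
    """Return the customer, or None when the ID is unknown."""
    return customers.get(customer_id)


def price_unguarded(customers: dict[str, Customer], customer_id: str, base_price: float) -> float:
    # Reads cleanly, but raises AttributeError for an unknown ID
    # and TypeError when discount_rate is None.
    customer = find_customer(customers, customer_id)
    return base_price * (1 - customer.discount_rate)


def price_guarded(customers: dict[str, Customer], customer_id: str, base_price: float) -> float:
    customer = find_customer(customers, customer_id)
    if customer is None:
        # Fail loudly with a clear message instead of an incidental AttributeError.
        raise KeyError(f"unknown customer: {customer_id}")
    # Treat a missing discount as zero rather than crashing on None.
    rate = customer.discount_rate if customer.discount_rate is not None else 0.0
    return base_price * (1 - rate)
```

Both versions behave identically on the happy path, which is why the unguarded one can pass a casual review; the difference only appears when the lookup or the optional field comes back empty.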

Additionally, AI struggles to consistently follow repository conventions, which results in naming, formatting, and style inconsistencies. Security practices can degrade as models replicate outdated or unsafe patterns, and resource usage remains naïve: generated code tends to favor clarity over efficiency, which leads to performance problems.
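As a rough illustration of how clear-but-naive resource usage can turn into the excessive I/O the study flags, the sketch below contrasts a per-row write loop with a batched write. It uses Python's built-in sqlite3 module; the results table and the function names are assumptions made for this example, not patterns taken from the report.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical results table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE results (name TEXT, score INTEGER)")
    return conn


def insert_naive(conn: sqlite3.Connection, rows: list[tuple[str, int]]) -> None:
    # Easy to read, but issues one statement and one commit per row.
    for name, score in rows:
        conn.execute("INSERT INTO results (name, score) VALUES (?, ?)", (name, score))
        conn.commit()


def insert_batched(conn: sqlite3.Connection, rows: list[tuple[str, int]]) -> None:
    # One executemany call inside a single transaction; the connection
    # context manager commits on success and rolls back on error.
    with conn:
        conn.executemany("INSERT INTO results (name, score) VALUES (?, ?)", rows)
```

Both functions store the same data. The batched version simply does it with far fewer commits, which matters more on a disk-backed or networked database than on the in-memory one used here.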

Reviewers should focus more on logic and correctness issues (Image Credit: CodeRabbit)

Best practices for reducing AI-powered code defects

Organizations adopting AI-assisted coding should implement targeted safeguards to reduce the risks associated with AI-generated code. The findings show that AI amplifies issues in logic, security, maintainability, and performance, so IT teams need proactive measures to prevent these defects from reaching production.

Organizations should provide project-specific context to AI tools to minimize logic and configuration errors, and enforce style and formatting rules through automated CI checks to address readability and naming inconsistencies. They should also strengthen correctness and safety rails by requiring comprehensive pre-merge tests, nullability checks, and standardized error-handling patterns.
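One possible shape for a standardized error-handling pattern paired with a pre-merge test is sketched below in Python. ConfigError, load_timeout, and the timeout_seconds key are hypothetical names chosen for this example; the study does not prescribe a specific pattern. The idea is that every failure path surfaces as one project-wide error type, and a small test exercises those paths before the code is merged.

```python
import json


class ConfigError(Exception):
    """Hypothetical project-wide error type for configuration problems."""


def load_timeout(raw: str) -> int:
    """Parse a JSON config string and return a validated timeout in seconds."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Wrap the parser error so callers only need to handle ConfigError.
        raise ConfigError(f"config is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ConfigError("config must be a JSON object")
    timeout = data.get("timeout_seconds")
    if timeout is None:
        raise ConfigError("timeout_seconds is missing")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(timeout, bool) or not isinstance(timeout, int) or timeout <= 0:
        raise ConfigError("timeout_seconds must be a positive integer")
    return timeout


def test_load_timeout() -> None:
    """Pre-merge test covering the happy path and each exception path."""
    assert load_timeout('{"timeout_seconds": 30}') == 30
    bad_inputs = ("not json", "[]", "{}", '{"timeout_seconds": null}', '{"timeout_seconds": -1}')
    for bad in bad_inputs:
        try:
            load_timeout(bad)
        except ConfigError:
            continue
        raise AssertionError(f"expected ConfigError for {bad!r}")
```

A test runner such as pytest would pick up test_load_timeout automatically, so a CI job can block the merge if any exception path regresses.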

For security, businesses must centralize credential management and use security linters and SAST tools to detect vulnerabilities like improper password handling. They should also guide performance behavior through best practices, such as batching I/O operations and using efficient data structures, to avoid resource inefficiencies. It’s also advised to adopt AI-aware review checklists and leverage third-party AI code review tools to provide an additional layer of scrutiny.
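The article does not spell out what improper password handling looks like in practice. Assuming it includes storing passwords in plain text or as fast, unsalted hashes, a safer baseline using only Python's standard library might look like the sketch below. The function names and the iteration count are assumptions for illustration; teams should follow their own security guidance when choosing an algorithm and work factor.

```python
import hashlib
import hmac
import secrets

# Assumed work factor for this sketch; tune it to your own guidance and hardware.
ITERATIONS = 600_000


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return a fresh random salt and the PBKDF2-HMAC-SHA256 digest."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

Keeping this logic in one shared module, rather than re-deriving it in each pull request, also gives security linters and reviewers a single place to check.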

Rabia Noureen News Editor

Rabia has a master's degree in Software Engineering and she has years of experience writing professionally about Microsoft products and other technologies. Rabia has also written for OnMSFT.com as well as Windows Report. She is always up to date on t...
