Over one million pull requests were authored by AI coding agents on GitHub between May and September 2025 alone, and most engineering teams have no system for tracking which of those PRs landed in their own codebases.
The Attribution Gap Is a Governance Gap
The volume of AI-generated code in enterprise repositories has crossed the threshold where deferring attribution tracking is no longer defensible. GitHub's Octoverse 2025 report shows that over 80% of new developers used AI coding assistance in their first week on the platform. McKinsey's developer productivity research cites aggregated telemetry placing AI-authored production code at roughly 27% of all commits across millions of measured developers, up from 22% the previous quarter.
Attribution (knowing which pull requests, files, or commits were AI-generated) is rarely captured as structured metadata. It lives in freeform commit messages when engineers choose to note it, or not at all. Without attribution, you cannot route AI-generated PRs to reviewers who know what to look for, calculate a meaningful risk profile for your review queue, or answer auditor questions about what your agents actually shipped.
This is not a theoretical risk. A peer-reviewed study published at ICSE 2026 analyzed over 33,000 AI-generated pull requests and found that flawed AI contributions frequently merge because reviewers respond to social and process signals rather than security content. The top recurring weakness categories included OS command injection (13.0% of findings) and path traversal (10.3%). Security vulnerability rates for AI-generated code, measured across thousands of production PRs, confirm that this is a systemic pattern, not an edge case. Reviewers who cannot identify AI-generated code cannot adjust their scrutiny accordingly.
Compliance Pressure Is Already Here
The EU AI Act, enforceable in full from August 2026, imposes transparency and labeling obligations on AI-generated content. The EU Commission published a voluntary Code of Practice on marking AI-generated content in June 2026, the first structured guidance on what AI provenance documentation should look like in practice. For software teams, that means the pull request log needs to capture AI authorship as a structured field, not an optional annotation that depends on developer discipline.
The NIST AI RMF reaches the same conclusion via its GOVERN and MAP functions: organizations must document and track AI model provenance, decision logs, and versioning across deployments. Research presented at ICSE 2026 by Queen's University and the Linux Foundation documents active work to extend SPDX/ISO SBOM specifications to capture AI-specific supply chain elements (model provenance, training lineage, and governance metadata) as machine-readable, auditable fields. The standards layer is moving; enforcement will follow.
What compliance frameworks are converging on, in practical terms:
Document which AI models contributed code to your pipeline and under what conditions
Maintain audit records of AI-generated contributions by date, repository, and model version
Apply defined human oversight processes to AI outputs that cross defined risk thresholds
Capture provider and model version at the point of code contribution, rather than reconstructing them retroactively from logs
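One way to picture the audit record these requirements point toward is a small structured entry written when the contribution happens. The sketch below is illustrative only: the `AttributionRecord` schema, its field names, and the `record_contribution` and `to_audit_line` helpers are assumptions, not a format defined by the EU AI Act, NIST, or SPDX.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class AttributionRecord:
    """One audit entry for an AI-assisted contribution (hypothetical schema)."""
    repository: str
    pr_number: int
    provider: str       # vendor of the AI coding tool
    model_version: str  # version string as reported when the code was contributed
    recorded_at: str    # ISO 8601 timestamp in UTC


def record_contribution(repository, pr_number, provider, model_version, now=None):
    """Build the record at contribution time, not later from logs."""
    now = now or datetime.now(timezone.utc)
    return AttributionRecord(
        repository, pr_number, provider, model_version, now.isoformat()
    )


def to_audit_line(record):
    """Serialize the record as one JSON line for an append-only audit log."""
    return json.dumps(asdict(record), sort_keys=True)
```

Because each record carries date, repository, provider, and model version, an audit query by any of those dimensions becomes a filter over the log rather than a forensic exercise.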
How to Build Attribution Into Your PR Workflow
Before designing a review routing strategy, check your AI code governance maturity baseline. Attribution is the data layer that makes every downstream governance control reliable β without it, you are applying policies to a codebase you cannot fully see.
Add a required AI authorship field to your PR template. A simple yes/no/partial flag creates a minimum-viable audit trail with no tooling cost and establishes the disclosure norm across your team.
Use CI/CD environment variable injection to capture which AI coding tool was active during a commit session. Most modern CI platforms support this at the pipeline configuration level; it takes one pipeline step to surface what would otherwise be invisible.
Apply differentiated review gates for AI-attributed PRs. The ICSE 2026 peer-reviewed study identified OS command injection and path traversal as the most frequent recurring weaknesses in AI-generated code, risks that justify stricter baseline static analysis before human review begins.
Capture attribution at the PR level, not just the individual commit. Most AI-assisted PRs mix human and agent contributions across commits; the review decision, and with it the compliance record, happens at the PR level.
Anchor attribution to a sanctioned tool list from your acceptable use policy before enforcement. Attribution only creates accountability when your policy defines which AI agents are permitted to contribute code and under what conditions.
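The capture steps above can be sketched in a few small functions. This is a minimal illustration under stated assumptions: the `AI-Authorship:` template field and the `AI_CODING_TOOL` environment variable are hypothetical names, and per-commit AI flags are assumed to be available as booleans from whatever commit metadata your team records.

```python
import os
import re

# Hypothetical field label; use whatever your PR template actually defines.
FIELD_PATTERN = re.compile(
    r"^AI-Authorship:[ \t]*(yes|no|partial)\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_authorship(pr_body):
    """Return 'yes', 'no', or 'partial'; None when the required field is missing."""
    match = FIELD_PATTERN.search(pr_body)
    return match.group(1).lower() if match else None


def active_tool(env=None):
    """Read the tool name a pipeline step is assumed to inject (hypothetical variable)."""
    env = os.environ if env is None else env
    return env.get("AI_CODING_TOOL") or None


def pr_level_attribution(commit_flags):
    """Roll per-commit AI flags (booleans) up to a single PR-level value."""
    if not commit_flags or not any(commit_flags):
        return "no"
    return "yes" if all(commit_flags) else "partial"
```

A CI check could fail the build when `parse_authorship` returns `None`, which turns the template field from a convention into a requirement. Comparing the self-declared value with `pr_level_attribution` gives a cheap consistency check between what the author disclosed and what the commits show.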
The disclosure requirement starts with policy. Your acceptable use policy for AI coding agents should specify which tools are sanctioned, what disclosure is required at PR creation, and the review conditions that apply when AI agents contribute to production code. Without that policy layer, attribution tracking surfaces data with nowhere to route it.
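Expressing the policy as data is one way to give attribution somewhere to route. The sketch below assumes a hypothetical `ToolPolicy` structure and an invented tool name, `example-agent`; the check names are placeholders for whatever review conditions your own policy defines.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    """Policy entry for one sanctioned AI coding tool (hypothetical structure)."""
    disclosure_required: bool
    review_conditions: tuple  # checks that must pass before merge


# Hypothetical policy table; in practice, load it from your acceptable use policy.
POLICY = {
    "example-agent": ToolPolicy(
        disclosure_required=True,
        review_conditions=("static-analysis", "human-review"),
    ),
}


def required_checks(tool, disclosed, policy=POLICY):
    """Return (allowed, details): required checks if allowed, else the reasons."""
    rule = policy.get(tool)
    if rule is None:
        return False, ["tool is not on the sanctioned list"]
    if rule.disclosure_required and not disclosed:
        return False, ["AI authorship was not disclosed at PR creation"]
    return True, list(rule.review_conditions)
```

For example, `required_checks("example-agent", disclosed=True)` returns the review conditions to enforce, while an unlisted tool is rejected before any review is scheduled.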
What re-entry.ai Does About This
re-entry.ai scores pull request risk at the PR level and surfaces AI authorship signals automatically, without relying on engineer self-disclosure, giving engineering teams a structured, risk-weighted view of their AI-generated PR queue that holds up under audit. See how it works at https://re-entry.ai.
Attribution is not a reporting nicety. It is the data layer that grounds every downstream AI code governance decision, from review routing to compliance evidence, in something auditable. Without it, you are governing outputs you cannot trace to their source.