From Gatekeeping to Curation: How AI Will Restructure Peer Review
Peer review has always done two jobs. AI only makes one of them obsolete.
Scott Cunningham recently wrote a sharp analysis of what happens when AI collapses the cost of producing research papers while journal publication slots remain fixed. His short-term predictions are sobering: submission volumes explode, acceptance rates plummet, desk-rejection rates hit 90%, and the referee pool — already stretched thin by unpaid labor — cannot possibly scale. The system is strained to its limits.
His post crystallized something that has been lingering in my mind for some time. I think he is right about the diagnosis, but the disruption will go further than he suggests. AI does not just overwhelm the peer review system with volume. It exposes a deeper problem: peer review has always been doing two jobs at once, and AI is about to make one of them redundant.
Two Jobs, One Referee
Think about what we actually ask peer reviewers to do. We ask them to assess quality: Is the methodology sound? Is the statistical analysis correctly implemented? Are the results consistent with the claims? Is the argument logically coherent?
And we ask them to assess significance: Is this an important contribution? Does it advance the field? Is it worth the scarce journal space?
These are fundamentally different tasks. The first is verification — checking whether the work meets a standard. The second is curation — judging whether the work matters. We have bundled both into the same process, handed to the same two or three anonymous referees, for so long that we have stopped noticing they are distinct.
That distinction is about to matter a great deal.
The End of Quality Assessment by Committee
AI is becoming very good at the first task. It can check whether code runs correctly, whether statistical methods are appropriate for the data and research question, whether results reported in a paper match the actual empirical output, whether mathematical derivations are valid, and whether the argument is internally consistent. It can do this faster, cheaper, and — here is the uncomfortable part — more consistently than human reviewers.
We know that human quality assessment through peer review is deeply imperfect. Inter-reviewer agreement is notoriously low: two referees reading the same paper often reach opposite conclusions about its quality. Reviewers exhibit status bias — papers from prestigious institutions get more favorable readings. They display conservatism toward novel methodologies. The process is slow, taking months or years. And it is unpaid. We have tolerated all of this because we had no alternative mechanism for quality control.
Now we do. And the alternative is not just cheaper. It is arguably better at the specific task of quality verification, precisely because it can be made systematic, transparent, and consistent in ways that human review cannot.
What This Looks Like in Practice
This is not hypothetical. Consider what is already emerging for empirical papers. Journals increasingly require replication packages — the code and data needed to reproduce the results. Now extend that requirement one step further: before submission, authors upload their paper and replication package to an AI-powered verification system provided by the journal.
The system checks the code for errors. It runs the code. It verifies that the results described in the paper correspond to the actual empirical output. It flags questionable choices in the empirical strategy — selective samples that conveniently produce desired results, specification searches that go unreported, inconsistencies between the methodology section and what the code actually does.
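To make this concrete, here is a minimal sketch of what the deterministic core of such a verification step might look like. Everything in it is an assumption made for illustration: the file names (run_analysis.py, results.json, reported.json), the package layout, and the tolerance are invented, and no existing journal system is being described.

```python
import json
import subprocess
from pathlib import Path

# Hypothetical layout: the replication package has one entry point that
# writes its estimates to results.json, and the paper's reported numbers
# have been extracted into reported.json. Both names are assumptions.
PACKAGE_DIR = Path("replication_package")
TOLERANCE = 1e-4  # allowed numerical discrepancy, chosen arbitrarily


def run_replication() -> dict:
    """Execute the package's entry point and load the output it produces."""
    subprocess.run(
        ["python", "run_analysis.py"],
        cwd=PACKAGE_DIR,
        check=True,  # a non-zero exit code fails verification outright
    )
    return json.loads((PACKAGE_DIR / "results.json").read_text())


def compare(reported: dict, computed: dict) -> list[str]:
    """Flag every estimate in the paper that the code does not reproduce."""
    discrepancies = []
    for key, claimed in reported.items():
        if key not in computed:
            discrepancies.append(f"{key}: reported but never computed")
        elif abs(claimed - computed[key]) > TOLERANCE:
            discrepancies.append(
                f"{key}: paper says {claimed}, code produces {computed[key]}"
            )
    return discrepancies


if __name__ == "__main__":
    reported = json.loads(Path("reported.json").read_text())
    problems = compare(reported, run_replication())
    if problems:
        print("Verification failed:")
        for p in problems:
            print("  -", p)
    else:
        print("All reported estimates reproduced within tolerance.")
```

The deterministic comparison answers the "do the numbers match the code" question; the AI layer would sit on top of it to handle the softer checks, such as flagging unreported specification searches or conveniently selective samples.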
This shifts the burden of quality assurance from unpaid referees — who have no time and receive no compensation — back to the author, where it arguably should have been all along. But it shifts it in a manageable way, because the author now has AI assistance to meet the standard before submitting. This is just an extension of what many researchers are already doing: using tools like Refine to get detailed feedback on drafts, or uploading code to GitHub with automated test suites that verify correctness. The difference is that the journal formalizes the process and makes the quality standards transparent and pre-specified.
Notice what this does. The author knows exactly what standards the paper must meet. The AI checks whether those standards are met. The process is fast, transparent, and reproducible. No more waiting six months to discover that Referee 2 thinks your standard errors should be clustered differently.
And this is not limited to empirical work. As AI becomes increasingly capable at mathematics and economic modeling, the same logic extends to theoretical papers. Can the AI verify that the proofs are correct? That the model’s implications follow from its assumptions? That the equilibrium characterization is complete? The quality-assessment function is automatable because it decomposes into checkable steps. Each step has a right answer — or at least a defensible standard against which to evaluate.
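To make "decomposes into checkable steps" concrete, consider formal proof verification. The toy example below is written in Lean, a proof assistant in which a theorem either type-checks or fails to compile. The choice of Lean is mine, not a claim that journals would adopt it; the point is only that validity here is mechanically decidable.

```lean
-- A deliberately trivial theorem: commutativity of addition on the
-- naturals. The kernel checks the proof term; if the derivation were
-- invalid, the file simply would not compile. There is no Referee 2
-- to disagree with.
theorem add_comm_example (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```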
What AI Cannot Replace
Curation is different. It does not decompose into checkable steps.
Which questions are important right now? Which findings change how we think about a problem? How do different papers speak to each other — where do they agree, where do they conflict, and what does the pattern reveal? What combination of contributions, taken together, moves a field forward?
These judgments require understanding not just what a paper says, but why it matters in context. They require taste, experience, and a sense of where a field is heading. They are inherently subjective — not in the sense of being arbitrary, but in the sense of requiring human intellectual engagement that cannot be reduced to a checklist.
This is where human comparative advantage remains. And it is, I would argue, the more valuable of the two functions. Quality assessment keeps bad work out. Curation helps good work find its audience and its place in the larger conversation.
No Referees, More Editors
If quality assessment can be automated and curation cannot, the implication for journal publishing is straightforward: journals no longer need traditional referees. They need more editors focused on curation.
Here is what this could look like. Instead of individual submissions trickling into a journal’s general queue, imagine more topical special issues. A journal assembles a panel of editors — say five, instead of the usual one — and issues a call for papers addressing a specific question. There are no external referees in this model. No traditional referee reports. That entire function is handled by AI.
AI handles the first round of selection: screening submissions for methodological quality against transparent, pre-specified standards, combined with an initial assessment of relevance to the call. This is not a black box — the quality criteria are published in advance, and authors can verify their work meets them before submitting.
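One way to make "transparent, pre-specified standards" literal is to publish the screen itself as code, so authors can run the identical check before submitting. A hypothetical sketch follows; the criteria, field names, and thresholds are invented for illustration, not drawn from any actual journal's standards.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Criterion:
    """Pairs a published, human-readable standard with its automated check."""
    description: str               # the standard, as stated in the call for papers
    check: Callable[[dict], bool]  # the automated test applied to a submission

# All criteria below are invented examples of what a journal might publish.
CRITERIA = [
    Criterion(
        "Replication package reproduces all reported estimates",
        lambda sub: sub["replication_passed"],
    ),
    Criterion(
        "Every specification estimated is either reported or pre-registered",
        lambda sub: sub["unreported_specifications"] == 0,
    ),
    Criterion(
        "All data exclusions are documented in the methodology section",
        lambda sub: sub["undocumented_exclusions"] == 0,
    ),
]

def screen(submission: dict) -> list[str]:
    """Return the list of published standards the submission fails."""
    return [c.description for c in CRITERIA if not c.check(submission)]

# An author self-checking before submission:
example = {
    "replication_passed": True,
    "unreported_specifications": 2,
    "undocumented_exclusions": 0,
}
print(screen(example))
# -> ['Every specification estimated is either reported or pre-registered']
```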
The editors then do what only humans can do well: curate. From the papers that pass the quality screen, they select those that complement each other — papers that, taken together, tell a richer story than any one of them tells alone. They combine them into a coherent collection. And they write an integrative introduction: a synthesis that draws out the connections between the papers, highlights the main findings, identifies where the disagreements lie, and points to what remains unresolved.
That introduction alone would add more intellectual value than most referee reports do today.
I am experimenting with a small version of this principle on this Substack. The Knightian Uncertainty Dispatch — a monthly curated reading list — is essentially curation-first publishing: select the best recent work on a theme, explain why each piece matters, and show how they connect. It is tiny compared to a journal, but the principle is the same. The value is in the selection and the synthesis, not in gatekeeping.
A Better Allocation of Intellectual Effort
Right now, the profession’s intellectual effort is badly misallocated. Enormous amounts of time go to satisfying referees — revising papers to accommodate idiosyncratic preferences, navigating contradictory reports, resubmitting through cascading rejections at journal after journal. Referees themselves spend hours on reports for which they are not compensated and rarely recognized. Meanwhile, far too little time is spent on what actually advances science: intelligent people reading the best new work carefully and telling the rest of us what it means and why it matters.
The shift I am describing flips that ratio. Authors spend their time meeting transparent quality standards — which AI helps them achieve efficiently. Editors spend their time on the intellectually demanding work of curation — reading broadly, thinking about connections, assembling collections that are more than the sum of their parts. And no one spends months writing or responding to referee reports that often amount to “I would have written a different paper.”
AI will not fix academic publishing by speeding up the existing system. It will fix it by making the existing system’s gatekeeping function redundant — so we can finally build something better around what has always been the more important task: curation.
Now I can’t stop thinking about what it would look like to actually build this.