Large language models are transforming how scientists write and publish. While output rises sharply, especially among non-native English speakers, reviewers and funders face a growing challenge: separating scientific substance from AI-assisted noise.
A new Cornell study shows that AI tools like ChatGPT significantly increase scientific output — while making it harder for reviewers to distinguish high-quality research from well-written but low-value papers.
After ChatGPT became publicly available in late 2022, scientists began talking among themselves about how much more productive the new artificial intelligence tools made them, while scientific journal editors complained of an influx of well-written papers with little scientific value.
These anecdotal conversations represent a real shift in how scientists are writing up their work, according to a new study by Cornell researchers. They showed that using large language models (LLMs) like ChatGPT boosts paper production, especially for non-native English speakers. But the overall increase in AI-written papers is making it harder for many people — from paper reviewers to funders to policymakers — to separate the valuable contributions from the AI slop.
“It is a very widespread pattern, across different fields of science — from physical and computer sciences to biological and social sciences,” said Yian Yin, assistant professor of information science in the Cornell Ann S. Bowers College of Computing and Information Science. “There’s a big shift in our current ecosystem that warrants a very serious look, especially for those who make decisions about what science we should support and fund.”
Yin’s group investigated the impacts of LLMs on scientific publishing by collecting more than 2 million papers posted between January 2018 and June 2024 on three online preprint websites. The three sites — arXiv, bioRxiv and Social Science Research Network (SSRN) — cover the physical, life and social sciences, respectively, and post scientific papers that have yet to undergo peer review.
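For readers curious how such a corpus might be assembled, arXiv exposes a public query API that returns Atom XML. The study's actual collection pipeline is not described in this article; the sketch below, with an illustrative query and a handful of results, only shows the general shape of pulling preprint metadata.

```python
# Minimal sketch of pulling preprint metadata from arXiv's public API.
# The search query and fields here are illustrative, not the study's pipeline.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the API

def fetch_arxiv(search_query="cat:cs.CL", start=0, max_results=5):
    """Fetch one page of arXiv entries as (title, published, summary) tuples."""
    params = urllib.parse.urlencode({
        "search_query": search_query,
        "start": start,
        "max_results": max_results,
    })
    url = f"http://export.arxiv.org/api/query?{params}"
    with urllib.request.urlopen(url) as resp:
        root = ET.fromstring(resp.read())
    return [
        (
            entry.findtext(f"{ATOM}title", "").strip(),
            entry.findtext(f"{ATOM}published", ""),
            entry.findtext(f"{ATOM}summary", "").strip(),
        )
        for entry in root.iter(f"{ATOM}entry")
    ]

for title, published, _ in fetch_arxiv():
    print(published[:10], title)  # date posted, then title
```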
The researchers compared presumably human-authored papers posted before 2023 to AI-written text, in order to develop an AI model that detects papers likely written by LLMs. With this AI detector, they could identify which scientists were probably using the technology for writing, count how many papers they published before and after adopting AI, and then see whether those papers were ultimately deemed worthy of publication in scientific journals.
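The article does not describe the detector's architecture, only that it was trained by contrasting presumably human-written pre-2023 papers with AI-generated text. As a rough illustration of that train-on-labeled-text idea, here is a minimal sketch using TF-IDF features and logistic regression; the corpora and example phrasings are placeholders, not the study's data or model.

```python
# Toy stand-in for a human-vs-LLM text classifier. The study's actual
# detector is not specified; this only illustrates the general approach.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training corpora (in practice: millions of abstracts).
human_texts = ["We measure the decay rate of ...", "Field samples were collected ..."]
llm_texts = ["In this study, we delve into ...", "This paper explores the multifaceted ..."]

X = human_texts + llm_texts
y = [0] * len(human_texts) + [1] * len(llm_texts)  # 0 = human, 1 = LLM-assisted

detector = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
detector.fit(X, y)

# Score a new abstract: estimated probability the text is LLM-assisted.
print(detector.predict_proba(["We delve into the multifaceted decay rate ..."])[:, 1])
```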
Their analysis showed a big AI-powered productivity bump. On the arXiv site, scientists who appeared to use LLMs posted about one-third more papers than scientists who weren’t getting an assist from AI. The increase was more than 50% for bioRxiv and SSRN.
Not surprisingly, scientists whose first language is not English, and who face the hurdle of communicating science in a foreign language, benefited the most from LLMs. Researchers from Asian institutions, for example, posted between 43.0% and 89.3% more papers, depending on the preprint site, after the detector indicated they had switched to LLMs, compared with similar scientists not using the technology. The benefit is so large that Yin predicts a global shift in the regions with the greatest scientific productivity, toward areas previously disadvantaged by the language barrier.
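Conceptually, the adopter-versus-non-adopter comparison boils down to measuring each author's posting rate before and after detected LLM adoption and contrasting the change across groups. The study's matching of "similar scientists" is far more involved; the toy sketch below, with invented column names and numbers, only illustrates the basic before/after calculation.

```python
# Illustrative before/after comparison of posting rates. All data and
# column names here are invented; the study's matching is more involved.
import pandas as pd

papers = pd.DataFrame({
    "author": ["a1", "a1", "a2", "a2"],
    "period": ["pre", "post", "pre", "post"],  # relative to detected LLM adoption
    "adopter": [True, True, False, False],
    "n_papers": [4, 6, 5, 5],
})

# Papers per author in each period, one row per (author, adopter) pair.
rates = papers.pivot_table(index=["author", "adopter"],
                           columns="period", values="n_papers")
rates["pct_change"] = 100 * (rates["post"] - rates["pre"]) / rates["pre"]

# Average percent change in output, adopters vs. non-adopters.
print(rates.groupby("adopter")["pct_change"].mean())
```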
The study uncovered another positive effect of AI in paper preparation. When scientists search for related research to cite, Bing Chat, the first widely adopted AI-powered search tool, surfaces newer publications and relevant books more readily than traditional search tools, which tend to return older, more commonly cited works.
“People using LLMs are connecting to more diverse knowledge, which might be driving more creative ideas,” said first author Keigo Kusumegi, a doctoral student at Cornell. In future work, Kusumegi hopes to explore whether AI use leads to more innovative, interdisciplinary work.
While LLMs make it easier for individuals to produce papers, they also make it harder for others to evaluate their quality. For human-written work, clear yet complex language — with big words and long sentences — is usually a reliable indicator of quality research. Across all three preprint sites, papers likely written by humans that scored high on a writing complexity test were most likely to be accepted to a scientific journal. But high-scoring papers probably written by LLMs were less likely to be accepted, suggesting that despite the convincing language, reviewers deemed many of these papers to have little scientific value.
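The article does not specify which writing-complexity test the researchers used, but the signal it describes, big words and long sentences, can be made concrete with a simple heuristic. The function below is an assumption-laden stand-in, not the study's metric: it scores text by mean sentence length plus mean word length.

```python
# Rough proxy for "writing complexity": longer sentences and longer words
# score higher. This heuristic is illustrative, not the study's actual test.
import re

def complexity_score(text: str) -> float:
    """Mean sentence length (in words) plus mean word length (in characters)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    mean_sentence_len = len(words) / len(sentences)
    mean_word_len = sum(len(w) for w in words) / len(words)
    return mean_sentence_len + mean_word_len

print(complexity_score("We characterize heterogeneous catalytic mechanisms. Rates vary."))
```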
This disconnect between writing quality and scientific quality could have big implications, Yin said, as editors and reviewers struggle to identify valuable paper submissions, and universities and funding agencies can no longer evaluate scientists based on their productivity.
The researchers caution that the new findings are based solely on observational data. Next, they hope to perform a causal analysis, such as a controlled experiment in which some scientists are randomly assigned to use LLMs and others are not.
Yin is also planning a symposium, to take place March 3-5, 2026, on the Ithaca campus, that will examine how generative AI is transforming research and how scientists and policymakers can best shape these changes.
As scientists increasingly rely on AI for writing, coding and even idea generation, essentially using AI as a co-scientist, Yin suspects that its impacts will broaden. He urges policymakers to craft new rules for the rapidly evolving technological landscape.
“Already now, the question is not, have you used AI?” Yin said. “The question is how exactly you have used AI, and whether it’s helpful or not.”
Original article: Scientific production in the era of large language models. Science. DOI: 10.1126/science.adw3000