[Image: Three red flags on flag poles, waving against a blue sky. Credit: AirinLaim via Shutterstock]

Series: Publishing Integrity

How to Become an Integrity Sleuth in the Library

Open access agreement management creates an opportunity for library staff to help identify publishing integrity problems. Here are some techniques to try.

Note: this is the second of a three-part series about how libraries can respond to publishing integrity problems in the context of open access agreement management. Part One describes the role libraries could play and discusses strategies for managing OA agreements in light of these problems. Part Two drills down into specific practices through which libraries can identify and respond to publishing integrity problems and explores the significant challenges ahead. Part Three will review several online tools being developed to assist such efforts.

Sleuths in the Library

If a history of publishing integrity in this era is ever published, it will tell the story of the science integrity sleuths who have worked tirelessly and often without remuneration to draw attention to the damage that fake papers are doing to the scientific record: Anna Abalkina, digging into paper mill connections. The eagle eyes of Elisabeth Bik. The visionary infrastructure of PubPeer and the Retraction Watch Database. Guillaume Cabanac’s Problematic Paper Screener. The unflinching snark of Leonid Schneider and the pseudonymous Smut Clyde blogging on For Better Science. The analytic insights of Alexander Magazinov, Nick Wise, Csaba Szabo, Sholto David, and the many anonymous sleuths reading, posting, sending emails, and raising alarms to institutions, publishers, and editors.

Much of this work requires specialized technical skills or scientific knowledge that are not often found among library staff. But academic libraries do cultivate expertise in nearly every aspect of scholarly communication, and much of that expertise transfers readily to publishing integrity efforts.

In this article, we’ll describe indicators of publishing integrity issues that can be identified using tools and abilities already at hand in the library. Rather than attempt to provide a comprehensive list of methods currently in use, we’ll detail some methods with particular relevance to this new direction in library work.

Red Flags

When article submissions come to the library for open access funding approval under our OA publishing agreements, staff typically receive limited information from the publisher: corresponding author name, email address, a claim of affiliation, names of co-authors, article title, journal title, and a date. But even this minimal metadata can create opportunities to spot potential misconduct.

The red flags for research misconduct that we describe below have been collected from a variety of sources, discussions, and collaborations. We have benefited from conversations with the research integrity staff at major publishers, interactions with our institution’s research integrity officer, presentations by vendors of new research tools, the Fostering Accountability for the Integrity of Research Studies (FAIRS) Conference held at St. John’s College, Oxford, in April 2025, PubPeer threads and Retraction Watch articles, and the hands-on experience of investigating retracted papers and networks of authors. This work by library staff to identify indicators of research misconduct is playing catch-up with earlier work, such as that of Abalkina (2023), Parker, Boughton, Bero, & Byrne (2024), and Matusz, Abalkina, & Bishop (2025), among others. The excellent analyses of paper mills and mapping of evidence in these papers shed light on misconduct that would otherwise remain invisible by design.

Author Checks

Within open access agreement workflows, verification of the corresponding author’s affiliation is perhaps the simplest means to protect the integrity of our OA agreements. The opportunity to verify affiliation is also a point in the OA workflow where we can glean useful information:

  • Email Address Validation: Often, we receive funding approval requests for authors who are not using their institutional email. This is not suspicious in itself; an author may be graduating or changing positions. But in order to get an APC covered under an institutional agreement, a bad actor might impersonate a real institutional researcher using a generic email address. Emeritus professors can be especially vulnerable to this kind of fraud. For this reason, libraries should confirm the identity of corresponding authors who do not use their institutional emails, especially if no other author is using one. This might be done by contacting the author using an official institutional email or phone number. Submission systems can address this problem by requiring validation of an institutional email address, even when a different address is listed on the paper. (A simple domain check is sketched in the first example after this list.)
  • Title Search: Occasionally, spotting an issue is as easy as Googling a title and finding an authorship sale, a plagiarized paper, or an obvious “salami slice” (the term of art for the practice of publishing as many articles out of a single study as possible).
  • Subject Match: The title and journal name offer a sufficient basis for a librarian to ask questions about the paper’s subject matter. Does the subject matter match the institution’s profile? Does it match the author’s field of study and publication history?
  • Unusual Number of Submissions: An unusual increase in the volume of submissions by a single author can be a red flag. This is a situation where being able to watch the publishing process through OA approval dashboards can be a major advantage. If someone is attempting to spam a particular article title to multiple journals across publishers, the library may be uniquely situated to notice.
  • Spike in Publishing Output: At the other end of the publishing process, an unusual increase in publishing output can be a helpful indicator. The more extreme the increase, the more likely that something unusual is going on, though it can be difficult to distinguish rare genius from papermilling or authorship sales. (One way to quantify such a spike is shown in the second sketch after this list.)
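
To make the email check above concrete, here is a minimal sketch of a domain check in Python. The institutional domains and the author record are placeholders; a real workflow would pull both from the library’s approval dashboard, and a script like this only triages; it does not verify identity.

```python
# Minimal sketch of an email-domain triage check. The domains below are
# placeholders; substitute your institution's actual domains.
INSTITUTIONAL_DOMAINS = {"example.edu", "med.example.edu"}

def uses_institutional_email(address: str) -> bool:
    """True when the address belongs to one of our institutional domains."""
    domain = address.rsplit("@", 1)[-1].lower()
    return any(domain == d or domain.endswith("." + d) for d in INSTITUTIONAL_DOMAINS)

def needs_identity_check(author_emails: list[str]) -> bool:
    """Escalate when no author on the submission uses an institutional address."""
    return not any(uses_institutional_email(e) for e in author_emails)

# Hypothetical submission record
if needs_identity_check(["j.smith@gmail.com", "coauthor@mail.example.org"]):
    print("No institutional address on file; confirm identity via campus directory")
```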
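
For publication spikes, the open scholarly index OpenAlex exposes per-year counts of an author’s works through its public API. The sketch below flags an author whose latest-year output far exceeds their earlier average; the author ID is hypothetical, and the multiplier and floor are arbitrary starting values to tune against local experience, not validated thresholds.

```python
import requests

OPENALEX = "https://api.openalex.org/works"

def works_per_year(author_id: str, years: range) -> dict[int, int]:
    """Count an author's works per year via the OpenAlex API."""
    counts = {}
    for year in years:
        params = {
            "filter": f"author.id:{author_id},publication_year:{year}",
            "per-page": 1,  # only meta.count is needed, not the records
        }
        resp = requests.get(OPENALEX, params=params, timeout=30)
        resp.raise_for_status()
        counts[year] = resp.json()["meta"]["count"]
    return counts

def spike(counts: dict[int, int], factor: float = 3.0, floor: int = 5) -> bool:
    """Flag when the latest year's output is several times the earlier average."""
    *history, latest = sorted(counts)
    baseline = sum(counts[y] for y in history) / max(len(history), 1)
    return counts[latest] >= max(floor, factor * baseline)

counts = works_per_year("A5023888391", range(2019, 2025))  # hypothetical author ID
if spike(counts):
    print("Unusual jump in output; worth a closer look:", counts)
```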

Co-Author Checks

When an anomaly is spotted, it can also be revealing to ask who else could be benefiting. Who are the co-authors?

  • Suspicious Network: Does a spike in publications take place within a circle of previous co-authors? Are there many authors with seemingly random institutional affiliations, widespread geographical locations, or unusual specialties given their institution and/or publishing history? Someone from a Russian school of dentistry might stick out as an odd choice for a co-author on a paper from Chinese researchers about, say, electro-optic modulators. (For a recent statistical approach to co-author network analysis, see Porter & McIntosh, 2024.)
  • Previous Bad Actors: The network of co-authors can be one of the strongest indicators of an issue. Authors who are connected to networks of paper mills and who have been caught buying authorship continue to publish. If an institutional author attempts to publish a paper with a confirmed papermiller as a co-author, that paper deserves extra scrutiny.
  • Odd Collaborations: Some indicators related to co-authorship offer helpful starting points, though they cannot rise to the level of strong evidence on their own. Broad international collaboration is one. While science is, now more than ever, international in scope, an author list of six authors from six geographically and culturally disparate countries may be a red flag (a rough affiliation-country count is sketched after this list). This is particularly true when the countries include those with conditions known to be accommodating to paper mills (for country-specific examples, see Ansede, 2024; Mallapaty, 2025; Richardson, Hong, & Nunes Amaral, 2024; Schneider, 2024). Of course, care should be taken not to stereotype researchers based on their country of origin.
  • Questionable Affiliations: Certain institutional affiliations may also raise questions for librarians familiar with the relevant research. Meho & Akl (2025) provide a fascinating study of institutions that showed signs of artificially boosting their publishing output (one common yardstick for measuring reputation) through the fraudulent inclusion of affiliated authors in papers (see also Catanzaro, 2023).
  • Swapped Authors: There are many benign explanations for replacing authors between submission and publication, but a swap can suggest that an authorship sale has taken place, particularly when it happens after acceptance. Additionally, when library staff receive APC funding approvals at submission rather than acceptance, as is common for fully open journals, they can identify suspicious changes to the author list for articles that are rejected by one publisher but then accepted and published by another. For example, suppose a corresponding author is listed on an article that is rejected by a publisher with whom an open access agreement is in place, but omitted as an author when that paper is accepted and published by a second publisher with no institutional agreement. That strongly suggests the author was only present to have the APC covered by the institutional agreement, a practice sometimes referred to as “APC tourism” (de Castro et al., 2024).
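
The affiliation-country count mentioned above can also be scripted against OpenAlex, which records institution country codes per authorship. The DOI below is hypothetical, and the threshold of five countries is an arbitrary prompt for questions, not a verdict.

```python
import requests

def affiliation_countries(doi: str) -> set[str]:
    """Collect the distinct affiliation country codes on one paper via OpenAlex."""
    url = f"https://api.openalex.org/works/https://doi.org/{doi}"
    work = requests.get(url, timeout=30).json()
    return {
        inst["country_code"]
        for authorship in work.get("authorships", [])
        for inst in authorship.get("institutions", [])
        if inst.get("country_code")
    }

countries = affiliation_countries("10.1234/example-doi")  # hypothetical DOI
if len(countries) >= 5:  # arbitrary threshold; a prompt for questions, not a verdict
    print(f"{len(countries)} countries on one author list: {sorted(countries)}")
```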

Bibliographic Anomalies

Librarians are especially well-situated to evaluate bibliographies, which they can examine regardless of their particular expertise in the article’s subject matter. Excessive self-citation, coerced citation, and citation boosting all leave marks on the bibliography.

  • Coercive Citation: Manipulation of the publishing process can sometimes be seen in odd “bulk citations,” where a large number of references support a single mundane sentence. It can be revealing to ask whether someone is benefiting inappropriately from these bulk citations. If the cited works seem disconnected from the subject matter, or the same names and publications repeat unusually, there may be cause for concern. We found one example in which bulk citations had been shoehorned into a poorly written introduction in order to cite works by the editors and other authors in the special issue where the article was published. These indicators pointed toward editorial or authorial malpractice, perhaps through coercive citation or a cooperative citation ring.
  • Excessive Self-Citation: Self-citation is a normal and appropriate outcome of building a body of work in a subject. High percentages of self-citation, though, can be another indicator of someone attempting to game the system.
  • Editorial Malpractice: An editor’s record can also show patterns of coercive citation. If an editor is not a major researcher within a field, it could be a red flag to find that most of the papers they have edited cite the editor or the editor’s common co-authors. Emerging open science practices can help here; it would be easier to track abuses of this sort if editors on particular papers were listed more prominently and changes to the manuscript were part of the public record.
  • Citation-Level Checks: At a more detailed level of analysis, cited references can be checked one by one. If AI has been used to create a paper, cited articles may have been hallucinated. If they have some familiarity with problematic authors, fraudulent journals, and citation boosters, library staff may be able to spot familiar names. (PubPeer offers a powerful browser extension that can aid this process, which we will discuss in the third part of this series.)
  • Paper Mill Tells: Bibliographies can be difficult to get right even when you are not cutting corners, boosting citations, or fabricating data. Paper mills that paste in references can make the same mistakes over and over again. One prominent example is the “Vickers Curse,” where an article on butterflies ended up cited in over 1,000 papers, evidencing sloppy citation manipulation (Magazinov, 2023). Papers that cite a large number of retracted papers may also point toward involvement in a paper mill; one way to count retracted works in a bibliography is sketched after this list.
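
The citation-level checks above also lend themselves to scripting. OpenAlex records each work’s reference list and an is_retracted flag, so a short script can count how many cited works are marked retracted. A sketch follows, with a hypothetical DOI; retraction coverage in any index is incomplete, so treat the count as a lead, not a finding.

```python
import requests

API = "https://api.openalex.org/works"

def retracted_references(doi: str) -> list[str]:
    """List cited works that OpenAlex marks as retracted."""
    work = requests.get(f"{API}/https://doi.org/{doi}", timeout=30).json()
    refs = [w.rsplit("/", 1)[-1] for w in work.get("referenced_works", [])]
    retracted = []
    for i in range(0, len(refs), 50):  # OpenAlex allows OR-ing up to 50 IDs
        batch = "|".join(refs[i : i + 50])
        params = {"filter": f"ids.openalex:{batch},is_retracted:true", "per-page": 50}
        results = requests.get(API, params=params, timeout=30).json()["results"]
        retracted += [r["id"] for r in results]
    return retracted

flagged = retracted_references("10.1234/example-doi")  # hypothetical DOI
print(f"{len(flagged)} retracted works found in the bibliography")
```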

Textual Indicators

Of course, there is always the strategy of reading the article in question after it has been published. Close reading should always be part of a librarian’s toolbox.

  • Plagiarism: One way to become an extremely prolific author of scientific articles is to self-plagiarize and salami slice. Even though we cannot read the article prior to publication, comparing the title and subject matter to an author’s past publications may raise a red flag. Comparing similar papers can reveal template use, data reuse or repurposing, and use of tortured phrases to avoid plagiarism detection.
  • Tortured Phrases: The presence of “tortured phrases” indicates the use of AI or thesaurus/paraphrasing programs to replace natural phrases and idioms with risible substitutes for the sake of avoiding plagiarism detection. So, for example, “bosom peril” is used instead of “breast cancer,” “choice timber” instead of “decision tree,” or “invulnerable framework” instead of “immune system.” As a starting point for sleuths, flagging tortured phrases has been an especially profitable technique for identifying suspicious papers (a minimal scan is sketched below).
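
Known tortured phrases can be scanned for mechanically. The toy lookup below uses only the three examples from this article; real screening efforts, such as Cabanac’s Problematic Paper Screener, work from far larger curated lists.

```python
import re

# A toy lookup of tortured phrases and their likely originals, using only
# the three examples from this article; real lists run far longer.
TORTURED = {
    "bosom peril": "breast cancer",
    "choice timber": "decision tree",
    "invulnerable framework": "immune system",
}

def find_tortured_phrases(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, likely original) pairs found in the text."""
    return [
        (phrase, original)
        for phrase, original in TORTURED.items()
        if re.search(rf"\b{re.escape(phrase)}\b", text, flags=re.IGNORECASE)
    ]

sample = "We classified the bosom peril data with a choice timber model."
for phrase, original in find_tortured_phrases(sample):
    print(f'"{phrase}" likely stands in for "{original}"')
```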

Strategies Summary

Detection of papermilled papers is a cat-and-mouse game; new ways to identify fraudulent papers are always made less effective by clever fraudsters. AI’s ability to rapidly create natural-sounding text is likely to add frantic energy to this process. It may be that fraudulent papers will become even more difficult to identify and that fabricated data will require even more expert evaluation. But human networks of co-authors will continue to be traceable, and humans will continue leaving evidence in the public record.

Challenges

While library staff have a real opportunity and at least some degree of responsibility to be more involved in promoting publishing integrity, this is undeniably a challenging endeavor. One of those challenges is money. With budgets often flat or declining, research grant funding under threat, and AI raising uncomfortable questions, very few libraries are searching for broad new mandates. But as with open access itself, not to mention AI, the historical moment makes demands of its own, and libraries have always risen to meet them. The current challenges to scholarly communication provide, whether we are ready for it or not, just such a moment.

While we can imagine some well-resourced libraries hiring a full-time “publishing integrity librarian,” the investment in staff time at most institutions can start small and scale up in proportion to the library’s open access investments, complementing work already being done and protecting the value of investments already made. At the very least, academic library leadership should consider how to cultivate greater awareness of and engagement with these issues among their staff.

Even so, the path ahead for this work is not well-trod or signposted, and it is important to be honest about the challenges it brings. First of all, the librarian who conducts all of the checks suggested above will discover very few smoking guns. Gathering and interpreting evidence is an unavoidably subjective process. Deciding what counts as sufficient evidence to justify a deeper investigation or raising an alarm requires careful consideration, both for the sake of the people involved and for the time and resources of the library.

When it comes to publishing integrity, publishers, authors, and universities are not the only entities with reputations on the line. Libraries entering the fray could face unexpected repercussions. Not everyone will apply the same hierarchy of values when ranking protection of the scientific record against institutional esteem, career progression, and profit. Is one image duplication worth marring the record of a star researcher? Two? Does this calculus change at all relative to a researcher’s level of grant funding? These questions can be fraught.

Contributing to these difficulties is the novelty of this approach. Universities have particular conceptions of the role of their libraries that are notoriously hard to budge. While the scale of the problem surely merits “all hands on deck,” libraries’ involvement may not be welcome in all quarters.

It is therefore important to be clear within the library and with our campus partners about the boundaries of our role, which is not judge, jury, or counsel. It is, rather, to gather evidence to be passed on to others without a verdict. These others—in particular publishers and our own institutions’ research integrity officers—can conduct fuller investigations. This division of labor is an important safeguard on the objectivity of these processes. Library staff must maintain a genuine disinterest in the outcomes of these investigations.

A variety of pressures encourage silence on these issues; the threshold for speaking up can feel quite high. The great value of PubPeer, a post-publication peer review platform we’ll discuss in Part Three, is that it lowers this threshold by making it easy to point out anomalies anonymously to a wide audience. But while PubPeer and the broader sleuthing community have their place, library staff have two more direct alternatives: direct lines to the research integrity staff of our institutions and to our publishing partners. We should intentionally cultivate these connections, building relationships of trust that naturally lower the bar for raising issues and establish the firm expectation of confidentiality and due consideration. It would be beneficial for these communications to become more routine, even formalized. Science, after all, thrives in an environment of continued testing and inquiry, not fear of asking questions.

Reference List

Abalkina, A. (2023). Publication and collaboration anomalies in academic papers originating from a paper mill: Evidence from a Russia-based paper mill. Learned Publishing, 36(4), 689–702. https://doi.org/10.1002/leap.1574

Ansede, M. (2024, December 5). Dozens of the world’s most cited scientists stop falsely claiming to work in Saudi Arabia. El Pais. https://english.elpais.com/science-tech/2024-12-05/dozens-of-the-worlds-most-cited-scientists-stop-falsely-claiming-to-work-in-saudi-arabia.html

De Castro, P., Herb, U., Rothfritz, L., Schmal, W. B., & Schöpfel, J. (2024). Galvanising the open access community: A study on the impact of Plan S. Zenodo. https://doi.org/10.5281/zenodo.13738478

Catanzaro, M. (2023, May 5). Saudi universities entice top scientists to switch affiliations–sometimes with cash. Nature, 617, 446–447. https://doi.org/10.1038/d41586-023-01523-x

Magazinov, A. (2023, July 31). The Vickers curse: Secret revealed! For Better Science. https://forbetterscience.com/2023/07/31/the-vickers-curse-secret-revealed/

Mallapaty, S. (2025, March 4). China’s supreme court calls for crackdown on paper mills. Nature, 639, 285–286. https://doi.org/10.1038/d41586-025-00612-3

Matusz, P., Abalkina, A., & Bishop, D. V. M. (2025). The threat of paper mills to biomedical and social science journals: The case of the Tanu.pro paper mill in Mind, Brain, and Education. Mind, Brain, and Education. https://doi.org/10.1111/mbe.12436

Meho, L. I., & Akl, E. A. (2025). Using bibliometrics to detect questionable authorship and affiliation practices and their impact on global research metrics: A case study of 14 universities. Quantitative Science Studies, 6, 63–98. https://doi.org/10.1162/qss_a_00339

Parker, L., Boughton, S., Bero, L., & Byrne, J. A. (2024). Paper mill challenges: Past, present, and future. Journal of Clinical Epidemiology, 176, 111549. https://doi.org/10.1016/j.jclinepi.2024.111549

Porter, S. J., & McIntosh, L. D. (2024). Identifying fabricated networks within authorship-for-sale enterprises. Scientific Reports, 14, 29569. https://doi.org/10.1038/s41598-024-71230-8

Richardson, R., Hong, S., & Nunes Amaral, L. A. (2024, October 1). Hidden hydras: Uncovering the massive footprint of one paper mill’s operations. Retraction Watch. https://retractionwatch.com/2024/10/01/hidden-hydras-uncovering-the-massive-footprint-of-one-paper-mills-operations/

Schneider, L. (2024, August 5). Retraction blackmail – new service by Iranian papermills. For Better Science. https://forbetterscience.com/2024/08/05/retraction-blackmail-new-service-by-iranian-papermills/
