Asked to name key or primary research outputs—for the sake of this discussion, let’s say these “first-class” outputs reflect and reinforce a healthy research ecosystem—most people would surely list journals and books, probably also preprints and research data. What about research software?
Earlier this year, Invest in Open Infrastructure (IOI) released The State of Open Research Software Infrastructure, a report that presents the key findings and recommendations of a Sloan Foundation-funded project (of which I was part) to review the research software infrastructure landscape.
There are no commonly accepted criteria for evaluating whether outputs have attained “first-class” status. But gaps in their recognition, funding, and/or supporting infrastructures can be signs that the ecosystem does not yet fully support the breadth and interconnectedness of the outputs that reflect its work. If we are to recognize how research software fits into the overall picture of research, we first need to understand what is unique about it and its contributions.
What Is “Research Software”?
As the report Defining Research Software: a controversial discussion (Gruenpeter et al., 2021) makes clear, there are complexities to defining research software, which comprises a number of components, user roles, and lifecycle stages. In interviews, the IOI research team (Kemp & Tsang, 2024) found that experts generally use a shorthand or personal definition, which the report summarizes as “software specific to research.”
There is general agreement that software used in research but not created by or for researchers, such as Microsoft Excel, is not research software.
What started several decades ago in a handful of fields is now common in a wide range of disciplines. Names like Jupyter and Python will probably sound familiar to many; they have large user communities. But there is a very long tail of niche research software that doesn’t have (or necessarily need) ongoing open-source development, as well as research software at various levels of use and development in between.
Research software is developed for a variety of functions, but analyzing and managing research data of different types is probably most relevant to this discussion; in research, data and software are closely paired.
Related Outputs and Communities
In research and supporting communities, the software and data ecosystems are fairly distinct. The research software community developed from grassroots efforts, largely within academia, to fill gaps in a wider landscape that has historically overlooked research software as a primary output. For example, faculty are often not formally trained in creating or managing research software. Other areas of focus include peer review for research software, discovery, publishing and archiving, advocacy, and best practice guidance, for example on citing software, which is still not a routine practice. In addition, the community has developed a set of FAIR Principles specifically for research software, known as the FAIR4RS Principles (Barker et al., 2022).
Of course, most of these issues are not limited to research software. For example, research software, like data, needs to be prepared before it can be shared and used by others. And openly available does not always mean easy to find: whether an output is indexed, and where it is hosted (an institutional or dedicated software repository, a personal website, or a department server), can significantly affect whether it is found and reused.
Perhaps it is not surprising that the research software community is largely independent of others, given that the functional lifecycle of research software faces a distinctive set of issues. For example:
- Funders tend to focus on what is new. In the research software context, that means it is hard to get support for maintenance, though that may be starting to change: the Software Sustainability Institute recently received £4.8m in funding from UK Research and Innovation (UKRI) to support maintenance.
- Research software may involve versioning, as well as eventual sunsetting and related archiving. These lifecycle stages for research software may require different management than related phases in text-based or less “dynamic” research outputs.
- Though research software is vital for its specialized purpose, its use and reuse are hard to gauge; product management functions (such as tracking downloads and usage, or user experience work) are uncommon.
A recent report of the American Council of Learned Societies (ACLS), Other Stories to Tell: Recovery Scholarship and the Infrastructure for Digital Humanities, discusses many of these issues as they relate to the digital humanities; it also mentions the role of research software engineers. The emergence of this professional role, while certainly not universal, highlights the specialized needs of research software, as well as the evolution of the landscape.
Functional and Symbolic Importance
Openly available research software has benefits for its own field and for related fields and stakeholders. For example, reuse of software is a key part of research reproducibility and research integrity: verifying conclusions means being able to run the software developed for the data on which those conclusions are based. Research software also helps to show the full breadth of outputs from an institution or funder. These practical benefits and the signals they send require the visibility of research software as much as its availability.
The Software Sustainability Institute maintains a list of journals focused on research software. Publications such as The Journal of Open Source Software (JOSS) register DOIs (along with performing other publishing functions, such as peer review and archiving), providing the kind of structured, interoperable metadata that helps with discovery. DataCite Commons now has over 650,000 DOIs for software records and nearly 12 percent of all 112+ million records have cited software (as of May 21, 2025).
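To make the discovery benefit concrete, the sketch below queries the public DataCite REST API for records registered with the software resource type. It is a minimal Python example using the requests library; the endpoint and response fields reflect the public API at the time of writing and may change.

```python
import requests

# Query the public DataCite REST API for DOIs registered as software.
# Minimal sketch; endpoint and response fields reflect the API at the
# time of writing and may change.
resp = requests.get(
    "https://api.datacite.org/dois",
    params={"resource-type-id": "software", "page[size]": 5},
    timeout=30,
)
resp.raise_for_status()
payload = resp.json()

# The "meta" block reports the total count; "data" holds the records.
print(f"Software DOIs indexed: {payload['meta']['total']}")
for record in payload["data"]:
    attrs = record["attributes"]
    title = attrs["titles"][0]["title"] if attrs.get("titles") else "(untitled)"
    print(f"{attrs['doi']}: {title}")
```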
Still, policies related to research often omit research software or mention it only in passing. The US OSTP Nelson Memo, for example, which directs federal agencies to make federally funded research results and data publicly accessible, discusses scientific data but does not mention research software at all; the US National Institutes of Health (NIH), by contrast, has a page devoted to Best Practices for Sharing Research Software. Several recent public statements on research, such as the Barcelona Declaration, the UNESCO Recommendation on Open Science, and the San Francisco Declaration on Research Assessment (DORA), call for research software to be treated as an output in its own right (with some variation in language). Funder-specific policy guidance is a focus of several recent reports, including one from Science Europe and a just-released report from the Policies in Research Organisations for Research Software (PRO4RS) Working Group (Hernandez et al., 2025) that focuses on a framework for institutional research software policies; the working group is one of a few examples of joint efforts on research software with the Research Data Alliance (RDA).
A Different Kind of Class Division
Absent an agreed set of criteria or framework for determining what a first-class research output is, the status of emerging outputs will remain up for debate. Is it first-class when included in “enough” existing social and technical infrastructures, for example? A litmus test for many is to what degree an output type is factored into tenure and promotion in higher education (even though tenure has been on the wane for a while). Some may wonder: why is this not on the programs at conferences I attend? What do the LibGuides and publisher and funder policies look like?
Citations, the currency widely understood to contribute toward first-class output status, are not as straightforward or common for software (and data) as for text-based outputs, though principles and guidance are available: the FORCE11 Software Citation Working Group published its principles nearly ten years ago (Smith et al., 2016).
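One practical expression of those principles is the Citation File Format (CITATION.cff), a machine-readable metadata file that many code repositories now include so that others know how to cite the software. The sketch below (the project details are invented for illustration) parses such a file with PyYAML and assembles a citation string from it.

```python
import yaml  # PyYAML

# A hypothetical CITATION.cff; the field names follow the Citation File
# Format specification, but the project details are invented.
CITATION_CFF = """\
cff-version: 1.2.0
title: Example Analysis Toolkit
version: 2.1.0
doi: 10.5281/zenodo.0000000
date-released: "2025-01-15"
authors:
  - family-names: Doe
    given-names: Jane
"""

meta = yaml.safe_load(CITATION_CFF)
author = meta["authors"][0]

# Assemble a simple human-readable citation from the structured metadata.
citation = (
    f"{author['family-names']}, {author['given-names'][0]}. "
    f"({meta['date-released'][:4]}). {meta['title']} "
    f"(Version {meta['version']}) [Computer software]. "
    f"https://doi.org/{meta['doi']}"
)
print(citation)
```

GitHub, for example, recognizes this file and offers a “Cite this repository” prompt when it is present.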
Through open metadata and citations, outputs can be linked together. For contributors, institutions and funders, visibility provided through such linking fills gaps, showing, for example, the full set of outputs from a particular research grant and providing insight into the entire research workflow. These connections also provide an opportunity to connect outputs and their impact across disciplines, institutions, and funders, at a system level.
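As one illustration of such linking, the sketch below fetches the DataCite metadata for a single DOI (here, one of this article’s references) and lists its related identifiers, the machine-readable links that connect an output to associated papers, datasets, versions, and software. Again a minimal sketch against the public API; the field names may evolve.

```python
import requests

# Fetch DataCite metadata for one DOI and print its related identifiers,
# the links that tie an output to associated papers, datasets, and versions.
# Minimal sketch; the example DOI is from this article's reference list.
doi = "10.5281/zenodo.5504016"  # Gruenpeter et al. (2021)
resp = requests.get(f"https://api.datacite.org/dois/{doi}", timeout=30)
resp.raise_for_status()
attrs = resp.json()["data"]["attributes"]

for rel in attrs.get("relatedIdentifiers", []):
    print(f"{rel['relationType']}: {rel['relatedIdentifier']}")
```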
Many stakeholders have a role to play in facilitating these connections through policy, best practices and systems that make collecting this information easier. In this time of distrust in science and dismantling of research funding and administration, the ability to recognize, register, reuse, and connect the various interconnected functions and outputs of research seems valuable, necessary, and overlooked at our peril.
References
Barker, M., Chue Hong, N.P., Katz, D.S., Lamprecht, A., Martinez-Ortiz, C., Psomopoulos, F., Harrow, J., Castro, L.J., Gruenpeter, M., Martinez, P.A., & Honeyman, T. (2022). Introducing the FAIR Principles for research software. Sci Data 9, 622. https://doi.org/10.1038/s41597-022-01710-x
Gruenpeter, M., Katz, D. S., Lamprecht, A.-L., Honeyman, T., Garijo, D., Struck, A., Niehues, A., Martinez, P. A., Castro, L. J., Rabemanantsoa, T., Chue Hong, N. P., Martinez-Ortiz, C., Sesink, L., Liffers, M., Fouilloux, A. C., Erdmann, C., Peroni, S., Martinez Lavanchy, P., Todorov, I., & Sinha, M. (2021). Defining Research Software: a controversial discussion (Version 1). Zenodo. https://doi.org/10.5281/zenodo.5504016
Hernández Serrano, P. V., Barker, M., Katz, D. S., Martinez-Ortiz, C., & Shanahan, H. (2025). Identifying Gaps in Research Software Policy: A report from Subgroup 3/4 of the ReSA & RDA Policies in Research Organisations for Research Software (PRO4RS) Working Group (1.0). Zenodo. https://doi.org/10.5281/zenodo.15411757
Kemp, J., & Tsang, E. (2024). [IOI] Research Brief: Community Infrastructure to Further Open Research Software. Zenodo. https://doi.org/10.5281/zenodo.14178127
The American Council of Learned Societies (ACLS). (2025). Other Stories to Tell: Recovery Scholarship and the Infrastructure for Digital Humanities. https://www.acls.org/resources/other-stories-to-tell/
Smith, A.M., Katz, D.S., & Niemeyer, K.E. (2016). Software citation principles. PeerJ Computer Science, 2:e86. https://doi.org/10.7717/peerj-cs.86
Further Reading
If your curiosity about research software is piqued, resources for more information are abundant. The following are a few starting points for some of the topics raised in the article.
Funding:
A recent analysis discusses the international research software funding landscape: Jensen, E.A. & Katz, D.S. (2025). Strategic priorities and challenges in research software funding: Results from an international survey [version 2; peer review: 2 approved, 2 approved with reservations, 1 not approved]. F1000Research, 13:1447. https://doi.org/10.12688/f1000research.155879.2
Infrastructure:
The IOI project produced an infrastructure diagram that gives a useful visual overview of terms, functions and roles, and a Zotero bibliography for more reading: Tsang, E. (2025). [IOI] Research Software Infrastructure Landscape Overview. Zenodo. https://doi.org/10.5281/zenodo.14886707
Policies:
CHORUS tracks publisher policies for software citations.