Since the founding of Invest in Open Infrastructure (IOI)—a nonprofit initiative dedicated to advancing the adoption of and investment in open research infrastructure where I serve as executive director—my colleagues and I have spent countless hours examining the outlines, descriptors, contextual limitations, criteria, and properties of “open infrastructure” as a phrase and concept. Each conversation we’ve had or hosted about this fraught term has led at best to fleeting points of alignment and ongoing tests to see whether one’s interpretation of “open” or “infrastructure” was included in others’ definitions.
I’d like to say that these conversations led to clarity and agreement on core concepts. More often than not, they instead challenged the boundaries and assumptions that get baked into the term by whatever community is invoking it. We have found that any attempt to cast a definitional lens over a concept that is so reliant on context only produces more questions. As our resident linguist has reminded us, some terms are better described than defined (Collister, 2024).
“Open infrastructure” is a particularly important term for us, one that’s in our organizational name and at the heart of our mission and strategy. It is intrinsically bound up with our choices regarding where and how we invest our energy and time to effect change. As with so many terms, it is not straightforward at all, as our team’s several earlier attempts at definition attest.
What’s important to us at IOI are the conversations themselves, along with a continued examination of where definitions might intersect. It’s always subjective, but we think making infrastructure visible, legible, and interpretable to a wider audience is critical to driving more interest, support, and funding to the underlying systems powering research.
To that end, in 2021–22 our team conducted a review of definitions of “infrastructure” (including cyberinfrastructure, open infrastructure, e-infrastructure) across a body of literature that includes works in anthropology, scholarly communications, international development studies, science and technology studies, and infrastructure studies, as well as more recent examples, such as the definitions and frameworks outlined by UNESCO, the Digital Public Goods Alliance, and others.
Exploring the nuance in how and where these definitions align, overlap, build on one another, and add descriptive characteristics can help us understand more fully the challenge in applying a one-size-fits-all frame to technical and social systems—and highlight the need for a change in approach to these conversations.
A Look at Definitions … and How They’ve Evolved
The definitions we surveyed tended to frame the concept of infrastructure as a network. Infrastructure is described as consisting of disparate entities—both technical (hardware and software) and social (practices, norms, and structures)—that together facilitate the linking and/or movement of ideas, signals, objects, and people (Larkin, 2013).
For example, a 2003 definition described cyberinfrastructure (the term of art at the time, preceding discussions of “e-infrastructure” and then “open infrastructure”) as the “layer of enabling hardware, algorithms, software, communications, institutions, and personnel. This layer [provides] an effective and efficient platform for the empowerment of specific communities of researchers to innovate and eventually revolutionize what they do, how they do it, and who participates” (Atkins, 2003).
Jumping forward to 2007, definitions of infrastructure begin to encompass the organizational networks critical to supporting the physical and technical aspects of infrastructure. According to Schroeder, the term “e-infrastructure” is used “in the first instance to designate the physical or material components of [a large] technological system, the advanced electronic networks that make use of the Internet and the Web, as well as, secondarily, the organizational networks that are supported by this system” (Schroeder, 2007).
In 2010, we also see an emphasis on social infrastructure, as Edwards describes knowledge infrastructure as the “robust networks of people, artifacts, and institutions that generate, share, and maintain specific knowledge about the human and natural worlds” (Edwards, 2010).
UNESCO’s Recommendation on Open Science, adopted by 193 member states in 2021, describes “open infrastructures” as both “virtual or physical, including major scientific equipment or sets of instruments, knowledge-based resources such as collections, journals and open access publication platforms, repositories, archives and scientific data, current research information systems, open bibliometrics and scientometrics systems for assessing and analyzing scientific domains, open computational and data manipulation service infrastructures that enable collaborative and multidisciplinary data analysis and digital infrastructures.” The Recommendation goes on to state that shared research infrastructures “should be not-for-profit and guarantee permanent and unrestricted access to all public to the largest extent possible” (UNESCO, 2021).
The Digital Public Goods Alliance’s DPG Standard, first released in 2020, considers “digital public goods” to be open-source software, open data, open AI systems, and open content collections that adhere to privacy and other applicable best practices, do no harm by design, and are of high relevance for attainment of the United Nations 2030 Sustainable Development Goals (SDGs).
And this is just a sampling of definitions connected to “open infrastructure.” Others more explicitly describe the labor and (often invisible) human infrastructure that has developed atop the physical and digital infrastructure layers as they have evolved over the decades. Still others focus on the values and principles that undergird conversations around infrastructure to advance open science, open access, and digital public good initiatives.
In our work to better understand where these definitional boundaries lie, we’ve found that these terms can risk doing harm if they’re not interpreted alongside and in concert with the realities in which they’re due to be applied. Community context is key, and continued, deep engagement and conversation are critical to maintaining that link.
The Importance of Community Context
Each of these definitions, sets of criteria, and descriptions builds on some variation of the notion that infrastructure refers to the underlying systems, tools and instruments, protocols, platforms, and networks that a community relies on. At IOI, we are focused specifically on the research ecosystem, with “reliance” meaning that a disruption in the availability of a core infrastructure significantly affects one’s ability to conduct, disseminate, or discover research.
Think for a moment of the physical infrastructure many of us engage with daily: the roads and bridges in our local communities. We like to draw a parallel to that level of reliance to help determine whether a new piece of software or tool counts as research infrastructure—and where to draw that line.
“Reliance” for a community isn’t always straightforward to qualify or quantify. What a research discipline, or, on a more granular level, a regionally or culturally defined research community, depends on as critical infrastructure for its work varies significantly in ways that pure usage numbers or attributes of technical platforms don’t always signify.
The same can be said about core attributes that a specific community may prioritize when it comes to “open infrastructure.”
At IOI, we engage deeply with research communities across four continents, ranging from major research institutions to tribal states and colleges, research-performing organizations and NGOs, and research collectives advancing critical studies on climate, health, food scarcity, and cultural heritage. In our work with these communities, we have seen both the benefits and the drawbacks of approaching “open infrastructure” through a solely definitional lens.
On one hand, drawing a definitional boundary can allow us to speak more clearly to the specific needs of a community and what they rely on for their scholarship, such as a specific digital collection or a dataset that is critical to an area of study and irreplaceable as core “infrastructure” (think, for example, of the Sloan Digital Sky Survey, or of cultural heritage collections that may be hyper-specialized and central to a discipline). On the other hand, applying a definitional lens can leave certain infrastructures and organizations feeling excluded or “missed” in the conversation, which can lead to the splintering of communities that could achieve more collective progress if they worked together rather than individually. This can be seen in discussions of scholarly communications infrastructure versus data, research software, and other digital infrastructure: each dialogue caters to a specific set of funders or organizations, even though these groups often share similar aims, pain points, and challenges. The balance of community specialization with collective coordination and generalization can be difficult even in the best circumstances.
Without the voice and active engagement of the research communities we aim to support, we can easily lose sight of why we are advocating for the adoption and use of open infrastructure altogether—and unintentionally draw boundaries that could exclude the perspectives we should be honoring most.
Conclusion
Definitions serve as useful starting points in research. They attempt to place boundaries around a term in order to capture its exact meaning. But, in a project of any complexity, definitions quickly show their limitations.
We’ve thought a lot about how to delineate what counts as “open infrastructure” (or doesn’t) for research, including looking at various frameworks and thinking of degrees of openness or embeddedness on a spectrum. We’ve also engaged in deep conversations about the complexities of determining what counts as “infrastructure” without community context and participation. The term is simultaneously incomplete and somewhat boundless.
Especially perplexing is that now so much of what we in scholarly communications and tech circles used to describe with the term “open infrastructure” is no longer pinpointable as an object at all. Rather, it involves networks of active, changing, and dynamic stuff in a cloud. It’s a process, and a murky one at that. There is no “final” framework or phrase; instead, there is a spectrum that might include open code, open access, open APIs, open tools, and open network services.
In the end, really, we want to talk not just about dogmatic “openness” but about what is useful for the communities we aim to support: the researchers looking for ethical systems, platforms, and tools that don’t put up barriers to use via pricing or paywalls and that are designed to represent their evolving needs in authentic, community-driven ways.
To us, that means more flexibility in describing infrastructure to meet the communities we care about where they are, without inadvertently keeping anyone out of the conversation.
References
Atkins, D. (2003). Revolutionizing Science and Engineering Through Cyberinfrastructure: Report of the National Science Foundation Blue-Ribbon Advisory Panel on Cyberinfrastructure. https://repository.arizona.edu/handle/10150/106224
Collister, L. (2024). Describing Open. Katina. https://katinamagazine.org/content/article/open-knowledge/2024/describing-open
Edwards, P. (2010). A Vast Machine: Computer Models, Climate Data, and the Politics of Global Warming. The MIT Press.
Larkin, B. (2013). The Politics and Poetics of Infrastructure. Annual Review of Anthropology, 42:327–343. https://www.annualreviews.org/content/journals/10.1146/annurev-anthro-092412-155522
Schroeder, R. (2007). e‐Research Infrastructures and Open Science: Towards a New System of Knowledge Production? Prometheus, 25(1):1–17. https://www.scienceopen.com/hosted-document?doi=10.1080/08109020601172860
UNESCO. (2021). UNESCO Recommendation on Open Science. https://unesdoc.unesco.org/ark:/48223/pf0000379949.locale=en