There is a petition on-line asking the President to require the posting of “published results” of federally funded research on-line for public access. A story about it is here. The effort is led by John Wilbanks, who previously led Science Commons and now is a senior fellow with the Kauffman Foundation, where he is continuing his public advocacy for open science. In general this is a good thing. If you want to sign the petition, it’s here.
If one focuses on university research, perhaps the first question is why such a petition is even necessary. OMB Circular A-110 (2 CFR 215) is the primary regulatory document that comes into play. It sets up a government license (.36(a)) to publish “any work that is subject to copyright” developed or acquired under a federal award, and to permit others to do so. A similar right is established for data (.36(c)). Furthermore, if federally funded work is (i) published and (ii) becomes the subject of rule-making with the force of law, then the backing data is required to be disclosed if requested under FOIA (.36(d)). “Data” is defined as “recorded factual material commonly accepted in the scientific community as necessary to validate research findings”. The definition excludes “preliminary analyses, drafts of scientific papers, plans for future research, peer reviews, or communications with colleagues” as well as “physical objects” and “trade secrets, commercial information, materials necessary to be held confidential by a researcher until they are published, or similar information which is protected under law” and “personnel and medical information”.
Why does one need (d) when one already has (a) and (c)? Part of the reason has to do with grant deliverables. While the government may have a license to copyright works and data, grant awards may not require the delivery of such works to the government. Thus, while the government has the right to publish such materials, it can’t if it doesn’t have them in its possession. A second reason is that just because the government does have possession of materials, this does not mean the government will use its licensed authority to publish the works it receives. To so publish, it has to have a “Federal purpose”, and indeed there may be good reasons why the government chooses not to publish.
There is more to it, of course. Under a grant, the investigators typically have an obligation to submit reports, including a final report, and there may be requirements for delivery of data as well. However, there typically is not a requirement to publish results. A grant may be evaluated on the basis of such publications–especially where a grant is multi-year, and future grant applications may be evaluated based on the publication record developed under prior funding. But typically one is not required to publish unless the deliverables of the grant work are inherently of a public nature, such as building a web site for public information or education on a given subject.
Thus, a simple move for the government is to require agencies to act on the licenses they hold in copyright works and data developed with federal funding. For this, one might use an executive order that establishes that it is a Federal purpose to inform the public timely of results of publicly funded research, with exceptions where access would be determined to be damaging–such as the location of an archeological site that had not yet been secured. Such an executive order would also remind agencies to require delivery of backing data for any final report, or deposit of that data in a repository.
But “data” is a fickle thing. There are raw measurements, data structures that hold recorded measurements, metadata about the calibration and conditions under which the data were recorded, adjustments and weightings of data and reasons for these adjustments and weightings, and analysis of data, often performed by software that is itself configured or custom developed for the purpose. Raw data without metadata is useless. Data without the analysis software requires someone else to create their own software, and even then it is not clear that anyone can replicate the operation of the original software, so the analysis behind the results cannot be checked. Thus, “data” means much more than one might expect, and it is not clear at all that section (d) establishes a sufficient scope to get at all such data.
There is a further issue, and that is publication. The present petition concerns itself with published science. In essence, if a work is published, then it also has to be posted for open access. One workaround, then, is not to publish, but rather to circulate results privately. Skip the publishers. It doesn’t matter for faculty who have tenure, and in fact keeping less important stuff out of the journals might be a good thing. Another workaround is to publish a core finding, but wait until the award has ended and then publish a re-analysis or extended analysis of the data, without “taxpayer” funding. Such articles would then fall outside the scope of the petition. A third workaround is to limit the statement of work to results directed at a particular application. Thus, research directed at improving a given algorithm for rendering 3D images might focus on reporting improvements in the algorithm, but an immediate application in, say, surgery planning would then be outside the scope of the research and not subject to the requirement to post any account of that application on-line.
There is also a significant challenge in dealing with the costs of “posting” materials, especially if metadata and tools are all to be provided in clear, working order. Research is messy, and some investigators are much messier than others. Given the rise in accounts of “bad science”, perhaps it would be good to require more public reporting, as this might lead to greater public accountability for messiness that leads to badness. But especially in on-line environments, digital materials may require maintenance long after an award has ended. While some public repositories exist, such as the Public Library of Science, getting things into useable formats is sometimes non-trivial, and posting things “as they are” means they are pretty much useless to others.
One might think that universities would be in the forefront of finding ways to resolve these issues and make sure that research was broadly available. This is not the case. Indeed there are strong arguments made internally *not* to make results and data and metadata and tools available. One class of arguments is political. In the case of the University of Virginia, the administration was more than happy to make even personal emails of a scientist public in response to a public disclosure request when the views of the scientist did not square with the administration’s politics, but denied a similar request when it appeared that the requested data would be used to challenge published results that the university depended on for future research funding. See a discussion here. Here is a set of similar stories of data and results being withheld.
This leads to the second class of arguments, which are tied to our old nemesis “commercialization”. We may separate out “technology transfer” from “commercialization”. Technology transfer is the movement of the capability to do something from one group to another, which acquires that capability or techne. For that, a transfer may include knowledge, data, tools, related resources, and intellectual property rights. Transfer is a very social thing, encompassing not only widgets but meta widgets and the relationships of widgets and meta widgets to the big wide world, including risks and benefits. Commercialization, by way of contrast, is an activity that leads to the development of a product to be sold in a marketplace. While commercialization itself may take on a broad range of characteristics, from entirely open to tightly controlled and proprietary, the general practice of “commercialization” in universities, and especially by technology licensing operations, tends to be toward proprietary. Inventions are not reported publicly so that patent applications may be filed. Data are not reported so that others do not have easy opportunities to replicate the invention and add improvements or design-arounds before the university’s own team can (if ever) get to these things. Devices and materials are transferred under material transfer agreements that limit use and claim an interest in any inventions made with the materials, as if the provision of material were of the nature of “sponsoring” research.
These sorts of “protections” of inventions are justified by a prevailing model that starts with the idea of exclusively licensing patent rights to support investment in making commercial product. While any number of technology licensing offices protest that they also do non-exclusive licensing, the reality is that their default operating model favors exclusivity, and non-exclusive arrangements come as a matter of granting licenses to research sponsors and in the context of software and web media, based in copyrights. For inventions, technology licensing policies and practices are set up to assume exclusive licensing, if for no other reason than as a “precaution”–“just in case” such opportunities arise.
Thus, when it comes to open publication, technology licensing offices operating with a presumption of exclusivity will likely see such a requirement as a threat. Open publication will make it difficult to obtain patent “protection”, will create competition in the private sector that can design around or file blocking improvement patents, diminishing the value of the university’s patent and making licensing more difficult. These things might even be true in some circumstances, but the difference between technology transfer and “commercialization” then becomes apparent as well. The very arguments in support of commercialization are also ones that show how significantly university administrations actively seek to delay transfer of technology until they have found a partner willing to (i) make commercial product rather than simply practice or study; and (ii) pay a share of sales of the product as a continuing royalty. To get such deals–or at least to make the attempt–technology licensing offices see concerns for confidentiality, non-circulation of data, and limitation of publication to be key.
There are, of course, open models for both transfer and commercialization that are particularly appropriate for a majority of research-developed assets–whether data, tools, materials, devices, software, reports, inventions, or insights. Transfer may take the form of augmented instruction, providing more than lecture and lab–data, materials, and rights. Commercialization may arise from a robust common platform of technology and users, and not be an express condition of access or even involve a formal agreement with a university. In either case, public access to information is a fundamental starting point for success, not a threat to that success.
In terms of social network development, one has to create critical mass before congestion in the network. Intellectual property is a form of congestion. Applying it too soon prevents the formation of a critical mass, and therefore works against the very social adoption that is necessary for a research asset (or the relationships it helps to create) to have the chance to become broadly useful or financially valuable. The conventional “commercialization” approach at many university licensing offices, whether focused on licensing to established companies or following the latest trend to create shell startup companies and then seek investment funding, imposes congestion on open models in the hope of raising the value of intellectual property–and inventions in particular–for future licensing deals.
Public access to research assets of all sorts creates the potential for working relationships, establishment of trust, trade in research information, tools, and insights, and development of shared libraries, platforms, and ad hoc standards. Such an approach is critical to creating the “reef” or “city” of activity that allows ideas to mix and match and find the people who are ready to act on them to develop the adjacent possible variations, to add unexpected new elements, to find new properties or applications. Without the reef, there is no research ecosystem. Everything is trenches and ugly sharp-toothed fish with luminescent lures to attract the foolish and the desperate.
Open access should be the default for *most* research assets. The government as the major sponsor of most university research has a responsibility to level the playing field for access, and provide an opportunity for open advocates to make the case for restoring openness both of information and IP as a distinctive anchor of university research enterprise practice.