open science and information professionals

“Service opportunities have been revealed for supporting the research process, sustaining and capturing the non-published conversations of science, and curating the resulting data” (Ogburn).

The changes brought by open science have important implications for information professionals, who must now rethink what it means to collect and provide access to scientific findings. Efforts are already well underway to curate the masses of data entering the public domain thanks to the open data movement. Because preserving data in an accessible format requires proper planning and funding, data archivists must be involved, proactively, at the beginning of the research cycle – before funding decisions are made and software platforms selected (Ghosh). This early participation helps ensure that data formats are appropriate and non-proprietary, that data management schemes are sensible and clearly communicated, and that funding is sufficient.

The knowledge and resources required to properly collect and preserve such data given its particular characteristics – compliance or not with (competing) standards, (possible) dependency on proprietary software, its intrinsically technical quality, and so forth – require new competencies from information professionals. Such a shift is already affecting the curricula of library schools.

Another factor is that, once we consider the varied artifacts of open research, the journal article is no longer the only unit to be collected, preserved, and distributed. Should information professionals attempt to curate living content, and if so, how? Such content ranges from early versions and post-publication revisions of the published article to artifacts as ephemeral as wiki pages, blog posts, discussion boards, and even tweets. Dorothea Salo states:

While a few scientific publishers are beginning to accept and even require supplementary data deposition, and a few research libraries are evaluating data curation as a potential professional specialization, even these have no useful response as yet to the ‘Open Notebook Science’ movement.

The answers to these questions are not yet clear, but the information professional may need to take a leadership role in convincing researchers that what they produce along the way to a published paper is worth preserving.

One of the most essential roles for the information professional in the age of open science (and, more generally, of open access) is that of advocacy. For instance, the members of the ALA who opposed the Research Works Act may rightly feel some satisfaction that their strong position perhaps contributed to the act’s defeat. In addition, information professionals can help find a place for the “grey literature” that has historically been ignored by traditional publishers. And finally, as interest in open data grows, information professionals can expound the virtues of the Panton Principles, helping researchers design projects that will yield preservable data suitable for the public domain.



Flood, Alison. “Scientists sign petition to boycott academic publisher Elsevier.” Guardian. 2 Feb. 2012. Web.

Ghosh, Maitrayee. “Information professionals in the Open Access Era: the competencies, challenges and new roles.” Information Development 25.1 (2009): 33-42. Web.

Ogburn, Joyce L. “The imperative for data curation.” portal: Libraries and the Academy 10.2 (2010): 241-246. Web.

Panton Principles. “Principles for open data in science.” Web. Retrieved June 1, 2013 from

Salo, Dorothea. “Who owns our work?” Serials: The Journal for the Serials Community 23.3 (2010): 191-195. Web.


open science: the future?

One possible future for open science is the broadening and democratizing of the peer review process. Certainly, more web-based tools are becoming available that enable greater public participation in establishing the “truth” of web content – and not just scientific findings. Their arrival is of great interest not only to information professionals but also to any consumer of content on the internet. One such tool is open-source software that allows users to annotate anything found online without fear that the content owner can revise or remove the comment (Giles 46). Its inventor, Dan Whaley, calls it “the Internet, peer reviewed” (qtd. in Giles 46).

Another possible future is that high-impact journals may become involved primarily after the peer review process is complete, collecting the articles that the peer reviewers (who may be members of the public) deem most valuable. Publishing them formally at that point in the cycle might retain the notion of an impact factor, still important to many authors, while also meeting some of the demands of open science, namely transparency and open collaboration.



Giles, J. “Truth Goggles.” New Scientist 15 Sept. 2012: 44‐47. Print.

open science, closed science: a continuum

The following chart shows where traditional science and scientific publishing as well as the newer open movements fall along the continuum that lies between closed and open science. This chart is derived from a similar picture presented by JC Bradley of Drexel University at the 2007 American Chemical Society Symposium on Communicating Chemistry. Bradley’s version of this chart did not include “open research” or “open resources”; I added these at what I think are appropriate positions. My reasoning is described below.


[Chart: open science, closed science: a continuum]

The traditional unpublished lab book is the epitome of closed science (Bradley). Not only are methods and data hidden from view, but the results themselves are private as well. One step up from that is to make those results (and methods, too, if things are done properly) available in a traditional journal. Even then, we do not have access to the data, and we may have to pay to access the results. Enter the open access journal or repository, where we can freely access the findings regardless of our affiliations. Open data, another step along, allows us to scrutinize and reuse the data that underlie the findings. And finally, in open research, which includes open peer review and open notebook science, we do not have to wait for the published article, but can see what the researcher is working on before the work is complete and the results are published. And we can participate in the process if we so choose.

I place the notion of “open resources” outside of the continuum because while this approach is important for the practice of science, and while it represents similar values to the other open movements described here, it is not as essential for access to, or evaluation and reuse of, scientific results by the community at large.

Read more:



Bradley, J. C. “Open notebook science using blogs and wikis.” Nature Precedings for the American Chemical Society Symposium on Communicating Chemistry. 2007. Web.


open science: open platform

Processing extremely large datasets soon exhausts the resources of most data centres owned and administered by a single institution. Not wanting, or not being able to afford, an expensive and proprietary supercomputer, many institutions opt instead for clusters of servers cabled together to create a distributed pool of computing resources, on which parallel programs run rather effectively. These servers typically run Linux and other open-source software: the Open Grid Engine job scheduler, for instance, accepts a program, schedules it, and allocates resources such as CPUs, disks, and software licences, all while hiding the complexity of the system from the user. A computing cluster like this need not be maintained by a single institution only; indeed, some of the largest and most successful “home-grown” supercomputers are those shared by several institutions. An example is the Open Science Grid project, “a multi-disciplinary partnership to federate local, regional, community and national cyberinfrastructures to meet the needs of research and academic communities at all scales.”

Not only do such partnerships save on costs, bringing the power of supercomputing to those who may not have been able to afford it on their own (and thereby democratizing the practice of science), but their common platform and toolsets help ensure that the various parties involved can more easily share data with each other.

Read more:



Open Science Grid. “Open Science Grid.” Web. Retrieved June 15, 2013 from

open science: open research

“[The] accumulation of reliable knowledge is an essentially social process” (David, “Understanding the Emergence”).

Open research means being transparent about methods. At a minimum, this means describing research methods in a published paper such that anyone with the requisite skills and resources can scrutinize and attempt to reproduce the results. Open research may also involve an open peer review process, whereby journals expand the circle of peer reviewers to include members of the public. K. Thomas Pickard, a healthcare advocate, describes the value of rethinking the peer review process for medical research, arguing that online social networks give the public new opportunities to engage in the scientific enterprise:

Critics argue that the peer review process is slow, stifles innovation, and lacks transparency (most reviewers remain anonymous). With social networks, alternatives to peer review are emerging. The most commonly employed model is based on comment crowdsourcing, similar to how buyers rate products and sellers on Amazon or eBay. Anonymous peer review is replaced with public reviews that can include the reviewer’s reputation (as determined by peers) to weight the review score. Weighting an author’s reputation can be achieved with concepts such as the author’s scholar factor, h-index, or other “altmetrics.”
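The weighting that Pickard describes can be made concrete with a small sketch. It is only an illustration under stated assumptions: the `Review` type, the 1-to-5 score scale, and the use of a plain reputation-weighted mean are hypothetical choices, not a description of how any journal or platform actually scores reviews. The `h_index` function follows the standard definition of the h-index that Pickard mentions as one possible reputation measure.

```python
from dataclasses import dataclass


@dataclass
class Review:
    score: float       # reviewer's rating (assumed here to be on a 1-5 scale)
    reputation: float  # hypothetical non-negative weight assigned by peers


def weighted_score(reviews):
    """Return the reputation-weighted mean of the review scores."""
    total_weight = sum(r.reputation for r in reviews)
    if total_weight == 0:
        raise ValueError("at least one review needs a positive reputation")
    return sum(r.score * r.reputation for r in reviews) / total_weight


def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    h = 0
    ranked = sorted(citations, reverse=True)
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h
```

Under this scheme, a favourable review from a reviewer with a reputation of 3 counts three times as much as an unfavourable one from a reviewer with a reputation of 1. How reputation itself should be determined is exactly the open question that the "altmetrics" discussion is about.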

In its most open form, open research may involve what is known as “open notebook” science: using the technologies of the internet to make available research details, early findings, and iterations well before, or even instead of, formal publication, and to encourage participation from others.

Advocates of open notebook science maintain that such transparency in the early stages of a scientific endeavor has two important advantages. First, it allows for iterative adjustments in methodology that may improve the quality of the results. Second, publishing early findings allows researchers to establish primacy over their methods and results much more quickly than would be possible were they to wait out the normal months-long cycle for their results to appear in a journal article. In the words of Dorothea Salo, scholarly research services librarian and author of “Who Owns Our Work?”:

Adherents of Open Notebook Science open their entire research process on the web using wikis, Google Docs and similar online tools. Notably, Open Notebook Science allows its practitioners to establish visible, verifiable primacy over their processes and the results thereof, which potentially undercuts publishers both by reducing scientists’ pre-publication ‘scooping’ fears and by providing a substitute for the supposed primacy verification offered by formal publication.

It is important to note, however, that there is nothing in the definitions of “open” provided previously that requires any researcher to practice open notebook science; it is perfectly in line with the values of the Budapest Open Access Initiative and the Open Knowledge Foundation (OKF) to conduct, publish, and disseminate open science without (say) engaging with the public on a blog every step of the way. Nonetheless, this is indeed what some people mean when they speak of open science.

We have already seen how open research might work. One of the best-known recent examples is the work of Rosie Redfield, a microbiologist at UBC. When scientists funded by NASA reported in 2010 that they had found arsenic-based life forms on Earth, Redfield expressed concerns on her blog about the quality of the research. Her assessment spread rapidly over social media – through Twitter among other means – and in the press, casting immediate doubts on NASA’s findings (Zimmer).

Known as the #arseniclife affair, the episode “is one of the first cases in which the scientific community openly vetted a high-profile paper, and influenced how the public at large thought about it” (Zimmer). In 2011, the journal Nature included Redfield in its list of the top 10 “people who mattered” for the year (Hayden).

Read more:



Budapest Open Access Initiative, September 12, 2012. Web. Retrieved Jun. 16, 2013 from

David, Paul A. “Understanding the emergence of ‘open science’ institutions: functionalist economics in historical context.” Industrial and Corporate Change 13.4 (2004): 571-589. Web.

Hayden, E. C. “365 Days: Nature’s 10. Ten People Who Mattered this Year. Rosie Redfield, Critical Enquirer.” Nature 480 (22 Dec. 2011): 437–445. Web.

Open Knowledge Foundation. “Open definition.” Web. Retrieved June 13, 2013 from

Pickard, K. T. “Impact of open access and social media on scientific research.” J Participat Med 4 (2012): e15. Web.

Salo, Dorothea. “Who owns our work?” Serials: The Journal for the Serials Community 23.3 (2010): 191-195. Web.

Zimmer, C. “The Discovery of Arsenic‐Based Twitter: How #arseniclife Changed Science.” Slate 27 May 2011. Web.

open science: open data

As defined by the Open Knowledge Foundation, “a piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” Open data advocates believe that data gathered during scientific inquiry should be made available for scrutiny and reuse by other researchers. However, the current system puts up barriers on all sides: publishers may have licensing restrictions; data may be inaccessible or in a proprietary format; and researchers themselves may be reluctant to give up control. Despite these obstacles, the open data movement is growing.

The Panton Principles, launched in 2010 by the Open Knowledge Foundation Working Group on Open Data in Science, are an attempt to facilitate the widespread adoption of open data. In brief, these principles state that publishers should use a legal statement to make explicit their wishes regarding the reuse and repurposing of some or all of the data. This legal statement should be appropriate for use with data; the Panton Principles website lists waivers and licenses that are appropriate for data as well as those that are not. In declaring data available for reuse by others, the Principles state that publishers should use the definition of “open” set out by the Open Knowledge Foundation. And finally, publishers should dedicate data to the public domain in conformance with the Science Commons and OKF guidelines.

Read more:



Open Knowledge Foundation. “Open definition.” Web. Retrieved June 13, 2013 from

Panton Principles. “Principles for open data in science.” Web. Retrieved June 1, 2013 from

open science: open access

The principle of open access is simple enough. It is about being able to find and read research and scholarship online at no additional cost (Willinsky).

Relevant to any discussion of open science is the definition of “open access” as set out in the Budapest Open Access Initiative in 2002 and reaffirmed in 2012:

… its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

While it is still the case that only a small proportion of peer-reviewed literature is available without charge, the majority of journals now grant permission for authors to post work in institutional repositories. The number of open access journals is growing, and these tend to be cited more often than those that charge fees for access (Willinsky). These are signs that the open access movement is making progress.

As I argue here, it is particularly important that research results that have come about as a result of public funding be made available to that very same public without additional charges. Fortunately, public research funding agencies are coming to the same conclusion. In 2007, for example, the Canadian Institutes of Health Research (CIHR) became the first public research funding agency in North America to adopt an open access policy. From that policy:

Grant recipients are now required to make every effort to ensure that their peer-reviewed publications are freely accessible through the Publisher’s website (Option #1) or an online repository as soon as possible and in any event within six months of publication (Option #2)…. Under the second option, grant recipients must archive the final peer-reviewed full-text manuscripts immediately upon publication in a digital archive, such as PubMed Central or the grantee’s institutional repository.

Likewise, in the United States, the National Institutes of Health (NIH) adopted a similar policy in 2008:

The Director of the National Institutes of Health shall require that all investigators funded by the NIH submit or have submitted for them to the National Library of Medicine’s PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication.

The public has access to these findings through the Open Science Directory, which provides a single interface to more than 13,000 open science journals, including those in the PubMed Central archive that is named in both the CIHR and NIH policies.

It is worth noting, however, that there have been attempts in the United States to challenge open access policies. In 2011, for instance, the Research Works Act was introduced in the House of Representatives; its goal was to rescind open access policies and to limit the sharing of scientific data. This bill was backed by the Association of American Publishers (AAP), an association no doubt concerned for its future in the face of the open access movement. It appears that protests by those opposed to the bill, including the American Library Association (ALA), have effectively halted its progress (Howard). Nonetheless, this close call serves as a reminder that open science, and open access in general, has powerful opponents in the for-profit publishing world.

Read more:



Association of American Publishers. “Publishers Applaud ‘Research Works Act,’ Bipartisan Legislation To End Government Mandates on Private-Sector Scholarly Publishing.” Web.

Budapest Open Access Initiative, September 12, 2012. Web. Retrieved Jun. 16, 2013 from

ePrints. “OA Self-Archiving Policy. Canadian Institutes of Health Research (CIHR).” Web.

HLWIKI International. “Open access.” Web. Retrieved June 13, 2013 from

Howard, Jennifer. “Legislation to Bar Public-Access Requirement on Federal Research Is Dead.” Chronicle of Higher Education. 19 Jun. 2013. Web.

National Institutes of Health Public Access. “NIH Public Access Policy Details.” Retrieved June 13, 2013 from

Open Science Directory. “Open Science Directory.” Web. Retrieved June 13, 2013 from

Willinsky, John. “The unacknowledged convergence of open source, open access, and open science.” First Monday 10.8-1 (2005). Web.