open science: open research

“[The] accumulation of reliable knowledge is an essentially social process” (David, “Understanding the Emergence”).

Open research means being transparent about methods. At a minimum, this means describing research methods in a published paper in enough detail that anyone with the requisite skills and resources can scrutinize the work and attempt to reproduce the results. Open research may also involve an open peer review process, whereby journals expand the circle of peer reviewers to include members of the public. K. Thomas Pickard, a healthcare advocate, describes the value of rethinking the peer review process for medical research, arguing that online social networks give the public new opportunities to engage in the scientific enterprise:

Critics argue that the peer review process is slow, stifles innovation, and lacks transparency (most reviewers remain anonymous). With social networks, alternatives to peer review are emerging. The most commonly employed model is based on comment crowdsourcing, similar to how buyers rate products and sellers on Amazon or eBay. Anonymous peer review is replaced with public reviews that can include the reviewer’s reputation (as determined by peers) to weight the review score. Weighting an author’s reputation can be achieved with concepts such as the author’s scholar factor, h-index, or other “altmetrics.”
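To make the weighting idea concrete, here is a minimal sketch of how public review scores might be combined using reviewer reputation. It is my own illustration, not Pickard's; the scores, reputation values, and simple weighted average are all hypothetical.

    # Hypothetical sketch (not Pickard's method): combine public review scores,
    # weighting each review by the reviewer's reputation, which could come from
    # an h-index or another altmetric.
    def weighted_review_score(reviews):
        """reviews is a list of (score, reputation) pairs."""
        total_weight = sum(reputation for _, reputation in reviews)
        if total_weight == 0:
            return None  # no reputation information to weight by
        return sum(score * reputation for score, reputation in reviews) / total_weight

    # Three public reviews: (review score out of 5, reviewer reputation)
    reviews = [(4.0, 25), (2.5, 5), (5.0, 10)]
    print(round(weighted_review_score(reviews), 2))  # 4.06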

In its most open form, open research may involve what is known as “open notebook” science: using the technologies of the internet to make available research details, early findings, and iterations well before, or even instead of, formal publication, and to encourage participation from others.

Advocates of open notebook science maintain that such transparency in the early stages of a scientific endeavor has two important advantages. First, it allows for iterative adjustments in methodology that may improve the quality of the results. Second, publishing early findings allows researchers to establish primacy over their methods and results much more quickly than would be possible if they waited out the months-long cycle for their results to appear in a published journal article. In the words of Dorothea Salo, scholarly research services librarian and author of “Who Owns Our Work?”:

Adherents of Open Notebook Science open their entire research process on the web using wikis, Google Docs and similar online tools. Notably, Open Notebook Science allows its practitioners to establish visible, verifiable primacy over their processes and the results thereof, which potentially undercuts publishers both by reducing scientists’ pre-publication ‘scooping’ fears and by providing a substitute for the supposed primacy verification offered by formal publication.

It is important to note, however, that there is nothing in the definitions of “open” provided previously that requires any researcher to practice open notebook science; it is perfectly in line with the values of the Budapest Open Access Initiative and the OKF to conduct, publish, and disseminate open science without (say) engaging with the public on a blog every step of the way. Nonetheless, this is indeed what some people mean when they speak of open science.

We have already seen how open research might work. One of the best-known recent examples comes from the work of Rosie Redfield, a microbiologist at the University of British Columbia (UBC). When scientists funded by NASA reported in 2010 that they had found arsenic-based life forms on Earth, Redfield expressed concerns on her blog about the quality of the research. Her assessment spread rapidly over social media – through Twitter among other means – and in the press, casting immediate doubt on NASA’s findings (Zimmer).

Known as the #arseniclife affair, Redfield’s critique “is one of the first cases in which the scientific community openly vetted a high-profile paper, and influenced how the public at large thought about it” (Zimmer). In 2011, the journal Nature included Redfield in its list of the top 10 “people who mattered” for the year (Hayden).

References

Budapest Open Access Initiative, September 12, 2012. Web. Retrieved June 16, 2013 from http://www.budapestopenaccessinitiative.org/boai-10-recommendations

David, Paul A. “Understanding the emergence of ‘open science’ institutions: functionalist economics in historical context.” Industrial and Corporate Change 13.4 (2004): 571-589. Web.

Hayden, E. C. “365 Days: Nature’s 10. Ten People Who Mattered this Year. Rosie Redfield, Critical Enquirer.” Nature 480 (22 Dec. 2011): 437–445. Web.

Open Knowledge Foundation. “Open definition.” Web. Retrieved June 13, 2013 from http://opendefinition.org/.

Pickard, K. T. “Impact of open access and social media on scientific research.” J Participat Med 4 (2012): e15. Web.

Salo, Dorothea. “Who owns our work?” Serials: The Journal for the Serials Community 23.3 (2010): 191-195. Web.

Zimmer, C. “The Discovery of Arsenic‐Based Twitter: How #arseniclife Changed Science.” Slate 27 May 2011. Web.

open science: open data

As defined by the Open Knowledge Foundation, “a piece of data or content is open if anyone is free to use, reuse, and redistribute it — subject only, at most, to the requirement to attribute and/or share-alike.” Open data advocates believe that data gathered during scientific inquiry should be made available for scrutiny and reuse by other researchers. However, the current system works against open data in several ways: publishers may impose licensing restrictions; data may be inaccessible or stored in proprietary formats; and researchers themselves may be reluctant to give up control. Despite these obstacles, the open data movement is growing.

The Panton Principles, launched in 2010 by the Open Knowledge Foundation Working Group on Open Data in Science, are an attempt to facilitate the widespread adoption of open data. In brief, the principles state that publishers should make their wishes explicit, by way of a legal statement, regarding the reuse and repurposing of some or all of their data. This legal statement should be appropriate for use with data; the Panton Principles website lists waivers and licenses that are appropriate for data as well as those that are not. In declaring data available for reuse by others, publishers should use the Open Knowledge Foundation’s definition of “open.” And finally, publishers should dedicate data to the public domain in conformance with the Science Commons and OKF guidelines.

References

Open Knowledge Foundation. “Open definition.” Web. Retrieved June 13, 2013 from http://opendefinition.org/.

Panton Principles. “Principles for open data in science.” Web. Retrieved June 1, 2013 from http://pantonprinciples.org/

open science: open access

The principle of open access is simple enough. It is about being able to find and read research and scholarship online at no additional cost (Willinsky).

Relevant to any discussion of open science is the definition of “open access” as set out in the Budapest Open Access Initiative in 2002 and reaffirmed in 2012:

… its free availability on the public internet, permitting any users to read, download, copy, distribute, print, search, or link to the full texts of these articles, crawl them for indexing, pass them as data to software, or use them for any other lawful purpose, without financial, legal, or technical barriers other than those inseparable from gaining access to the internet itself. The only constraint on reproduction and distribution, and the only role for copyright in this domain, should be to give authors control over the integrity of their work and the right to be properly acknowledged and cited.

While it is still the case that only a small proportion of peer-reviewed literature is available without charge, the majority of journals now grant permission for authors to post their work in institutional repositories. The number of open access journals is growing, and these tend to be cited more often than journals that charge fees for access (Willinsky). These are signs that the open access movement is making progress.

As I argue here, it is particularly important that the results of publicly funded research be made available to that very same public without additional charges. Fortunately, public research funding agencies are coming to the same conclusion. In 2007, for example, the Canadian Institutes of Health Research (CIHR) became the first public research funding agency in North America to adopt an open access policy. From that policy:

Grant recipients are now required to make every effort to ensure that their peer-reviewed publications are freely accessible through the Publisher’s website (Option #1) or an online repository as soon as possible and in any event within six months of publication (Option #2)…. Under the second option, grant recipients must archive the final peer-reviewed full-text manuscripts immediately upon publication in a digital archive, such as PubMed Central or the grantee’s institutional repository.

Likewise, in the United States, the National Institutes of Health (NIH) adopted a similar policy in 2008:

The Director of the National Institutes of Health shall require that all investigators funded by the NIH submit or have submitted for them to the National Library of Medicine’s PubMed Central an electronic version of their final, peer-reviewed manuscripts upon acceptance for publication, to be made publicly available no later than 12 months after the official date of publication.

The public has access to these findings through the Open Science Directory, which provides a single interface to more than 13,000 open science journals, including those in the PubMed Central archive named in both the CIHR and NIH policies.

It is worth noting, however, that there have been attempts in the United States to challenge open access policies. In 2011, for instance, the Research Works Act was introduced in the House of Representatives; its goal was to rescind open access policies and to limit the sharing of scientific data. This bill was backed by the Association of American Publishers (AAP), an association no doubt concerned for its future in the face of the open access movement. It appears that protests by those opposed to the bill, including the American Library Association (ALA), have effectively halted its progress (Howard). Nonetheless, this close call serves as a reminder that open science, and open access in general, has powerful opponents in the for-profit publishing world.

References

Association of American Publishers. “Publishers Applaud ‘Research Works Act,’ Bipartisan Legislation To End Government Mandates on Private-Sector Scholarly Publishing.” Web.

Budapest Open Access Initiative, September 12, 2012. Web. Retrieved June 16, 2013 from http://www.budapestopenaccessinitiative.org/boai-10-recommendations

ePrints. “OA Self-Archiving Policy. Canadian Institutes of Health Research (CIHR).” Web.

HLWIKI International. “Open access.” Web. Retrieved June 13, 2013 from http://hlwiki.slais.ubc.ca/index.php/Open_access_in_Canada

Howard, Jennifer. “Legislation to Bar Public-Access Requirement on Federal Research Is Dead.” Chronicle of Higher Education. 19 Jun. 2013. Web.

National Institutes of Health Public Access. “NIH Public Access Policy Details.” Web. Retrieved June 13, 2013 from http://publicaccess.nih.gov/policy.htm

Open Science Directory. “Open Science Directory.” Web. Retrieved June 13, 2013 from http://www.opensciencedirectory.net

Willinsky, John. “The unacknowledged convergence of open source, open access, and open science.” First Monday 10.8-1 (2005). Web.

open science

First, a brief statement about what I mean by science. The following definition is taken from the UK Science Council: “Science is the pursuit and application of knowledge and understanding of the natural and social world following a systematic methodology based on evidence.” The last part of this definition is especially important: “systematic methodology based on evidence.” Without it, science is impossible. To go a step further, without transparency about such methodology, a truly open science is impossible. Transparency is what allows scientific findings to be reviewed, challenged, retried, and confirmed or tossed out. Above all, open science requires that researchers be able to access, evaluate, and attempt to reproduce one another’s work.

Science has gone digital. Open science is not some maverick idea; it is becoming reality. About 35 percent of scientists already use tools such as blogs to consume and produce content (Bly, qtd. in Burke).

Just as open-source software makes computer code available for examination and reuse by anyone with sufficient interest and ability, open science advocates want to make research findings and the data behind them available for examination and reuse by anyone, for the sake of the public good: “The more data is made openly available in a useful manner, the greater the level of transparency and reproducibility and hence the more efficient the scientific process becomes, to the benefit of society” (Molloy).

Open science does not mean giving up proprietary claims to research findings. In past centuries, scientists were moved to publish their work – and therefore make it publicly available – in order, as David puts it, to “forestall prior claims to ideas by other scientists” (qtd. in Willinsky 48). He mentions Newton’s Principia and Darwin’s On the Origin of Species as important examples of this race to publish. Both works have brought incalculable benefits to humankind. Of course, simply publishing a work is not sufficient to make it what we would call “open” nowadays. However, even today, “open science is both communal and competitive, open to free exchanges and proprietary claims” (David, “Historical Origins”). Indeed, some open science practices allow researchers to establish primacy over their work more quickly than is possible in a traditional model.

Open science is not a single concept, and it varies substantially in practice. It includes aspects from several different “open” movements. Certainly, the broad category of open access is an essential part of open science and – because science is meaningless without data – so is open data. Less essential, but still important, are the notions of open research, which may include such practices as open peer review and open notebook science, and open platform.

References

Burke, Adrienne J. “From open-access journals to research-review blogs, networked knowledge has made science more accessible to more people around the globe than we could have imagined 20 years ago.” Seed Magazine. June 5, 2012. Web.

David, Paul A. “The historical origins of ‘open science’.” Working paper, Stanford University, Dept. of Economics, June 2007. Web.

Molloy, J. C. “The Open Knowledge Foundation: Open Data Means Better Science.” PLOS Biology Dec. 2011. Web. 24 Sept. 2012.

Science Council. “What is Science?” Web. Retrieved June 16, 2013 from http://www.sciencecouncil.org/definition

Willinsky, John. “The unacknowledged convergence of open source, open access, and open science.” First Monday 10.8-1 (2005). Web.

closed science

In its most extreme form, closed science is unpublished scientific research, available only to the researcher and a select few others: in previous centuries, this may have been a patron. Such unpublished science is ungenerous at best and dangerous at worst. It is ungenerous when it deprives other researchers and therefore society of the benefit of the latest findings; dangerous when it results in flawed policy decisions made without the benefit of full scrutiny or debate.

In most cases these days, however, closed science involves publishing proprietary, often corporately funded, research in pay-per-use journals. Some argue that private enterprise fosters innovation, taking scientific endeavors in new directions and leading to breakthroughs that would not happen without market forces. This may or may not be true. In any case, it can be argued that private funders can make their own decisions about where their research findings are published. Certainly, patent rights and intellectual property rights make this a murky area. Some open science advocates, such as Paul A. David, argue that harm is done by withholding access regardless of the funding agency:

High access charges imposed by holders of monopoly rights in intellectual property have overall consequences for the conduct of science that are particularly damaging to programs of exploratory research that are recognized to be vital for the long-term progress of knowledge-driven economies. (“Economic Logic of Open Science”)

Setting aside this question, closed science is clearly problematic when the results of publicly funded research are unavailable to that very same public except through pay-per-use journals. In this case, the public pays twice: first by spending tax dollars to fund the research in the form of a government grant or similar; and second by paying, usually through university library subscriptions, to access the results of that very same research. There have been various outcries against this practice in recent years; one example is a boycott started in 2012 against the publisher Elsevier; the boycott’s accompanying petition has been signed by more than 13,000 scientists to date.

Even when the results of scientific research are freely available, problems arise when insufficient details are provided regarding the research methodology or the findings themselves. Without transparency, other researchers are unable to properly assess the findings.

Neylon and Wu remind us, however, that “there will always be places where complete openness is not appropriate – for example, where personal patient records may be identifiable or where research is likely to lead to patentable results.” The Human Genome Project is one example of a scientific endeavour that raises the question of what should be public and what should be kept private.

Open science and closed science are not binary concepts. A single research project may be open in some ways but closed in others. This means that the compelling arguments in favour of data privacy, in some cases, need not sway us from the overall goals of open science.

References

Cost of Knowledge. “The Cost of Knowledge.” Web. Retrieved June 10, 2013.

David, Paul A. “The economic logic of open science and the balance between private property rights and the public domain in scientific data and information: a primer.” The role of scientific and technical data and information in the public domain: Proceedings of a symposium. Basic Books, 2003.

David, Paul A. “The historical origins of ‘open science’.” Working paper, Stanford University, Dept. of Economics, June 2007.

Eisenberg, Rebecca S., and Richard R. Nelson. “Public vs. proprietary science: a fruitful tension?” Academic Medicine 77.12, Part 2 (2002): 1392-1399. Web.

Flood, Alison. “Scientists sign petition to boycott academic publisher Elsevier.” Guardian. 2 Feb. 2012. Web.

Neylon, Cameron, and Shirley Wu. “Open Science: tools, approaches, and implications.” Pacific symposium on biocomputing. Vol. 14. 2009.

Can I be agile as an individual?

I work for a very large company, and we’re in the process of moving our various software development units to what is called agile development. Some units are there already. Others (like ours) are just starting to look into it. I don’t know when we’ll get there. But I recently went through some training and was pretty inspired.

What agile is

Here are the main values of agile development, as set out in the Agile Manifesto:

We are uncovering better ways of developing software by doing it and helping others do it. Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on the right, we value the items on the left more.

This means:

  • Simplicity: less planning, more lightweight processes
  • Rapid turnaround
  • Iterative development
  • Constant adjustments
  • More communication
  • More transparency

What agile is not

There are lots of techniques and tools associated with agile: pair programming, stand-up meetings, cross-functional teams, burn-down lists, and so forth. But these are just some ways to work toward the values in the Agile Manifesto — they are not themselves agile. They are neither necessary nor sufficient. In other words, just because I can’t do stand-up meetings or participate in a cross-functional agile team doesn’t mean there aren’t things I can do to become more agile-esque and therefore work more effectively.

How can I be agile if my team is not?

As a technical writer, I follow the development process of the team I work in. We currently work in an old-school “waterfall” environment (first requirements, then planning, then development, then testing, then bug fixing, then a beta release, then more bug fixing, then a commercial release), and I absolutely must adhere to that. I have certain things due at the various stages, and I depend on planning documents from other groups. I can’t ignore this structure even if I want to. In other words, for the most part, I must wait for my development unit to become agile before I can truly be so.

However, even though I am beholden to the milestones of the waterfall schedule that rules my project, there are ways that I can be more agile, even as an individual. Many of these are really simple.

The task board

The task board is the single most important information radiator that an agile team has. (Tom Perry)

A public, tangible (non-electronic) task board is a key part of many agile development environments, especially those using scrum as a project management framework. In its simplest form, the board shows task progress. Team members stand around the board every day to move sticky notes and evaluate progress.

Because the board is public and updated daily by those doing the actual work, it raises the project’s transparency and the team’s accountability, and it clearly shows when things are behind schedule. It is more likely to be an accurate representation of where things are than an MS Project file on a project manager’s computer that is updated weekly (or worse). In addition, it’s lightweight: tasks are described by a word or phrase that fits nicely on a tiny sticky note (some say 3M invented agile); no heavy-duty specifications are needed.

We all have to-do lists. Some are in a notebook, others are online. A task board is really nothing more than a public, tangible, up-to-date to-do list. And it’s clearly something that an individual can easily adopt.
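For what it’s worth, here is a minimal sketch of that idea in code. It is my own illustration, not something prescribed by scrum or any agile method; the column names and tasks are made up.

    # A personal task board: three columns and a "move the sticky note" operation.
    board = {
        "to do": ["draft install guide", "review API names", "update screenshots"],
        "doing": [],
        "done": [],
    }

    def move(task, src, dst):
        """Move a task from one column to another."""
        board[src].remove(task)
        board[dst].append(task)

    move("draft install guide", "to do", "doing")
    move("draft install guide", "doing", "done")
    print(board)  # one task done, two still waiting

Even a plain text file or a spreadsheet laid out the same way would do the job; what matters is that the columns stay visible and current.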

The Secret Geek has called the task board “The most productive and least ignorable system I’ve ever used,” preferring it to tasks managed by software: “Placing an anti-procrastination tool on the internet is like hosting an alcoholics anonymous meeting inside a brewery.”

Better communication

I admit that I prefer email to meetings or phone calls. I work on a team that is largely distributed, and nine times out of ten I will send an email rather than pick up the phone. But agile emphasizes individuals and interactions; one of its key principles is: “The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.” Conversation lets people work in the same direction and respond quickly to change.

Face-to-face isn’t often possible for me, given the geography, but picking up the phone is something that I can and will do.

“Responding to change over following a plan”

This is my favorite item from the Agile Manifesto. I have spent far too much time planning and replanning my work. And despite all this careful effort, again and again I would find that my estimates were off, I hadn’t anticipated or mitigated the right risks, and I didn’t always end up in the same place as the software developers (the design had changed since the specs were written!). For this reason, it was good to hear our agile trainer say that perfect long-term planning is impossible in the “cone of uncertainty” we inevitably have at the beginning of any software development project.

This means a few things to me:

  • Plan less (if I can get away with it!), and focus on the short-term
  • Beat myself up less when I “get it wrong”
  • Fail early, communicate, and readjust

Timeboxing

Many agile projects fix deadlines firmly, but allow adjustments to the scope. This focuses everyone on the most important deliverables.

As an individual, I can use timeboxing to limit effort on my own work so that I don’t have things hanging out on my to-do list forever. This will help me overcome three of my big weaknesses: procrastination, perfectionism, and over-committing to things.

Reduce work-in-progress

I’m pretty bad for having a lot of stuff hanging out half-done for a long time. This isn’t ideal because humans (even girl humans) aren’t so good at multi-tasking. It makes it more likely that details will be missed. And only things done by the end of a cycle can be considered “done” — work-in-progress doesn’t help us if we’re not finished when the deadline hits.

Kanban (a lean approach related to agile) recommends that the number of “work-in-progress” tasks be limited to two. So if there are already two stickies in the middle column of my task board and I want to start on another, I have to finish one first. This sounds good to me.
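In the same spirit as the task-board sketch above, the rule is easy to express in code. Again, this is just my own hypothetical illustration, with the limit of two taken from the recommendation mentioned here.

    # Hypothetical sketch: refuse to start a new task while two are in progress.
    WIP_LIMIT = 2

    def start_task(board, task):
        if len(board["doing"]) >= WIP_LIMIT:
            raise RuntimeError("WIP limit reached: finish something first")
        board["to do"].remove(task)
        board["doing"].append(task)

    board = {"to do": ["task A", "task B", "task C"], "doing": [], "done": []}
    start_task(board, "task A")
    start_task(board, "task B")
    # start_task(board, "task C")  # would raise: two tasks are already in progress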

don’t send me away, librarian! on second thought, please do

The university library collection is divided among several buildings on campus. When seeking help with my question, I made what initially seemed to be a false start: I took my question to the wrong building. The librarian’s response was: “What subject area is this?” This stumped me: if the librarian, with her knowledge of the library, didn’t know the answer to this question, how could I? Users have difficulty mapping their questions onto the internal structures of an institution. Unfortunately, these internal—often bureaucratic, usually opaque—structures may inform the structure of user-facing resources such as an external website or even a reference desk. When I hesitated, the librarian told me that the collection in her building was limited and that I should instead visit the central location.

I was disappointed to be sent to a different building. Just as the library website provides a seamless front-end to all the library resources within the various physical buildings and online databases, and just as any computer terminal allows me to search the entire collection from a central page without knowing how things are organized on the back end, I felt that any librarian at any reference desk in any building on campus should have been able to assist me. I felt that the information desk should not require me to know the contents of the buildings before asking for help.

Yet I adjusted my opinion after a recent positive experience at the public library. This interaction showed me that a librarian who really knows the contents of a collection is able to provide more than just excellent search skills. And so the first librarian I spoke to at the university—the one at the smaller, more specialized location—might not have had the collection knowledge to answer my question effectively. Instead, she sent me to someone who did.

[Done as part of assignment for LIBR 503 at the UBC iSchool]