LiveSerials: An introduction to ORCID

ORCID (Open Researcher & Contributor ID) started out as a CrossRef initiative that then flew the nest, with the support of Nature and Thomson. It now has stakeholders including funders, researchers and librarians. Geoffrey Bilder, our speaker today, has been seconded from his day job at CrossRef to be the technical director at ORCID.

The general problem: identity is cheap

The problem at the heart of ORCID's being is that, on the internet, identity is "cheap" - it's easy to create multiple different profiles in silos on different sites, leaving every site with a fragmented view of you.

The problem in scholarly communications

The scholarly record is built on understanding the provenance and 'network status' of content. Publisher brands are based on the 'provenance infrastructure' (credentials of author, editorial rigour, peer review, citations). Both CrossCheck (another CrossRef initiative) and ORCID are key to the credibility of the author, although note that ORCID is not just about authors: its name refers to "contributor" identifiers to acknowledge all the other roles. One person (one ID) can contribute in lots of different ways (author, reviewer, programmer, compiler) and can have relationships to other IDs (edited by, co-author, colleague etc).
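
To make the "one person, one ID, many roles" idea concrete, here is a minimal sketch in Python. It is purely illustrative and not ORCID's data model: the identifier strings, class names and the `roles_for` helper are all assumptions invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contribution:
    """One contributor ID acting in one role on one work (hypothetical model)."""
    contributor_id: str  # made-up identifier string, not a real ORCID format
    work: str
    role: str            # e.g. "author", "reviewer", "programmer", "compiler"


@dataclass(frozen=True)
class Relationship:
    """A typed link between two contributor IDs (hypothetical model)."""
    source_id: str
    kind: str            # e.g. "co-author", "edited by", "colleague"
    target_id: str


def roles_for(contributions, contributor_id):
    """Return the distinct roles a single ID has played, sorted by name."""
    return sorted({c.role for c in contributions
                   if c.contributor_id == contributor_id})


contributions = [
    Contribution("id-0001", "Paper A", "author"),
    Contribution("id-0001", "Paper B", "reviewer"),
    Contribution("id-0001", "Dataset C", "compiler"),
    Contribution("id-0002", "Paper A", "author"),
]
relationships = [Relationship("id-0001", "co-author", "id-0002")]

print(roles_for(contributions, "id-0001"))
```

The point of the sketch is only that roles and relationships hang off the identifier rather than off a name string, so the same person can appear in several capacities without being duplicated.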

The knowledge discovery problem: name ambiguity

ORCID is about knowledge discovery, rather than access control or security - about people publicising their work, but ensuring it is credited accurately. The main issue is name ambiguity: name variations, name "collision" (multiple people with the same name, e.g. the other Geoff Bilder, a Canadian para-ski-glider), name changes, name translations, corporate authors... These are all complex problems that must be resolved for accurate crediting within scholarly literature. ORCID's mission is to solve this problem through collaboration; various systems already exist - economists use RePEc's author claims service, some countries have national databases of researchers - but regional / disciplinary / institutional silos are unhelpful in our networked age. Aspects of identity can be claimed by individuals or asserted on their behalf by institutions; ORCID recognised that it needed to bring both organisational and personal assertions together to seed its system, as neither level by itself would ensure sufficient uptake to make the service useful.

Principles and progress

ORCID's ten guiding principles (http://www.orcid.org/principles) demonstrate the organisation's non-partisan, international, open approach. The board is made up of "anyone who can commit the time and wants to participate". So what have they done so far?

Thomson donated the codebase for its ResearcherID service to help jumpstart ORCID
Various functions were added to this for ORCID's alpha prototype - Thomson's system was based on personal "claims", so the organisational layer had to be added
They are now working out the last details for licensing the codebase to build a phase I version of the system
They are also planning for future sustainability (funding / staff)
They hope to have something that people can use next year

Questions:

Q: Authors are allowed to create profiles - how can IDs remain unique?
A: Authors cannot change the identifier, only the information associated with it.
Q: The contributor ID could become increasingly complex - how do we define where 'contribution' begins and ends?
A: We will studiously avoid defining that - it's going to evolve. But the answer is essentially that people will record what they think is important, and if it's not important, it won't be counted for much. [Given that people will have to take the time to enter this data, they will likely only claim credit for things that are useful / important]
Q: How will this fit with the requirements of REF?
A: It's not clear where REF responsibilities will sit but hopefully ORCID will make the process of gathering information easier.
Q: Pseudonymity?
A: A lot of this information is public already, but in aggregation it's more powerful. What if it becomes too easy to find details about stem cell researchers in Alabama or animal science researchers in Oxford? People do have good reasons to want to hide information - even if it is just wanting to be credited for peer reviewing without that being public. ORCID will allow any or all information except the identifier itself to be hidden.
Q: What is happening with the development of IDs in different countries?
A: It would be a bad idea to think "ORCID's coming, let's stop working on our system". Other systems will continue to exist and be important. At minimum, ORCID will be able to include information about other relevant identifiers.
Q: What work will be involved for publishers?
A: A classic example: a researcher submitting a manuscript currently fills in all the information each time, and that information quickly becomes stale (e.g. contact data). In future, they will supply their ORCID, and publishers can query and recheck information as necessary.
Q: Who will be the arbiter of who will be attached to a work as a contributor?
A: For example, the corresponding author will have more credibility in saying who else contributed.
Q: Disambiguation of affiliations?
A: We may integrate with e.g. Ringgold to create a controlled vocabulary for organisations.
Q: What are the data protection issues?
A: We are transparent about what is being revealed, to whom, and we give authors control - they can make anything except the identifier private.
Q: What's the long term funding plan?
A: Exactly. The technology doesn't matter if we can't sustain an organisation to keep it running. We are looking at future models, from related service provision to membership.
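
Two of the answers above describe rules that are easy to sketch in code: the identifier itself can never be changed, and everything attached to it can be made private by its owner. The Python below is a rough illustration under those two assumptions only; the `Profile` class, its method names and the example identifier are hypothetical and do not describe ORCID's actual system.

```python
class Profile:
    """Hypothetical profile: fixed identifier, per-field visibility."""

    def __init__(self, identifier):
        self._identifier = identifier
        self._fields = {}  # field name -> (value, is_public)

    @property
    def identifier(self):
        # Read-only: there is deliberately no setter, so assigning to
        # profile.identifier raises AttributeError.
        return self._identifier

    def set_field(self, name, value, public=True):
        """Add or update information associated with the identifier."""
        if name == "identifier":
            raise ValueError("the identifier cannot be changed or hidden")
        self._fields[name] = (value, public)

    def public_view(self):
        """Return what anyone may see: the identifier plus public fields."""
        view = {"identifier": self._identifier}
        view.update({name: value
                     for name, (value, is_public) in self._fields.items()
                     if is_public})
        return view


profile = Profile("id-0001")  # made-up identifier for illustration
profile.set_field("name", "A. Researcher")
profile.set_field("peer_reviews", 12, public=False)  # credited, but private

print(profile.public_view())
```

A publisher-style lookup would only ever see `public_view()`, which always includes the identifier but omits anything the owner has marked private.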

Labels: Author, Identification, orcid, uksg

Tuesday, April 05, 2011