Wednesday, April 01, 2009

Plenary presentation summary: Journal Spend, use and research outcomes: A UK perspective on Value for Money. Presented by: Ian Rowlands, CIBER

During the second plenary session on Tuesday at UKSG, Mr. Rowlands presented some preliminary data from part of a research project funded by the Research Information Network. He is halfway through the project and will be continuing into next year. There are some very interesting visualization tools to explore the data online.

There has been unprecedented growth in access to journal material over the past decade as content has moved from print to electronic. However, it is critical to assess the impact that this increase in access and availability of content has had over that decade. Has it led to higher productivity and more innovative research?

In exploring the research outcomes, Rowlands is looking at many quantifiable criteria, including the number of COUNTER downloads, the number of PhDs, the number of grants, institutional spending patterns, and deep log analysis in a variety of disciplines.

It should come as little surprise to the community that the transition from print to electronic publication is nearly complete: 96.1% of science journals and 88.5% of arts and humanities journals are online. In 2007, the academic community spent £80 million on e-journal licenses. Collectively those purchases have yielded more than 102 million downloads, or about £0.80 per download.
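
The per-download figure can be checked with a few lines of Python. This is only a sketch using the rounded numbers quoted above (£80 million spend, 102 million downloads); the function name is made up for illustration. The rounded inputs give roughly £0.78, close to the £0.80 quoted in the talk.

```python
def cost_per_download(total_spend_gbp: float, downloads: int) -> float:
    """Return the average spend per download in pounds."""
    if downloads <= 0:
        raise ValueError("downloads must be positive")
    return total_spend_gbp / downloads


# Rounded figures from the summary; the talk's exact inputs may differ.
spend_2007 = 80_000_000
downloads_2007 = 102_000_000

print(f"£{cost_per_download(spend_2007, downloads_2007):.2f} per download")
```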

There has been tremendous end-user take up of these resources. The number of downloads doubled from 2004 to 2007. This represents a 21.7% per annum growth in downloads over that period. The core proposition of providing online articles is “very popular” among researchers.
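
For readers who want to relate a per-annum growth rate to an overall multiple, here is a small sketch. It assumes three annual compounding steps between 2004 and 2007, which is my assumption and not something stated in the talk. On that assumption, 21.7% a year compounds to about 1.8 times the starting level, while an exact doubling would need about 26% a year, so "doubled" is probably best read as an approximation.

```python
def growth_multiple(annual_rate: float, years: int) -> float:
    """Overall multiple after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def annual_growth_rate(multiple: float, years: int) -> float:
    """Constant annual rate that produces the given overall multiple."""
    return multiple ** (1.0 / years) - 1.0


# Assumption: three compounding steps (2004 -> 2005 -> 2006 -> 2007).
print(f"{growth_multiple(0.217, 3):.2f}x")    # 1.80x
print(f"{annual_growth_rate(2.0, 3):.1%}")    # 26.0%
```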

There has also been a rapid increase in the number of journals available at an average institution. The average number of titles per researcher is up from just above 4 to just below 8. {TC – Given the present economic environment it is likely these figures will decrease in the coming year, but they will certainly remain at a higher average level.}

Citation analysis is showing that users are drawing on more sources and including more references per paper. The use of navigation and discovery tools, together with increased access, has created a situation where research is now more deeply founded in previous work.

University administrations are looking for clear and compelling justifications for the continued expense of information purchases and Mr. Rowlands thinks that compelling information is now available.

This change in availability has affected the information-seeking behavior of end-users. It is not surprising that Google is the “librarian's friend”. Many researchers are using gateways, such as Google and PubMed, to get access to content. Examples of the increase in traffic abound. One OUP journal saw a two-fold increase in use as an effect of opening up its content to Google.

The access provided by online content is also having a profound impact on resource use. The convenience of 24x7 access is tremendous. 17% of activity is taking place on weekends, and the “working day is growing”, with one third of activity taking place outside the “normal office hours” of 9:00am – 5:00pm. This access was more difficult in a print-based world.

However, questions remain about whether efficient search is the same as, or necessarily yields, successful research. There is a strong negative correlation between the research rating of the scientists in an institution and the average session length on ScienceDirect. The most “successful researchers” were the group spending the least amount of time online with content. Trends suggested that the most successful researchers use gateways. Much more search activity is taking place outside the library, typically on services like PubMed, Google, and Google Scholar.

There was natural clustering of intensive use, and the differences between moderate, high and super users correlated significantly with outputs such as the number of papers produced, the amount of grant funding received and the number of PhDs the institution produces. In addition, while the average cost per download is consistent across institutions, the more active the institution, the less it paid per article.

Mr. Rowlands stressed that these data merely show associations, not causation. Nor do the data show any directionality. Is it that a lot of research creates demand for lots of information, or is it that research institutions put the resources in place for research, which therefore impacts results?

The next stage of this research will look at historical information. Among the topics to be explored are the linkages between products, spending and outcomes. He is working to produce a computer model that shows, for example, what effect an increase in the number of titles and/or downloads might have on research outcomes.

The initial information suggests that downloads and research outputs are like “gears on a bicycle” that move in tandem. As one gear gets bigger, the other gear turns faster. Although the causality question still needs to be understood, knowing that the connection exists is a useful addition to knowledge about assessment and performance measurement.

{NB Disclaimer: Much of this summary is verbatim and/or paraphrased from Mr. Rowlands' talk – very little in this post is my interpretation, and it should not be credited to me. Apologies to Mr. Rowlands for any errors.}


Tuesday, March 31, 2009

Moving to e-only from a library perspective

I attended this breakout session yesterday and was reminded how useful it is to come to UKSG each year. I was drawn to the title of this session and in particular the '...from a library perspective' bit. That, it seems to me, is the best thing about coming to UKSG. You get to hear the view from librarians and, working in publishing as I do, find this insight invaluable.


Sarah Pearson from the University of Birmingham gave a great overview of the many challenges facing her library (and many others too, I'm sure) as they move further towards an e-only model. I hope she doesn't mind me giving the ending away, but as Sarah herself admitted, this e-only model is not likely to arrive at her university library any day soon. Instead a hybrid of print and electronic content exists.


She outlined the collection development principles currently in place saying that web-based resources are the preferred medium and how important it is to have a flexible budget that responds to changes in course content and research directions. She also said it was key to negotiate great value for money.


The UofB has 24,000 free and subscribed e-journals plus 1000 e-resources (340 subscribed) and 4000 e-books. Sarah highlighted the many benefits of opening up greater access to their collection and offering e-access to library users, including distance learners. E-delivery adds value through features such as alerting services, citation links and discussion forums, which are of course not available in print.

Sarah outlined the benefits, such as opening up a much bigger collection to users for a lower fee, and the drawbacks of big deals, such as taking some journals that may not be used extensively and having less control over collection development. She summed up with some useful learning points to take away:
  • don't expect to go completely e-only
  • usage is an important tool but don't forget about feedback
  • big deals have benefits but there are trade-offs
  • negotiate, negotiate, negotiate.

All in all I found the session useful and it certainly gave me a library perspective on the potential for moving to an e-only model and the pitfalls that entails.


Tuesday, April 17, 2007

Caution: statistics operating in this area

Jason Price, Claremont Colleges' Life Science Librarian and today's Duke of Hazards, took us on an entertaining journey through the potential pitfalls of over-reliance on journal usage data. He opened by warning us of some general hazards to beware:
  • narrow definition of use
    • COUNTER JR 1 (full text article requests) is only one dimension of use; others might be:
      • A-Z list click-throughs
      • citations from your faculty
      • impact factor
      • which journals your faculty publish in, and how much
      • surveying faculty/researchers
      • PageRank-type metrics
  • vagaries of user behaviour
    • did the user actually get any value out of something to which they clicked through?
    • Google Accelerator preloads links from pages users visit
  • different dissemination styles of teaching
    • does the lecturer download and circulate (or post internally) the PDF, or circulate a link to it?
  • granularity of usage reports
    • if report is at title-level, there's no indication of whether accesses are e.g. to purchased frontfile or free backfile
Price then moved on to some more specific hazards that may be encountered:
  • determining cost per use
    • take an annual COUNTER report - divide the package fee by the article views. But what package fee? The *stated* annual fee, or the actual cost, factoring in the additional lock-in fee?
  • comparing to ILL cost
    • the views measured in an online environment cannot be directly correlated to what would otherwise be ordered via ILL
  • comparing across publishers
    • different interfaces can affect number of article deliveries, for example if linking to the article immediately renders its HTML version - so a user then choosing to access the PDF as well could count as two full text downloads
      • Price notes that the COUNTER code of practice does require providers to de-dupe statistics to provide a "unique article requests" figure
    • exposure in Google Scholar can also skew usage
    • (at this point, Price's laptop popped up a helpful flag to let us all know that it is hazy but warm in California this afternoon, which we all enjoyed)
  • ignoring by-title data
    • Price showed a classic long-tail curve with low use titles having very high cost per use - these would be the ones to be excluded from/replaced in future packages
  • lack of benchmarks
    • your concerns about the price per use you're paying could be amplified or assuaged by finding out what other institutions' cost per use is for the same publisher
    • it's better to evaluate both cross-institution and cross-publisher to get a more general picture for comparison
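
The cost-per-use hazards above can be illustrated with a short Python sketch. Everything here is hypothetical: the journal names, fees, usage counts and function names are invented for illustration and do not come from Price's slides. It shows two of his points: the package-level figure depends on which fee you divide by, and by-title figures expose a long tail of low-use, high-cost titles.

```python
from dataclasses import dataclass


@dataclass
class TitleUsage:
    title: str
    annual_cost: float        # hypothetical per-title share of the fee
    full_text_requests: int   # e.g. an annual COUNTER JR1 total


def cost_per_use(cost: float, uses: int) -> float:
    """Cost divided by uses; unused titles get an infinite cost per use."""
    return cost / uses if uses > 0 else float("inf")


def rank_by_cost_per_use(titles: list[TitleUsage]) -> list[TitleUsage]:
    """Most expensive per use first: candidates for review or replacement."""
    return sorted(
        titles,
        key=lambda t: cost_per_use(t.annual_cost, t.full_text_requests),
        reverse=True,
    )


# Invented sample data.
titles = [
    TitleUsage("Journal A", 1000.0, 2000),
    TitleUsage("Journal B", 1000.0, 50),
    TitleUsage("Journal C", 500.0, 0),
]
total_uses = sum(t.full_text_requests for t in titles)

# The same usage gives different answers depending on the fee chosen.
stated = cost_per_use(2500.0, total_uses)          # stated annual fee only
actual = cost_per_use(2500.0 + 500.0, total_uses)  # plus an assumed lock-in fee
ranked = rank_by_cost_per_use(titles)

print(f"stated: {stated:.2f}, actual: {actual:.2f}")
for t in ranked:
    print(t.title, f"{cost_per_use(t.annual_cost, t.full_text_requests):.2f}")
```

Note that this sketch takes the usage counts at face value. The interface and Google Scholar effects Price described would need to be accounted for before comparing such figures across publishers.
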
Price summarised his recommendations to close before ceding the floor to COUNTER's Peter Shepherd, who opened with an overview of the COUNTER codes of practice, their recommendations and processes, and the current level of compliance within the industry. He outlined some projects which have been undertaken using COUNTER-compliant statistics, for example the NESLi2 analysis, which was able to output cross-publisher comparisons of per-article download costs and growth in full-text article downloads.

Shepherd then gave an overview of global metrics for evaluating e-journals, including the impact factor, and the potential for a new Usage Factor. Preliminary conclusions of UKSG's recent research in this area indicate that there is considerable support for it from the author, librarian and publisher communities. The UKSG project outcomes also suggest that the COUNTER codes of practice may not be adequately robust, and that there remains frustration at the lack of comparable, quantitative data - particularly given that continuing print usage is not included.

Shepherd proceeded to identify some of the current issues faced by COUNTER, including:
  • interface effects on usage statistics as referenced by Price earlier
    • filter project concluded that it's not best practice to render the HTML regardless of the user's potential choice
    • usage statistics are inflated by no more than 30% as a result of multiple formats
  • separate reporting for archives
    • can currently be requested as a supplementary/sub-report
  • usage of content within institutional repositories
  • involvement with the SUSHI protocol to automate retrieval of usage data from providers
And he finished with some future challenges:
  • continued evolution of codes of practice - perhaps with respect to federated searching, "pre-fetching" (Google Accelerator), usability, additional data, new categories of content
  • deriving metrics from the codes of practice e.g. cost per use, Usage Factor for journals
