Tuesday, April 17, 2007

Caution: statistics operating in this area

Jason Price, Claremont Colleges' Life Science Librarian and today's Duke of Hazards, took us on an entertaining journey through the potential pitfalls of over-reliance on journal usage data. He opened by warning us of some general hazards to beware:
  • narrow definition of use
    • COUNTER JR 1 (full text article requests) is only one dimension of use; others might be:
      • A-Z list click-throughs
      • citations from your faculty
      • impact factor
      • which journals your faculty publish in, and how much
      • surveys of faculty/researchers
      • PageRank-type measures
  • vagaries of user behaviour
    • did the user actually get any value out of something to which they clicked through?
    • Google Web Accelerator preloads links from pages users visit, so prefetched pages can be counted as use even if the user never opens them
  • different dissemination styles of teaching
    • does the lecturer download and circulate (or post internally) the PDF, or circulate a link to it?
  • granularity of usage reports
    • if a report is at title level, there's no indication of whether accesses are e.g. to purchased frontfile or free backfile
Price then moved on to some more specific hazards that may be encountered:
  • determining cost per use
    • take an annual COUNTER report - divide the package fee by the article views. But what package fee? The *stated* annual fee, or the actual cost, factoring in the additional lock-in fee?
  • comparing to ILL cost
    • the views measured in an online environment cannot be directly correlated with what would otherwise be ordered via ILL
  • comparing across publishers
    • different interfaces can affect number of article deliveries, for example if linking to the article immediately renders its HTML version - so a user then choosing to access the PDF as well could count as two full text downloads
      • Price noted that the COUNTER code of practice does require providers to de-dupe statistics to provide a "unique article requests" figure
    • exposure in Google Scholar can also skew usage
    • (at this point, Price's laptop popped up a helpful flag to let us all know that it is hazy but warm in California this afternoon, which we all enjoyed)
  • ignoring by-title data
    • Price showed a classic long-tail curve with low use titles having very high cost per use - these would be the ones to be excluded from/replaced in future packages
  • lack of benchmarks
    • your concerns about the cost per use you're paying could be amplified or assuaged by finding out what other institutions' cost per use is for the same publisher
    • it's better to evaluate both cross-institution and cross-publisher to get a more general picture for comparison
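The cost-per-use arithmetic behind several of these hazards can be sketched in a few lines of Python. This is only an illustration, not anything Price presented: the titles, fees and request counts are invented, and `allocated_cost` assumes the package fee can be split across titles, which real package deals may not allow. It shows how the answer shifts between the stated and the actual fee, and how low-use titles surface at the expensive end of the long tail.

```python
from dataclasses import dataclass


@dataclass
class TitleUsage:
    title: str
    allocated_cost: float    # hypothetical share of the stated package fee
    full_text_requests: int  # annual full-text article requests (JR1-style total)


def cost_per_use(cost: float, requests: int) -> float:
    """Cost divided by requests; infinite when nothing was used."""
    return cost / requests if requests > 0 else float("inf")


def rank_by_cost_per_use(titles: list[TitleUsage]) -> list[TitleUsage]:
    """Most expensive per use first, i.e. the far end of the long tail."""
    return sorted(
        titles,
        key=lambda t: cost_per_use(t.allocated_cost, t.full_text_requests),
        reverse=True,
    )


# All figures below are invented for illustration.
package = [
    TitleUsage("Journal A", 20_000.0, 12_000),
    TitleUsage("Journal B", 20_000.0, 2_900),
    TitleUsage("Journal C", 10_000.0, 100),
]
lock_in_fee = 10_000.0  # hypothetical additional fee on top of the stated price

stated_fee = sum(t.allocated_cost for t in package)
total_requests = sum(t.full_text_requests for t in package)

cpu_stated = cost_per_use(stated_fee, total_requests)
cpu_actual = cost_per_use(stated_fee + lock_in_fee, total_requests)
ranked = rank_by_cost_per_use(package)

print(f"stated: {cpu_stated:.2f}  actual: {cpu_actual:.2f}")
for t in ranked:
    print(t.title, round(cost_per_use(t.allocated_cost, t.full_text_requests), 2))
```

With these made-up numbers the package-level figure looks reasonable either way, while the per-title ranking puts Journal C far out on its own, which is exactly the detail that a package-level average hides.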
Price closed by summarising his recommendations before ceding the floor to COUNTER's Peter Shepherd, who opened with an overview of the COUNTER codes of practice, their recommendations and processes, and the current level of compliance within the industry. He outlined some projects which have been undertaken using COUNTER-compliant statistics, for example the NESLi2 analysis, which was able to output cross-publisher comparisons of per-article download costs and growth in full-text article downloads.

Shepherd then gave an overview of global metrics for evaluating e-journals, including the impact factor, and the potential for a new Usage Factor. Preliminary conclusions of UKSG's recent research in this area indicate that there is considerable support for the idea from author, librarian and publisher communities. The UKSG project outcomes also suggest that the COUNTER codes of practice may not be adequately robust, and that there remains frustration at the lack of comparable, quantitative data - particularly given that continuing print usage is not included.

Shepherd proceeded to identify some of the current issues faced by COUNTER, including:
  • interface effects on usage statistics as referenced by Price earlier
    • filter project concluded that it's not best practice to render the HTML regardless of the user's potential choice
    • usage statistics are inflated by no more than 30% as a result of multiple formats
  • separate reporting for archives
    • can currently be requested as a supplementary/sub-report
  • usage of content within institutional repositories
  • involvement with SUSHI protocol to automate retrieval of usage data from providers
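To make the interface effect concrete, here is a rough Python sketch of how multiple formats can inflate a raw count. It is not COUNTER's filtering rule: it simply assumes, for illustration, that an HTML and a PDF request for the same article in the same session count as one use, and the `Request` record and the sample log are invented.

```python
from typing import Iterable, NamedTuple


class Request(NamedTuple):
    session_id: str
    article_id: str
    fmt: str  # "HTML" or "PDF"


def count_requests(log: Iterable[Request]) -> tuple[int, int]:
    """Return (raw full-text requests, unique article requests per session)."""
    entries = list(log)
    unique = {(r.session_id, r.article_id) for r in entries}
    return len(entries), len(unique)


def inflation(raw: int, unique: int) -> float:
    """Fraction by which the raw count exceeds the de-duplicated count."""
    return (raw - unique) / unique if unique else 0.0


# Invented log: session s1 lands on the HTML view, then opens the PDF too.
log = [
    Request("s1", "a1", "HTML"),
    Request("s1", "a1", "PDF"),
    Request("s2", "a1", "PDF"),
    Request("s2", "a2", "HTML"),
    Request("s3", "a3", "HTML"),
]

raw, unique = count_requests(log)
print(raw, unique, f"{inflation(raw, unique):.0%}")
```

Here one double-counted article out of four unique requests gives 25% inflation, which is the kind of figure to compare against the ceiling Shepherd quoted.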
And he finished with some future challenges:
  • continued evolution of codes of practice - perhaps with respect to federated searching, "pre-fetching" (Google Accelerator), usability, additional data, new categories of content
  • deriving metrics from the codes of practice e.g. cost per use, Usage Factor for journals
