Monday, April 12, 2010

JUSP the Thing

Attending a session with the word JISC in the title is probably cheating a bit for me, but the JISC Usage Statistics Portal is something I have wanted to learn more about for a while as we think about making better use of authentication statistics for the UK federation.

Ross MacIntyre has been thinking about usage statistics since 2000. The same questions are still being asked: 'what is usage?', 'what do you want information on?', 'what is your holy grail?' - although the underlying standards have moved forward significantly with the development of COUNTER and SUSHI.

The Usage Statistics Portal is looking to provide aggregated usage statistics for Nesli2 journal deals - representing over 6000 journal titles. It is currently very tedious for librarians to gather these statistics from the various providers involved in Nesli2, despite good coverage of COUNTER provision. The Portal aims to solve this problem.

The Portal aims to provide the following to libraries:

  • Single point of access to own usage statistics;
  • Monthly figures presented in both academic and calendar years;
  • Addition where relevant of gateway / aggregator statistics;
  • Usage of current collections with backfiles removed;
  • Assistance with SCONUL statistical return;
  • Trend analysis, high usage titles, publisher summaries etc.
An important factor to note is that a '0' usage title is not regarded by participants in the pilot as a 'worthless' title, but one where the potential of the resource has not been realised. This may demonstrate a requirement for more internal communication with departments, for example.
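
The "monthly figures presented in both academic and calendar years" item above comes down to grouping the same monthly counts under two different year labels. Here is a minimal sketch of that idea. It is not the Portal's code: the sample figures are made up, and the August start of the academic year is my assumption.

```python
from collections import defaultdict

# Hypothetical monthly download counts for one title, keyed by (year, month).
monthly = {
    (2008, 9): 120,
    (2008, 12): 80,
    (2009, 1): 95,
    (2009, 7): 60,
    (2009, 8): 40,
}

# Assumption: the academic year runs from August to July.
ACADEMIC_YEAR_START_MONTH = 8


def calendar_year(year, month):
    """Label a month with its calendar year, e.g. '2009'."""
    return str(year)


def academic_year(year, month):
    """Label a month with its academic year, e.g. '2008/09'."""
    start = year if month >= ACADEMIC_YEAR_START_MONTH else year - 1
    return f"{start}/{str(start + 1)[-2:]}"


def totals(monthly_counts, year_label):
    """Sum monthly counts under whichever year label is supplied."""
    out = defaultdict(int)
    for (year, month), count in monthly_counts.items():
        out[year_label(year, month)] += count
    return dict(out)


calendar_totals = totals(monthly, calendar_year)
academic_totals = totals(monthly, academic_year)
```

The same five months give different yearly totals depending on the label used, which is why having both views side by side is useful when filling in returns that follow one convention or the other.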

The Portal also allows institutions to benchmark journal usage against other institutional responses. A big question around this is the issue of whether this data should be anonymised and whether publishers would be unhappy about non-anonymised data being shown. This will be dictated by confidentiality clauses in current licences.

Some of the outstanding issues to be dealt with beyond the pilot are the need for widespread adoption of COUNTER R3 compliance, the need for machine readable sources for publisher price lists, and the need for better subject categorisation of journals.

Finally, the code is being made available open source so it is hoped that this will be available to lots of other institutions and consortia - something I am sure many of my federation colleagues in other countries will be interested in!


Wednesday, April 01, 2009

Plenary presentation summary: Journal Spend, use and research outcomes: A UK perspective on Value for Money. Presented by: Ian Rowlands, CIBER

During the second plenary session on Tuesday at UKSG, Mr. Rowlands presented some preliminary data from part of a Research Information Network funded research project. He is halfway through the project and will be continuing into next year. There are some very interesting visualization tools to explore the data online.

There has been an unprecedented growth in access to journal material over the past decade as content has moved from print to electronic. However, it is critical to assess the impact that the increase in access and availability of content has had over the past decade. Has this increase in access led to higher productivity and more innovative research?

In exploring the research outcomes, Rowlands is looking at many quantifiable criteria, including: number of COUNTER downloads, number of PhDs, number of grants, institutional spending patterns, and deep log analysis in a variety of disciplines.

It should come as little surprise to the community that the transition from print to electronic publication is nearly complete. 96.1% of science journals are online and 88.5% of arts and humanities journals are online. In 2007, the academic community spent £80 million on e-journal licenses. Collectively those purchases have yielded more than 102 million downloads, or £0.80 per download.

There has been tremendous end-user take up of these resources. The number of downloads doubled from 2004 to 2007. This represents a 21.7% per annum growth in downloads over that period. The core proposition of providing online articles is “very popular” among researchers.
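
To see how "doubled" and "21.7% per annum" fit together, the compounding can be worked through directly. This is only a sketch of the arithmetic. Whether 2004 to 2007 should be counted as three or four annual steps is my assumption, not something stated in the talk.

```python
def growth_ratio(rate, years):
    """Total growth factor after compounding `rate` for `years` annual steps."""
    return (1 + rate) ** years


def annual_growth_rate(ratio, years):
    """Per-annum rate that compounds to `ratio` over `years` annual steps."""
    return ratio ** (1 / years) - 1


# 21.7% a year compounded over three steps (2004 -> 2007) and over four.
three_step_ratio = growth_ratio(0.217, 3)
four_step_ratio = growth_ratio(0.217, 4)

# The annual rate an exact doubling over three steps would need.
rate_for_doubling = annual_growth_rate(2, 3)
```

Under these assumptions 21.7% a year gives roughly 1.8 times the starting figure over three steps and roughly 2.2 times over four, so "doubled" reads as a round figure rather than an exact one.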

There has also been a rapid increase in the number of journals available at an average institution. The average number of titles per researcher is up from just above 4 to just below 8. {TC – Given the present economic environment it is likely these figures will decrease in the coming year, but it certainly will remain at a higher average level.}

Citation analysis is showing that users are drawing on more sources, and including more references per paper. The use of navigation and discovery tools, together with increased access, has created a situation where research is now more deeply founded in previous work.

University administrations are looking for clear and compelling justifications for the continued expense of information purchases and Mr. Rowlands thinks that compelling information is now available.

This change of availability has impacted the information seeking behavior of end-users. It is not surprising that Google is the “librarian's friend”. Many researchers are using gateways, such as Google, PubMed, etc. to get access to content. Examples of the increase of traffic abound. One OUP journal saw a two-fold increase in usage as an effect of opening up its content to Google.

The access provided by online content is also having a profound impact on resource use. The convenience of 24 X 7 access is tremendous. 17% of activity is taking place on weekends and the “Working day is growing” with 1/3 of activity taking place outside of “normal office hours” of 9:00am – 5:00pm. This access was more difficult in a print-based world.

However, questions remain about whether efficient search is the same as, or necessarily yields, successful research. There is a strong negative correlation between the research rating of the scientists in institutions and the average session length on ScienceDirect. The most “successful researchers” were the group spending the least amount of time online with content. Trends pointed to the fact that the most successful researchers use gateways. Much more search activity is taking place outside the library, typically on services like PubMed, Google, and Google Scholar.

There was natural clustering of intensive use, and the figures for the differences between moderate, high and super users correlated significantly with outputs such as the numbers of papers produced, the amount of grant funds received and the number of PhDs the institution produces. In addition, while the average cost per download is consistent across institutions, the more active the institution the less per article the institution paid.

Mr. Rowlands stressed that these data merely show associations, not causation. Nor do the data show any directionality. Is it that a lot of research creates demand for lots of information, or is it that research institutions put things together and in place for research, which therefore impacts results?

The next stage of this research will look at historical information. Among the topics to be explored are the linkages between products, spending and outcomes. He is working to produce a computer model that shows, for example, what effect an increase in the number of titles and/or downloads might have on research outcomes.

The initial information points to the fact that downloads and research outputs are like “gears on a bicycle” that move in tandem. As one gear gets bigger, the faster the other gear turns. Although one needs to understand the causality question, the understanding of the fact of the connection is a useful addition to knowledge about assessment and performance measurement.

{NB Disclaimer: Much of this summary is verbatim and/or paraphrased from Mr. Rowlands' talk – very little in this post is interpreted and should not be credited to me. Apologies to Mr. Rowlands for any errors.}


Wednesday, April 09, 2008

Breakout Session B (22) Knowing Your Users: Research You Can Do - Judi Briden, Digital Librarian for Public Services, University of Rochester, NY

Although the project described today took place at a (campus based) academic institution, Judi Briden began the session by explaining that the methods can be applied to users in any institution.

Background:
  • IMLS grant 2003-2004 to study faculty work practices - there was an institutional repository in place (GSpace) but it wasn't being used properly/enough
  • Libraries hired an anthropologist - this was a very useful move. After studying staff they began studying the (around 5000) undergraduates as it had been so valuable.

"What do students really do when they write research papers?" was the key question, but more generally they wanted a better understanding of students and how their library facilities, web pages etc. can work better for them. Plans were originally submitted to the research board and the libraries have been very careful about protecting data and respecting privacy. Consent forms and the right to leave the study were both used.

Retrospective interviews
  • Recently Completed Papers - wanted a concrete example with details
  • From receiving the assignment to turning it in (step by step questions through the process)
  • Each step illustrated on a poster (students would draw in the step as well as describing it)
  • Interviews video recorded and transcribed.

Judi showed an example poster of the processes of writing an assignment. Students did not ever write in a step about talking to librarians, though they did include use of the library website/services. Various stages of outline and feedback etc. were described and proved informative for libraries. Some students consulted with teaching staff, some consulted with family (a revelation to the librarians). Students included information about the stage where they got distracted and why, and what spaces are better for studying.

Judi showed a senior student's honours paper process poster - a complex poster covering several years. His process much more closely resembled that of a graduate student.

The next step, after collecting the data, was to look at the data. Research team and librarian staff co-viewed videos, transcripts and drawings in viewings with discussion and brainstorming. The process engendered widespread staff participation and was used at every stage of the project.

What did they learn?
That students:
  • Work on their papers in chunks, with days or weeks in between
  • Asked family and friends for help choosing a topic or editing their papers (one student said her dad had edited all her papers since the 3rd grade!)
  • Some students assumed that if they did a Google search it included the library resources (so they didn't go back and look at the library stuff afterwards) - so services must be better but also resources must be on Google!
  • Did evaluate resources - just not in the ways that librarians recommend (e.g. find and print out articles but don't read for a few days and then discard some)
  • Don't remember who gave their library session

Another technique used was photo surveys - this allows you to investigate environments you would not normally be able to see. When you look at photos with interviewees you learn more than you would by just talking to them. They gave students a disposable camera and a list of photos they should take (the places they study, what they always have with them, how they keep track of time etc.) as well as a few "free" pictures they could use. All images were developed and put on CD and then a session to discuss them was scheduled.

Judi showed a student's photo (from 2004): a mobile phone is present (and almost all students on campus had cell phones). Dorm room pictures were rich in detail that the research team would never have thought to ask about otherwise. The "things that students take to class" photo revealed that no students were taking their laptop to lectures. Once alerted to this the team knew they needed to ask about it. An image of a colour coded diary shows the one thing a student couldn't be without. The team found that students were highly and complexly scheduled with work and activities, with no two days looking the same.

The team also gave students a map of campus and asked students (for one day) to write down where they went and when. It was very easy to do but very informative and again gave information that was not being given in any other way. The example map Judi showed runs from 8am to 12am the next day and traces a complex pattern over the campus covering 2.5 miles (on a fairly compact campus). Students are out all day and take all their stuff around all day. This explains the laptop issue - they didn't want to lug it round all day but would use it when sat in one place for a long time, usually the library at night.

The team also looked at the website design with Design Workshops
  • Create a device that did everything that students wanted it to do and yet could still be small and light
  • They also had to redesign the library webpage and, in another session, marked up a version of the current library homepage with any changes they would make if they could.

Judi explained that even the devices from the warm-up session were informative in terms of the concerns of students.

The marking up of homepages was very valuable and they repeated this exercise with faculty and graduate students later on.

Another study focused on library space. There was an area they wanted to make into a collaborative work space so students' input on what should be there was sought. Trying to get participation was tough. In the end they recruited on the day with posters, pizza and a small payment for taking part in a design workshop. This walk-in workshop asked students to imagine that the library has a big new empty space that they can design to be their space in the library, it's built and you love it... what does it look like?

Students were asked for 20 minutes of their time but many stayed for over an hour and got very into the design process. Many pictures go into lots of detail. Comfort, daylight, wifi, bookshelves etc. were all important to different students. Quiet but not silent seemed to recur as an idea. Taking all the ideas and analysing them, they cover five key areas:
  • flexibility to meet variety of needs (need to be able to move things in the space/do several things in same space)
  • comfort with family room feel and attention to environment (natural light etc)
  • technology - computers, printers, scanners, whiteboards, "mini Kinkos" (copy centre machine), chargers (phone and ipod) etc.
  • Staff support from checking things to "making a killer latte!"
  • resources - books, dvds etc.

Additional student interviews took place in the student union at a time of year when papers were being worked on - a student worker at the library (same age as the students) recruited participants. Interviews were done by a recent anthropology student who'd just graduated. By using a young research team the libraries expected to get more honest answers to questions. They asked if students felt they had enough time for papers, if it mattered and who they had asked for help (and they were prompted as to whether they'd asked a librarian), also when they had last worked on their paper and when they would next work on it.

The results showed that most students had used library resources and had been able to find what they wanted. They also felt they had enough time. They didn't feel that organizing and writing (especially narrowing the topic) was going as well. Students saw professors and TAs (especially professors) as subject experts but saw librarians as experts on finding specific books. All students expected to do well or as well as needed (many were prioritizing several papers).

The faculty study had included interviews in offices, which had proved valuable. For studying undergrads it seemed important to go to their dorm rooms. It was felt this could be tricky but students were extremely welcoming and open (putting the onus on researchers to be responsible with what they do with data). 2 dorms were studied and the anthropologists went out between 11pm and 1am as other research showed that students worked at that time of night. They only went to rooms where explicit permission had been given and asked students to do what they normally would do (students did this!) as the team observed them. What was interesting was students' use of technology, particularly use of computer desktops. Judi showed a video of a student using his computer (a mac with lots of items on dashboard - conversion tables, sticky notes - with reminders, quotes etc (mostly not assignments)). Judi said that you can see from this how little physical space was being taken up by assignments (one paper sticky note on monitor) - assignments are not so large in students' general sphere of activity as librarians expected. Librarians have a better sense of proportion about discussing papers as a result.

Dorm observations
- Lots of distractions - music, video games, people, IMs, Facebook, NOT a lot of reading
- My room is your room culture - sort of communal - people wander in and pop through. Lots of sharing going on
- Freshman much more active vs. Upperclassman dorms (busy but less chaotic) - makes sense but seeing that made librarians realise that the library is a refuge and it's a place where students count on a lack of distractions to get things done when they need to.


What are they doing differently?
  • Gleason Library - 24/7 collaborative space - when the architect was selected they had to incorporate the ideas of students, and the architects were willing and excited to do that. Work went on during the summer when students were not on campus, else they would have been included. However, members of the research team worked with the architects on what students wanted, though there was discomfort about not properly consulting students. When it came to the final layout students were returning to campus so were asked to place drawings of furniture around as they would like. Consistently the students did 2 unexpected things: the space had a new wall of windows and the architects had put comfy chairs there, but students wanted natural light for work tables so that actual work rather than relaxation could take place there; and they demanded quiet study areas, not just open space (no doors but divisions to separate noisier from quieter areas). The students love the place. Judi showed images of the room - it's busy and popular all the time, and the furniture moves all the time. Judi walks through the area to get to her office and it always changes. There are some frosted glass cubicle areas where the glass walls are whiteboards - they are always being used and prove really useful. Flipcharts asking what students thought of the space also asked for even more whiteboards, so they are there. Students multitask (image of one knitting and reading!)
  • Night owl reference at paper crunch-time - Judi showed a "Whooo's working late" promoting help with assignment papers (at key times in the year) until 11pm several nights a week when students were busiest. Proved useful and it's now been fine tuned and happens every semester around crunch time to accommodate students' needs.
  • Parents breakfast at orientation - given importance of parents to learning process the library now hosts the orientation breakfast and listen to concerns, talk about process and then they tell parents that every single class has a librarian who knows about that class and about resources for that class - so parents can refer kids back to library later on.
  • Experimenting with webpage redesign - will go live in the fall and the new design based on student ideas. Widgets are being used making it customisable and/or rearrangable for specific sessions. This will make pages more attractive but they will also be able to learn from what students choose to include/exclude.
  • Changing the way that library sessions are taught - now feel much more comfortable with experimenting with students, one librarian is now a writing instructor, sessions include discussion of what's going on. We understand how confident and competent students feel - pairs of students are given a resource that they can play with and must present to group about what it does, how to search, when you find something you like how do you get it and finally if you find something you like what are two many ways to get it. Librarians add extra info as needed but just facilitate, students lead and share and it works really well.

Long term benefits
  • understand how our undergraduates live and work on campus - this 2 and a half year project has been fascinating and really fun. We like and know our students a whole lot better and we are feeling motivated to make the library better for them
  • Understand their use of library
  • High staff engagement and participation - librarians now have a more personal perspective and are more communicative with students
  • Greater comfort and lower overhead for trying new ideas - major change
  • Continuing this type of research

All students and users are different so you need to find out about yours. You don't need an anthropologist. You can do great things with low tech, low cost, small programmes. Get a small team of interested staff together and build it from there; staff will become interested, it will be fun and you will find great results!

Q & A

Q: what was the sample size?
A: varied. space designs formed from 19 drawings, 8 students did photo project, 20 students in interview etc.

We've written a book, Studying Students, which is available for free download and have a project website with lots more info:
http://www.tinyurl.com/f63dj

Q: Are you reviewing this process?
A: All the time. We know students love the library and we are listening to them. It's granular and hard to add up. Over time we're looking for more interaction though and writing and library classes very much following that and much more participative. No metrics yet though.


Tuesday, April 08, 2008

Q & A for Plenary Session 3

Q: I am an academic and I build robots. I do this because I want information but you have described what I do as damaging. I do not threaten copyright but it is difficult to download information responsibly from publisher websites

A (Ian Bannerman): I'm sorry I gave that impression. Robots do distort statistics though. The type of usage you describe is legitimate and good and there is work for publishers to do. But if it is measured as human use there is a real issue.

A (Richard Gedye, Oxford Journals (Chair of Session)): we are looking at this in COUNTER and the issues raised by federated searches etc.


Q (for Herbert Van de Sompel): is there not a place for simple usable metrics for people to use

A (Herbert Van de Sompel): they should not be simple but usable - we should know what they are about. We don't really know what they're all about. It is really early days in the study of usage indicators so we are trying to get a grasp on this issue. We won't get out of the two year project with a set of simple metrics to use but we should have some ideas and some valid caveats about using them. There is a real distinction between a research project as opposed to launching a metric and being stuck with it for years.

A (Richard Gedye, Oxford Journals (Chair of Session)): Yes, we see Herbert's work as almost a Which? Guide to Usage Statistics that can be taken forward to build the type of simple metrics for practical usage.


The other side of the story: is usage data all it's cracked up to be?

"I collect them - but I don't want them, I don't need them, and I won't use them," said Ian Bannerman, likening promotional giveaway items to usage statistics. We implicitly assume that usage data is consistent, credible and compatible, and that usage factors will be a meaningful indicator of something.

In terms of credibility, Release 3 of the COUNTER guidelines is making inroads into the robot problem (crawlers distorting usage data) by publishing a list of robots and encouraging publishers to exclude these known robots from their list: but is this enough, asks Ian? It's the "amateur" crawler efforts within university IP ranges that cause the most damage and the most distortion, and COUNTER's code will not help exclude these crawlers. And if you identify a crawler (Ian's example showed an article being downloaded 6,372 times in one day), should you retrospectively exclude its activities from previously published usage statistics? Had this crawler been a little more sophisticated, behaving a little more like a "normal" user, the activity would remain unnoticed.
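
The two filtering steps described above (excluding agents on a published robot list, then catching crawlers that are not on any list) can be sketched roughly as follows. This is a minimal illustration under my own assumptions, not COUNTER's procedure or any publisher's: the robot names, the event layout and the threshold of 100 downloads per day are all hypothetical.

```python
from collections import Counter

# Hypothetical agent names, standing in for a published robot list.
KNOWN_ROBOT_AGENTS = {"examplebot", "samplecrawler"}

# Assumed cut-off: more than this many downloads of one article
# from one IP address in one day is treated as suspect.
DAILY_THRESHOLD = 100


def filter_usage(events, known_agents=KNOWN_ROBOT_AGENTS, threshold=DAILY_THRESHOLD):
    """Split usage events into those kept and the keys flagged as suspect.

    `events` is a list of (day, ip, user_agent, article_id) tuples.
    """
    # Step 1: drop requests whose user agent is on the known-robot list.
    remaining = [e for e in events if e[2].lower() not in known_agents]

    # Step 2: count downloads per (day, ip, article) and flag the heavy hitters.
    per_key = Counter((day, ip, article) for day, ip, _, article in remaining)
    suspects = {key for key, count in per_key.items() if count > threshold}

    kept = [e for e in remaining if (e[0], e[1], e[3]) not in suspects]
    return kept, suspects


# One address fetching the same article 6,372 times in a day, as in Ian's example,
# plus one listed robot and two ordinary downloads.
events = [("2008-04-01", "10.0.0.5", "Mozilla/5.0", "art-1")] * 6372 + [
    ("2008-04-01", "10.0.0.9", "ExampleBot", "art-2"),
    ("2008-04-01", "10.0.0.7", "Mozilla/5.0", "art-1"),
    ("2008-04-01", "10.0.0.8", "Mozilla/5.0", "art-3"),
]
kept, suspects = filter_usage(events)
```

A fixed threshold like this only catches the crude cases. A crawler that spreads its requests across articles, days or addresses would pass straight through, which is exactly the limitation Ian was pointing to.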

There's a lot going on that we don't understand and that we probably won't ever identify, and interface effects can also distort statistics: HighWire's practice of automatically displaying the HTML version of an article, whilst offering the user the ability to download a PDF from that page, was wrongly giving the impression that the ratio of PDF:HTML downloads was 1:1 (similar stats from Wiley InterScience during the same period gave a more credible 20:1). Given that COUNTER stats are part of the toolkit used by libraries when managing collections, falsely inflating one's statistics in this way is ethically dubious (upon realising this HighWire discontinued its practice).

There is a danger that in measuring the literature, we change the literature. Impact Factors can be abused, for example, by increasing self-citations to the journal in other articles/editorials; alerting authors to content they should cite (however positive the intention); publishing cite-able papers early in the year to maximise the number of citations they could garner before they become eligible for impact factoring; targeting topical areas rather than long-term studies; and so on - see Chronicle of Higher Education Oct 2005 (Monastersky's article) for additional thoughts. The Usage Factor will also not be immune to such "observer effects": authors may encourage everyone they know to download their article and improve their ranking; publishers may "sex up" keywords or seek to double downloads per the HighWire example above, or encourage online coursepacks over printed ones to maximise usage ... and so on. Ian concludes that not all attempts to influence the impact factor are necessarily bad, while those attempts to influence the usage factor are - and are less traceable. Ultimately, citations (on which IF is based) are more meaningful than downloads (on which UF is based).

Recommendations
  • Be cautious; don't over-interpret usage data.
  • Let's carry out further research into factors that influence article downloads
  • Let's improve guidelines on detection/blocking/filtering of robots
  • Let's watch out for the Observer Effect when developing usage-based metrics.
During questions, Peter Murray-Rust noted that some robotic usage is for non-aberrant purposes (e.g. data mining) and we need to be careful to distinguish between different forms of robotic activity and ensure that we are not excluding valid usage from metrics.


Information-seeking behaviour of the virtual scholar: from use to users - David Nicholas, UCL

David began by addressing the issue of disconnection from the user. We monitor activity rather than actual users. The virtual audience differs in composition from the previous audience - we also can't even see it and we find they move elsewhere (accessing material from publisher sites for instance). Content was king. Now the consumer is king.

We need to identify best practice, find scholarly outcomes and achieve satisfaction.

David has a slide up to illustrate the Virtual Scholar: a portfolio of services used by these users. This information is evidence based including:

  • UK National e-Books Observatory, JISC, 2008-9
  • Impact of Open Access Journal Publishing, OUP, 2006-
  • RIN study on use and impact of journals, RIN, 2008
  • Behaviour of the Researcher of the Future (Google Generation), 2008
Digital information footprints allow a complex view of information seeking behaviour which is rich in detail.

Profiling Information Seeking Behaviour
  • There are huge numbers of scholars and high demand for scholarly product driven by ubiquitous access (on buses, trains, hotels etc, existing users can search more freely and flexibly); huge usage numbers; spiralling growth; usage is not the outcome.
  • Some issue in the fact that many users are overseas - UK government funded scholarly websites have less than a third of their users in the UK; Asia loves OA; what issues does this raise?
  • Many users are young - information seeking behaviour is very different; spend lots of time online and some still see them as "noise" in the stats
  • Robots are always an issue - around half of all scholarly site visits are by robots (in some cases 90% of users are robots); now mimic human behaviour (Google's are particularly shrewd)

Human/Human-like behaviour
  • Shop around (40% of visitors never visit again)
  • Bounce (1-3 pages only of the many available - overseas visitors bounce less, young people more)
  • Flicking (a kind of channel hopping behaviour)
  • View (humans conditioned by emailing, text etc. Don't view articles for more than 2 minutes. Spend more time reading short articles than long articles online; if it is long either read the abstract or squirrel away for later)
  • Power browse (you can hoover through titles, contents, abstracts etc at huge rate);
  • Books now opened-up great view
  • Horizontal rather than vertical
  • Navigate (we spend half our time navigating to content)
  • We are not all the same (national differences, e.g. Germans most successful searchers and most active information seekers; age differences; gender differences (women are less promiscuous!))
  • Brands very complex but important (difficult to identify where authority lies, especially with authorised resources - hard to tell how your access is occurring. What you think is the brand is not what other people will see as the brand, and some are cool, some are not)
  • Do not behave like a librarian!
  • Behave like an e-shopper (use a common platform, multitask, information pedigree of some of key e-commerce giants: Amazon and Google)

Impacts, outcomes etc. best summed up by Guardian (quoting Marshall McLuhan's "Gutenberg galaxy").

David Nicholas compared power browsing and information seeking etc. to Alcoholics Anonymous: people don't want to admit they do these things. We are all behaving like this though - not just the young! Although older users have a different conceptual framework for this behaviour.

Access is no longer the outcome - we need to go beyond having that access be easy and quick; now we need to profile behaviours in order to find best practice and see what works and what does not. Establishing the good and the bad is needed to guide the development of information literacy. We also need to know how we justify our spend on information resources by proving value.

"We are not fighting Google, the battle is with ourselves"

Unless we connect with our users we will dissipate.


Reconsidering scholarly impact: MESUR

"Usage data totally rocks", chirped the endearingly-passionate Herbert van de Sompel as he attempted to rouse us all from our hangovers on this second morning of UKSG. van de Sompel's team in Los Alamos has explored interoperability, OpenURL, OAI-PMH, repository architecture and more but is now focussing on MESUR, a project intended to enhance methods of assessment of scholarly impact.

In the paper age, our best attempt to quantify scholarly impact was to count citations. But in a networked environment, we have many more metrics to deploy:

Usage-based metrics
can include numbers of accesses to scholarly material, where they come from and so forth. We can factor in usage of multiple content types (preprints, blog postings, datasets alongside journals and articles) and maintain a comprehensive record from the moment of an item's digital publication. Usage data can, however, present significant challenges. What *exactly* constitutes usage? Can we be sure to protect users' privacy? How do we standardise and aggregate data records?

Network-based metrics
can leverage citation networks, co-authorship networks and so on to assess behaviour. We need to select metrics that characterise the network and define the importance of specific nodes within that network. Tools like Google PageRank and the Eigenfactor can help us to assess networks and assign appropriate levels of significance to nodes within them.
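
As a rough illustration of how a PageRank-style metric assigns significance to nodes, here is a minimal sketch in Python. The citation graph, the journal names and the `pagerank` function are all hypothetical; this is a textbook power iteration, not MESUR's implementation or the Eigenfactor algorithm.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank the nodes of a directed graph given as {node: [nodes it cites]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every node receives a small baseline share of rank.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A node passes its rank on equally to everything it cites.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A node citing nothing spreads its rank over all nodes.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Hypothetical citation network: A, B and D all cite C; C cites A.
citations = {
    "Journal A": ["Journal C"],
    "Journal B": ["Journal C"],
    "Journal C": ["Journal A"],
    "Journal D": ["Journal C"],
}
ranks = pagerank(citations)
top_journal = max(ranks, key=ranks.get)
```

In this toy network the most-cited journal ends up with the highest rank, and the journal it cites inherits much of that importance, which is the sense in which such metrics look at the network rather than at raw counts.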

MESUR has accumulated a vast dataset (1 billion usage events, relating to 50 million documents/100,000 publications, spanning up to five years) from multiple stakeholders in the information community. It is important to avoid bias in sampling and analysing this data. Cross-validation against existing indicators, to ensure that there is an appropriate level of correlation, allows the team to check whether their results are broadly valid. The project's goal is to assess whether metrics can be defined from usage data and how these could be used if so.
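
To make the idea of checking correlation against an existing indicator concrete, here is a small sketch that computes a Spearman rank correlation between two sets of scores. The choice of Spearman correlation, the function names and all of the figures are assumptions made for illustration; the talk did not specify which correlation measure the MESUR team uses.

```python
def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical scores for the same five journals.
usage_metric = [1200, 300, 950, 80, 410]
citation_metric = [45, 20, 30, 2, 12]
rho = spearman(usage_metric, citation_metric)
```

A value of `rho` close to 1 would suggest the new metric ranks journals much as the existing indicator does; a value near 0 would suggest it is measuring something different.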

Networks are identified based on tracking a user's behaviour through a session - for example, creating a connection between the documents downloaded by a single user. Once this type of analysis has been extrapolated to a billion usage events, patterns emerge. This helps to confirm our expectations that, for example, practitioners use the literature differently to researchers. It also shows that whilst users read across multiple disciplines, their citations tend to stick to their own discipline. Correlating different metrics on maps shows that usage-based metrics tend to cluster together (basically in agreement with one another), whereas citation metrics vary both from usage metrics and from each other. Overall, there is an indication that the traditional Impact Factor (IF) "is a completely different animal" to multiple network-based metrics.
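
The session-based linking described above can be sketched as follows. This is a simplified assumption of how such a network might be built (every pair of documents used in the same session gets a connection); the event format, identifiers and the `co_usage_edges` function are hypothetical and not taken from the MESUR project.

```python
from collections import Counter
from itertools import combinations


def co_usage_edges(events):
    """Count how often each pair of documents appears in the same session.

    events is an iterable of (session_id, document_id) pairs.
    """
    sessions = {}
    for session_id, doc_id in events:
        sessions.setdefault(session_id, set()).add(doc_id)

    edges = Counter()
    for docs in sessions.values():
        # Sort so that each pair is always stored in the same order.
        for doc_a, doc_b in combinations(sorted(docs), 2):
            edges[(doc_a, doc_b)] += 1
    return edges


# Hypothetical usage log: three sessions over three documents.
events = [
    ("s1", "doc1"), ("s1", "doc2"), ("s1", "doc3"),
    ("s2", "doc1"), ("s2", "doc2"),
    ("s3", "doc3"),
]
edges = co_usage_edges(events)
```

Here `doc1` and `doc2` are linked twice because two sessions used both, so the edge weights begin to show which documents tend to be used together.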


Tuesday, April 17, 2007

Caution: statistics operating in this area

Jason Price, Claremont Colleges' Life Science Librarian and today's Duke of Hazards, took us on an entertaining journey through the potential pitfalls of over-reliance on journal usage data. He opened by warning us of some general hazards to beware:
  • narrow definition of use
    • COUNTER JR 1 (full text article requests) is only one dimension of use; others might be:
    • A-Z list click-throughs
    • citations from your faculty
    • impact factor
    • which journals your faculty publish in, and how much
    • surveying faculty/researchers
    • PageRank-type measures
  • vagaries of user behaviour
    • did the user actually get any value out of something to which they clicked through?
    • Google Web Accelerator preloads links from pages users visit
  • different dissemination styles of teaching
    • does the lecturer download and circulate (or post internally) the PDF, or circulate a link to it?
  • granularity of usage reports
    • if report is at title-level, there's no indication of whether accesses are e.g. to purchased frontfile or free backfile
Price then moved on to some more specific hazards that may be encountered:
  • determining cost per use
    • take an annual COUNTER report - divide the package fee by the article views. But what package fee? The *stated* annual fee, or the actual cost, factoring in the additional lock-in fee?
  • comparing to ILL cost
    • the views measured in an online environment cannot be directly correlated with what would otherwise be ordered via ILL
  • comparing across publishers
    • different interfaces can affect number of article deliveries, for example if linking to the article immediately renders its HTML version - so a user then choosing to access the PDF as well could count as two full text downloads
      • Price notes that the COUNTER code of practice does require providers to de-dupe statistics to provide a "unique article requests" figure
    • exposure in Google Scholar can also skew usage
    • (at this point, Price's laptop popped up a helpful flag to let us all know that it is hazy but warm in California this afternoon, which we all enjoyed)
  • ignoring by-title data
    • Price showed a classic long-tail curve with low use titles having very high cost per use - these would be the ones to be excluded from/replaced in future packages
  • lack of benchmarks
    • your concerns about the price per use you're paying could be amplified or assuaged by finding out what other institutions' cost per use is for the same publisher
    • it's better to evaluate both cross-institution and cross-publisher to get a more general picture for comparison
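
The cost-per-use arithmetic Price described can be sketched in a few lines of Python. All fees, view counts and title names below are invented for illustration, and the `cost_per_use` helper is hypothetical; the point is only to show how much the answer depends on which fee you divide by, and how by-title figures expose the long tail.

```python
def cost_per_use(fee, uses):
    """Return fee divided by uses; a title with no use has unbounded cost per use."""
    return fee / uses if uses else float("inf")


# Hypothetical package figures.
stated_fee = 20000.0    # the *stated* annual fee
lock_in_fee = 5000.0    # additional lock-in fee (assumed)
article_views = 10000   # full-text article requests from an annual COUNTER report

stated_cpu = cost_per_use(stated_fee, article_views)
actual_cpu = cost_per_use(stated_fee + lock_in_fee, article_views)

# Hypothetical by-title data: (cost attributed to the title, article views).
titles = {
    "Title A": (1000.0, 2000),
    "Title B": (1000.0, 50),
    "Title C": (1000.0, 4),
}
by_title = {name: cost_per_use(cost, views) for name, (cost, views) in titles.items()}

# Most expensive per use first: candidates to exclude or replace.
ranked = sorted(by_title.items(), key=lambda item: item[1], reverse=True)
```

With these made-up numbers the package costs 2.00 per use on the stated fee but 2.50 once the lock-in fee is included, and the by-title ranking puts the little-used "Title C" at the top of the list for review.
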
Price closed by summarising his recommendations before ceding the floor to COUNTER's Peter Shepherd, who opened with an overview of the COUNTER codes of practice, their recommendations and processes, and the current level of compliance within the industry. He outlined some projects which have been undertaken using COUNTER-compliant statistics, for example the NESLi2 analysis, which was able to output cross-publisher comparisons of per-article download costs and growth in full-text article downloads.

Shepherd then gave an overview of global metrics for evaluating e-journals, including the impact factor, and the potential for a new Usage Factor. Preliminary conclusions of UKSG's recent research in this area indicate that there is considerable support for such a measure from author, librarian and publisher communities. The UKSG project outcomes also suggest that the COUNTER codes of practice may not be adequately robust, and that there remains frustration at the lack of comparable, quantitative data - particularly given that continuing print usage is not included.

Shepherd proceeded to identify some of the current issues faced by COUNTER, including:
  • interface effects on usage statistics as referenced by Price earlier
    • filter project concluded that it's not best practice to render the HTML regardless of the user's potential choice
    • usage statistics are inflated by no more than 30% as a result of multiple formats
  • separate reporting for archives
    • can currently be requested as a supplementary/sub-report
  • usage of content within institutional repositories
  • involvement with SUSHI protocol to automate retrieval of usage data from providers
And he finished with some future challenges:
  • continued evolution of codes of practice - perhaps with respect to federated searching, "pre-fetching" (Google Web Accelerator), usability, additional data, new categories of content
  • deriving metrics from the codes of practice e.g. cost per use, Usage Factor for journals
