Wednesday, April 01, 2009

Google and Librarians: why it shouldn't be us and them

In preparing to speak to us today, Clare Duddy did some Googling. In Google's browser. She warns us upfront, in case we hadn't clocked, that she's pro-Google.

Clare is a Masters student at London Met University who won a UKSG competition to present her views on information discovery in the Google generation. I am torn between wishing that I'd had an opportunity like this when I was a student, and thinking how petrifying it must be to present to an audience of professionals eager to hear your views. Clare tells us she's nervous but proceeds to speak confidently and knowledgeably on a subject that, while familiar to us all, still holds surprises.

Us and them
Clare already works part time at Oxford University libraries as an electronic journals assistant. Interestingly, she sees the "us and them" of the information world as librarians vs Google (not, as some of the other UKSG delegates might see it, as librarians vs publishers). Between her work and her thesis, Clare spends a lot of time looking for information. "I've been online for more than half of my life, and search engines were already prevalent by the time I started my academic career - I've never had to find information without them." She quotes a friend: "Google is an extension of my memory - I don't have to keep facts in my head."

Finding information
There is a new balance in education as we keep up with emerging technologies. Google has a 63% share of the search engine market (13.5bn searches in the US in Jan 09); OCLC research shows that 89% of college students start searches on search engines, and Clare confirms it's her first port of call for all her information needs, from academic to social. It's a known known. Perhaps less known is that the same research shows only 1% of users starting their search in an online database.

The Google generation
The Google generation is not defined by an age group but by a demographic - "always connected"; multi-tasking; computer literate. Clare says we might also see this group as "impatient, gullible and lazy" - taking the first result they find in a search engine and giving librarians sleepless nights. As we know, the main problems with using search engines as our point of entry to research are:

Material not indexed
* deep web
* access controlled
* non-linked
* robot-excluded
* non-HTML
* no static URL
Despite this, Google has value - it highlights "informal literature" - the non-traditional materials that other library resources don't surface so effectively, if at all. Through Google Scholar you can filter your search to authoritative content, and the Library Links program enables libraries to direct users to licensed content. And because of Google's power and influence, they drive exposure and sensible structuring of content (e.g. Harvard has redesigned its website to expose its digital collections more effectively; the National Library of Australia has created stable URLs and metadata for individual items in its image collection). There is a sense that we overestimate what you can't find, and underestimate the value of what you can find.

Quality of material online
"Democratic" (user-generated) publishing - famously exemplified by Wikipedia - concerns librarians and publishers, the gatekeepers of authoritative content. But Wikipedia's advantage is its breadth - over 2.7 million entries in comparison to Oxford Reference Online's 1.3 million (yes, there could be an apples and oranges issue here). "We have to assume that we can't control the web or impose our authority on it in any kind of comprehensive way", so how do we manage our response to what we find? With "a pinch of salt"; the widespread news coverage of Wikipedia's flaws, and our own knowledge of how simply we can publish what we want, help us understand that not everything we find can be trusted. Librarians already spend a lot of time training users about the quirks of different online resources; why not include Google and Wikipedia (etc) in that training?

Deskilling search
Clare recalls a lecturer harking back to the glory days when "users were not allowed near the computers and had to use a librarian to find information", but "librarians are no longer required in that role" - they feel displaced; is their reticence about broad search resources based on frustration? There is a context in which "one-box" search engines are in fact the best way to find something. But users still have need of more complex search interfaces and, despite their fondness for simplicity, they do recognise the value of more sophisticated search.

Conclusion
"Young people today need to be educated to use these tools properly, just as we had to be taught to use a library and book properly in the past". We shouldn't assume there is one Google generation with one set of characteristics - users are still a complex group with varying needs. It can only be helpful for us to acknowledge the place of Google in our users' lives and to help grow their understanding of this tool in the context of the other tools we offer.

(see next post for question and answer session revealing more of Clare's online behaviour)

Coda: Clare's presentation was excellent - not only interesting and well-informed in terms of the material covered but ably and compellingly presented. The feedback about this session has already been overwhelmingly positive and we'll definitely be thinking about how to follow up with more user input at next year's conference.


Wednesday, April 18, 2007

"The old git slot": a life in scholarly publishing flashing before our eyes

"I've drawn the old git slot," said John Cox ruefully as he took the stage, and then proceeded to confirm that judgement by listing the plethora of modern office necessities not yet invented when he started in publishing, and bemoaning the "witless wonders" that are our modern youth.

Yet plus ça change, plus c'est la même chose. Scholarly publishing is essentially the same industry as it was when UKSG was founded 30 years ago - and in fact the principles on which publishing rested 300 years ago are still relevant today. Even commercial publishing has been around longer than we think; Cox cited examples from the 18th and 19th centuries. And back at the beginning of Cox's career, journal publishing was rudely healthy. Librarians had ample funding and researchers' appetites for information were not yet overwhelmed. But social changes began to limit the growth of libraries' budgets such that journals began to be cancelled and success for new journals was not so immediate (as Paul Calow noted in yesterday's "Financial Imperatives" session, it now takes 7 years for a new title to break even).

Then along came a spider ... or, in fact, its Web. The shift from print to online may not yet be complete, but about 90% of scholarly journals are now online, and this has changed the way that libraries and publishers do business together - for example, with consortial purchasing. One consequence of online publishing is the hunger for Open Access - an unproven business model which has not yet shown itself to be sustainable, says Cox, particularly across the broader and non-scientific literature. Further, as Sally Morris had noted this morning, the Open Archive movement is potentially damaging to the scholarly journal; the world's 850 institutional repositories may currently be scantily populated (with academics actually admitting they are "distinctly unwilling" to deposit), but they are being supported by a number of major funding agencies, and may yet grow sufficiently to change the current landscape.

Cox took a detour at this point to acknowledge the effect on scholarly communications of Google, "the search engine of choice for most of us" (albeit propounding the common misconception that the search giant has indexed "most journals" in Google Scholar). Google is getting closer and closer to us and will "shape the development of our industry over the next 5 to 10 years", having already revolutionised things with its PageRank algorithm.

The future for publishers, therefore, is in the functionality within which they wrap their content. If the research itself is freely available - and easily discoverable - elsewhere, publishers have to differentiate themselves with truly useful features (e.g. supporting datasets, taxonomies, community facilities). Cox praised OECD's SourceOECD for using the capabilities of online to add massive value over the print, and Alexander Street Press for building communities in the humanities - demonstrating the value across different sectors.

Web 2.0 "will bring further changes", of which user-generated content and folksonomies have most relevance to scholarly publishing. They represent the value-adds which can differentiate publisher platforms from institutional repositories - if publishers are willing or able to make the necessary investment in technology, and to make the transition to being service providers rather than manufacturers.


Tuesday, April 17, 2007

Google Scholar @ GSK: from discussion to implementation

Jennifer Whitaker, a member of the Published Information Group within the Information Management team at GlaxoSmithKline, gave a succinct presentation at her briefing session, which nevertheless provoked much discussion amongst the attendees.

Jennifer described the thinking behind the decision to promote Google Scholar to researchers as a means of providing a quick search across scientific information on the web. She went on to explain how Scholar was positioned within the company; the information given to researchers; how Scholar has subsequently been used at GSK; and the effects on usage of the standard bibliographic databases subscribed to at the organisation.

The reasons behind the decision to promote Scholar were pragmatic:
  • There was a wish to maximise the use of expensive subscriptions to full-text e-journals.
  • There was a need from researchers to be able to make "quick and dirty" searches for background information on topics which would still yield information from quality-controlled sources.
  • The standard Google search was already a highly used tool, and as such there was high awareness of the Google brand.
  • Google Scholar was easy to use, yet offered some of the standard search features of a bibliographic database (such as a journal title search field).
  • Google Scholar offered broad coverage of scientific information across disciplines.

There were several strands to the implementation:

  • An evaluation of scientific search engines was carried out, with users being informed of progress and results via the Library webpages.
  • Once it was decided that Google Scholar would be the preferred choice, the Google Scholar toolbar was added to the primary Library Resources webpage in a prominent position.
  • Much thought was given to communication with users, both in person through roadshows to departments, and through FAQs mounted on the Library website.

Information professionals at GSK were at pains to inform users that Google Scholar would not be the answer to all their information needs, and should not replace the use of bibliographic databases for in-depth searching. It does not offer comprehensive coverage of scientific literature, and does not necessarily pick up the most recent publications. The search engine is also still in beta, which means that functions could change, and that there is a possibility that it could be withdrawn or become a chargeable service at any time.

An additional factor at GSK is the commercial sensitivity of the searches that researchers carry out. All employees are trained to be aware that their searches of the open web are insecure and can be tracked, and that standard web search engines should not be used when there is a need for confidentiality. Researchers were reminded of the other online resources on offer, and that the Library staff were also on hand to offer advice on searching and locating full text.

Trends in usage statistics for Scholar at GSK showed a steady increase between the implementation in June 2005 and November 2006, with marked increases in usage at the times when the search toolbar was added to the Library webpages, and also when full text linking to GSK's own full text subscriptions was added. Interestingly, usage statistics for the top 6 bibliographic databases at the company showed static usage over the same period, demonstrating that Scholar was a complementary facility rather than a competing one. Unfortunately, no statistics on the usage of full text journals were offered, so we could not see if one of the objectives of the exercise, to make these resources more visible, was achieved.

Jennifer concluded by stating that the decision to promote the use of Google Scholar at GSK was a successful but pragmatic one, and that it will be kept under review for the foreseeable future, particularly as new scientific search products become available.

There were many questions from the audience - I have paraphrased some of these below, with Jennifer's answers:

Q: Is there a source/coverage list for Google Scholar that GSK users can consult?

A: No, there is not.

Q: Does the fact that Google Scholar is still in beta concern you?

A: It is a concern, but users are informed of the beta status and what it implies through face-to-face discussion and online FAQs.

Q: Has there been any research on how researchers at GSK use Google Scholar?

A: Nothing formal; however, anecdotal evidence suggests that many are using it as a quick way of locating a paper for which they already have the bibliographic details, and for basic background information on a topic, for example a particular disease.

Q: Have you considered using server log metrics to find out who is using Scholar and how much?

A: Not yet, although we will think about this.

There was some general discussion amongst the audience about the desirability of Google having data on Library holdings in order to provide the OpenURL linking service, although no definite conclusions were drawn.

Q: Would you consider the implementation of a cross-search or federated search engine which could search across your subscribed databases?

A: We are keeping all types of search options under review as they are developed.

Q: If your statistics had shown a drop in use of your bibliographic databases, would you have considered cancelling subscriptions?

A: We would have felt that this was a failure of communication on our part, as Google Scholar is not intended to replace these resources. We would be inclined to re-double our efforts to market the databases rather than consider cancelling them.

It was noted by some members of the audience that users trust (or at least are highly aware of) Google, whilst some information professionals show a marked distrust of Google Scholar. It was also noted that users conflate brand awareness with trust - for example unpublished research has shown that if a set of search results is branded with the Google logo, users will trust the results, even if they have actually been drawn from other search engines.

Altogether this was an interesting session which once again highlighted that speed and ease of use are incredibly important to searchers, even those in the pharmaceutical industry.
