Case Law in Google Scholar
Google Scholar has a new radio-button selection on its front page to search for “Legal opinions and journals.” At a minimum, this is a useful new free way to quickly obtain the (cut-and-paste-able, HTML) text of known opinions, with cited opinions conveniently hyperlinked. It remains to be seen what, if any, deeper research value the tool will have.
Based on a few minutes of tinkering, the legal opinions that turn up in searches are full-text, hosted by Google, while journal article results tend to be hosted by third parties and/or have only a “citation” result turning up from the Google Scholar search. The ‘Advanced Scholar Search’ interface also allows the user to limit the search to opinions only from either federal courts or from individual states. The Google-hosted Scholar results do not seem to show up in regular web-search Google results.
Google Scholar searching is citation-based: the presentation and ranking of results rest on Google’s crawler’s indexing of the citing documents rather than primarily on the indexing of the document returned by the search (which, indeed, need not even be available in full text online, so long as it is cited often enough by crawlable online content). While Google’s web search obviously looks (more broadly) at linking patterns (an innovation on the Google founders’ part that was informed by their familiarity with scholarly citation analysis), the use of citations as search fodder in Google Scholar is narrower and much more explicit. As a result, using Google Scholar for cases will differ in important ways from searching other online, word-searchable repositories of case law, where results are based primarily on the presence and placement of search words in the actual retrieved texts.
A test search for In re Bilski, for instance, turns up the Federal Circuit’s opinion, and also several of the major patent cases cited in Bilski. This didn’t work all that well (Diamond v. Diehr, for instance, turns up nowhere in the results), as Google’s bots obviously have no way of knowing which cited works are at the top of the authority pyramid or figure most importantly in the analysis of the courts and parties. The citation-based ranking algorithm likely does do much better than Lexis’ or Westlaw’s ‘natural language’ searches at zeroing in on ‘lead cases’ with a fairly crude keyword search: a test search for “business method patent” turned up (appropriately, I’d say) the State Street Bank case as the first result (with other major cases following), while Lexis’ and Westlaw’s ‘natural language’ searches predictably turned up mostly long lists of recent district court opinions.
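To make the contrast concrete, here is a minimal toy sketch in Python. It is not Google's actual algorithm, which is not public; the corpus, the document names, and both scoring functions are hypothetical and chosen only for illustration. One function ranks documents by how often the query words appear in their own text; the other ranks a document by how many citing documents match the query, without looking at the cited document's own text at all.

```python
from collections import Counter

# Hypothetical toy corpus: each document has some text and a list of the
# documents it cites. None of this is real data.
DOCS = {
    "lead_case": {"text": "patent eligibility of a business method", "cites": []},
    "recent_a": {"text": "business method patent business method patent claim",
                 "cites": ["lead_case"]},
    "recent_b": {"text": "business method patent dispute",
                 "cites": ["lead_case", "recent_a"]},
    "article": {"text": "commentary on software claims", "cites": ["lead_case"]},
}


def keyword_rank(query, docs):
    """Rank documents by how often the query words occur in their own text."""
    terms = query.lower().split()
    scores = {}
    for doc_id, doc in docs.items():
        words = Counter(doc["text"].lower().split())
        scores[doc_id] = sum(words[t] for t in terms)
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


def citation_rank(query, docs):
    """Rank cited documents by the number of citing documents matching the query.

    The cited document's own text is never consulted, so a document could
    appear here even if its full text were unavailable.
    """
    terms = set(query.lower().split())
    scores = Counter()
    for doc in docs.values():
        citing_words = set(doc["text"].lower().split())
        if terms <= citing_words:  # citing document contains every query word
            for target in set(doc["cites"]):
                scores[target] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


by_keyword = keyword_rank("business method patent", DOCS)
by_citation = citation_rank("business method patent", DOCS)
print(by_keyword)
print(by_citation)
```

In this toy corpus the keyword ranking puts the recent, repetitive opinion first, while the citation ranking puts the much-cited lead case first, which is the pattern the test searches above suggest.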
I’m reluctant to extend an endorsement of Google Scholar as a case-finding tool too far, as any number of secondary sources would direct the researcher to the same lead cases much more reliably (and with valuable context and explanation). On the other hand, Google’s citation-based ranking algorithm certainly adds something new to the mix, and provides results that strike me (on limited experimentation) as an interesting contrast to those from web-based, keyword-searched repositories.
The citation-analysis basis of Google Scholar is also leveraged in the presentation of results. When the user selects a judicial opinion result, it is presented in a tabbed display, with the ability to toggle between the default “view this case” tab and a tab labeled “how cited.” “How cited” includes ‘blurbs’ from citing resources (hint to Google: offering a way to get at the, e.g., “83 similar citations” would be useful for this material) and also a full list of citing documents that Google has unearthed for the case in question. It is unclear to me how these citing documents are ranked as results, and there do not appear to be tools to limit this display to, e.g., cases only (much less to cases by jurisdiction). But at least the germ of a new, free citator exists here.
It is unclear from precisely where Google is sourcing its judicial material or what the depth or scope of the database is (though Justia notes that opinions from all 50 states are included from 1950). Results do come – in some way – from official reporters and do include marginal notation of print pagination. I can speculate that the Public.Resource.Org compilation of federal case law, much originating with a donation from FastCase, may play a role. State sourcing is even more mysterious — I hope colleagues can fill me in on what role Google’s bulk library book scanning plays, as my sense was that the law libraries involved in the program had not yet been scanned.
Other places with news and results of tinkering and experiments:
Justia’s Law, Technology, and Legal Marketing Blog (and Paul Stanley’s Twitter feed); Harvard Law School Library’s Et Seq. blog; ResourceShelf; Internet for Lawyers; and Rick Klau of Google, on Twitter.