10th February 2011
Aliases of Anthony Judge Identified by Google Search
As indicated in the FAQ item relating to author in the menu for the site of Laetus in Praesens, the author of the documents on this site is Anthony Judge, unless otherwise specifically indicated in the text of the document itself (as when the document is co-authored with named others). Authorship can be confirmed by viewing the source code of any page (Ctrl+U in most browsers), where it is stated in the HTML meta tag for "author". Biographical information is also provided. Because the author, Anthony Judge, is not conventionally indicated below the title of each document (as displayed by browsers), the curious situation described in what follows has arisen.
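The check described above can also be done programmatically. The following Python sketch reads the "author" meta tag from a page's HTML source using only the standard-library html.parser module. The class name, function name and sample markup are hypothetical, invented for this illustration; the actual pages may format the tag differently.

```python
from html.parser import HTMLParser


class AuthorMetaParser(HTMLParser):
    """Collect the content of every <meta name="author"> tag."""

    def __init__(self):
        super().__init__()
        self.authors = []

    def handle_starttag(self, tag, attrs):
        # Also reached for self-closing <meta ... /> via handle_startendtag.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name == "author" and attr.get("content"):
            self.authors.append(attr["content"])


def declared_authors(html_text):
    """Return the author names declared in the page's meta tags."""
    parser = AuthorMetaParser()
    parser.feed(html_text)
    return parser.authors


# Hypothetical page source, for illustration only (an assumption,
# not copied from the site).
sample = """<html><head>
<title>Example document</title>
<meta name="author" content="Anthony Judge">
</head><body><p>Text</p></body></html>"""

print(declared_authors(sample))
```

For a saved copy of a real page, the same function could be applied to the file's contents after reading it as text.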
Confusion arises when reliance is placed on an indication of authorship in Google search results (or those of Google Scholar). Google explicitly indicates that it does NOT make use of the author meta-tag and instead endeavours to deduce the author from other information in the document. When "Anthony Judge" is the sole author, as indicated above, and does not then immediately appear below the title of the document displayed by a browser, Google algorithms make unpredictable deductions regarding authorship. The most frequent "author" indicated by Google search results for these pages is "by CF Reactor" (indicated in Google Scholar Advanced Search as responsible for some 500 documents on this site). This name is retrieved from one (obscure) menu item indicating "Cognitive Fusion Reactor".
Other attributions of authorship seem to derive from detection of the last apparent name in any appended list of bibliographical references -- somewhat embarrassing in the case of distinguished authors who find themselves presented as the author of documents on this site.
More curious, and potentially embarrassing, are other such deductions. For example, an article about the Marketable Exploits of Osama bin Laden is listed by Google search results as "by O bin Laden" (with 47 documents indicated as authored by him in Google Scholar, most on other sites and one being an Open Letter from The Project for the New American Century to US President George W. Bush).
[images retrieved 16 December 2010]
Using any analysis "by author", the latter would then appear erroneously as an author of documents on this site. Other (potentially amusing) cases are indicated in the table below.
The matter has been specifically discussed with Google Webmaster Help (see thread). A detailed commentary has been made about this and related cases by Tomáš J. Fülöpp (Osama bin Laden and other bloggers conjured up by Google AI).
The phenomenon has been described in a journal article by Peter Jacso (Newswire Analysis: Google Scholar's Ghost Authors, Lost Authors, and Other Problems, Library Journal, 2010), following on an earlier report by Geoff Nunberg (Google Scholar: another metadata muddle? Language Log, 26 September 2009). As indicated there, the Google team blames libraries and publishers for bad data -- namely failing to conform to Google's explicit preference that author information be placed in the description meta-tag contrary to other conventions. This notably runs the risk of truncating relevant descriptions inappropriately. But as noted by Jacso,
"Google's algorithms create phantom authors for millions of papers. They derive false names from options listed on the search menu, such as P Login (for Please Login).
Very often, the real authors are relegated to ghost authors deprived of their authorship along with publication and citation counts."
Other amusing examples -- far less frequent -- are listed below. It is difficult to search for them systematically. To see the erroneous aliases (until the programming error is corrected), use the keywords in the examples below in Google or Google Scholar.
Subsequent to the original presentation of the above information, the meta tags in the documents on this site were modified in the light of the Inclusion Guidelines for Webmasters for Google Scholar. The pages now include meta tags which are specific to Google Scholar rather than to other search engines. The examples given are:
<meta name="citation_title" content="Marketable Tales of the Exploits of Osama bin Laden">
<meta name="citation_author" content="Judge, Anthony">
<meta name="citation_date" content="2004/05/31">
However, it is unclear whether (or when) Google Scholar may re-index pages so tagged, and whether such indexing will be respected in the Google listing of search results. The Guidelines also make it clear, in the case of PDF documents, that title and author information are required in a particular form to be taken into consideration. The PDF pages on this site use the PDF meta-tag facility.
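Whether a page actually carries the three citation_* tags quoted above can be checked with a short script. The Python sketch below is a minimal illustration using the standard-library html.parser module. The class and function names are hypothetical, and the "expected" set is an assumption limited to the three tags shown in the example; it is not a statement of what Google Scholar requires.

```python
from html.parser import HTMLParser


class CitationMetaParser(HTMLParser):
    """Collect all <meta name="citation_..."> tags into a dict of lists."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if name.startswith("citation_") and content:
            # A list per name, since a tag such as citation_author may repeat.
            self.tags.setdefault(name, []).append(content)


def citation_tags(html_text):
    """Return {tag name: [values]} for the citation_* meta tags found."""
    parser = CitationMetaParser()
    parser.feed(html_text)
    return parser.tags


# Assumption: only the three tags from the example above are checked.
EXPECTED = ("citation_title", "citation_author", "citation_date")


def missing_tags(tags, expected=EXPECTED):
    """List the expected tag names that the page does not declare."""
    return [name for name in expected if name not in tags]


head = """<head>
<meta name="citation_title" content="Marketable Tales of the Exploits of Osama bin Laden">
<meta name="citation_author" content="Judge, Anthony">
<meta name="citation_date" content="2004/05/31">
</head>"""

tags = citation_tags(head)
print(tags["citation_author"])
print(missing_tags(tags))
```

Such a check only confirms that the tags are present in the page source; it says nothing about whether or when a search engine will use them.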