The Book Tower's Long Tail
On the deal between Google Books and Ghent University library
On 23 May 2007 Google and the University of Ghent signed an agreement. As a result, the Book Tower became the fifteenth institution worldwide to join the American company in its ambition to digitize the world's paper heritage. Ghent University Library and the Royal Library in The Hague (14 July 2010) are still the only libraries in the Low Countries whose collections are included in Google Books. The first shipment left on 11 October 2007; the last load of old editions appeared online in 2018. So, it's time to take stock.
Google was founded in 1998 with the mission "to organize the world's information and make it universally accessible and useful". In the early 2000s it indexed more or less the whole web and opened it up with the PageRank algorithm, which was developed by Larry Page and Sergey Brin in the garage of Susan Wojcicki, who is now CEO of YouTube. New sources of information entered the picture. In 2004, by which time the company was listed on the stock exchange, it developed Google Scholar, a bibliographical databank that made it possible to search through scientific journals. It also launched Google Books. Then, on 14 December the same year, the company announced that it would “digitize the ‘world’s knowledge’”. It had concluded a cooperation agreement with five large research libraries, including those of Harvard and Stanford, to scan fifteen million books from their collections. Other agreements followed fast.
Having opened up born-digital knowledge, the company now wanted to convert the information that was still confined to paper, by mass scanning it and making it accessible to the algorithms of Google's search engine. The plan was actually a logical consequence of the goal to make all information available universally (and of the business model behind it). The addition of data from analogue formats would make Google even better. The investments in Google Books were therefore also strategic.
Modern Alexandria or Napster for books?
Reactions to the announcement were divided. Google Books promised (and still promises) free access to our world heritage over time. A universal library that is free of charge and always accessible. It sounds like a present-day Alexandria. Yet there were significant objections from the very start. Critics wondered whether our collective heritage should be put into the hands of a private multinational. What would happen if Google ever filed for bankruptcy? And is universal accessibility sufficient to sweep all the legal objections concerning privacy and intellectual property from the table?
Opponents pointed to "Google's totalitarian ambitions" and described a scenario in which society was willingly delivering itself to a big brother. Patrick Lefèvre, Director-General of the Royal Library of Belgium, had a sign put up in 2008: “Non à la privatisation du savoir”. For their part, authors feared a “Napster for Books”. The American Authors Guild filed a copyright case against Google Books' scanning programme which, after a long appeals procedure, was won by the company. In Europe, France in particular went into battle. In April 2005 the head of the French National Library, Jean-Noël Jeanneney, reacted with a contribution in Le Monde, which he later developed into the book Google and the Myth of Universal Knowledge: A View from Europe (2007). To the legal claims concerning copyright and the social issue of whether world heritage should be managed by a listed company, Jeanneney added a third argument: Google's universal library threatened in practice to turn into Anglo-Saxon cultural imperialism.
Jeanneney's campaign was to have two consequences. The European Union launched its own platform, www.europeana.eu, for the digitization of cultural heritage from European archives, museums and libraries, while Google was looking for partnerships with libraries that could supplement its programme with non-English-language resources. That was the strategic reason why, in 2007, Google asked the university library in Ghent to let its collection be digitized in Google Books.
A meeting in Silicon Valley
The Book Tower in Ghent © Michiel Hendryckx
The coincidental reason was a meeting that the chief librarian, Sylvia Van Peteghem, had in Silicon Valley. She was invited on a mission to the United States with the then Flemish Minister for Science Policy, Fientje Moerman. On the agenda was a visit to the Rem Koolhaas Library and a networking event in Stanford. There Van Peteghem met Wim De Waele from the Belgian Institute for Broadband Technology, who was in contact with a senior Indian executive at Google. There was a first spark of interest. Back in Europe, on 2 February 2007, Google's Paris-based representative, Philippe Colombet, visited the Book Tower. Google's interest was confirmed and, a week later, six men from the company, whose names were not disclosed, flew in. They all wore black backpacks with Google embroidered on them in colourful letters. Van Peteghem took them around the tower, showed them collections and presented Ghent-based digitization projects like Recollecting Landscapes. The link was also made between Google and Paul Otlet's nineteenth-century bibliographical undertaking (the Mundaneum) and the former chief librarian in Ghent, Ferdinand vander Haeghen. At the end of the afternoon there was a short intermezzo, after which one of the Google men stood up and said, “From our side, it’s a deal. We love your building, your collection and your team.”
On 10 May 2007 the board of the university approved the agreement with Google and on 23 May there was a joint appearance for the press. The publicity that the university got as a result of the agreement was invaluable and the plans for the restoration of the Book Tower, which were in the pipeline at the time, received a considerable boost.
As a result of the agreement, all the print books from before 1870 (except for special formats) were eligible for digitization; there were about 250,000. Google scanned them free of charge. The Book Tower was responsible for the selection and preparation of the shipments. Google put together a team to filter the collection via pick lists. On 6 September 2007, a test load of five print editions left for the States. Then, on 11 October 2007, the first complete load, consisting of around five thousand volumes, left the Book Tower. This rhythm of five thousand volumes per month was maintained in 2007, 2008, 2009 and 2010. As of 2011, loads were sent off every two months. The last regular load was dispatched to Google in April 2015.
On top of the Book Tower © Geert Roels
The scanned volumes were immediately viewable in Google Books. The Book Tower received digital copies free of charge and had negotiated that UGent's watermark should appear on every digitized version. To stay one step ahead of the critics, who thought that the library had been robbed by a commercial player, the library had also stipulated that it must always receive a copy of the scans of books from Ghent free of charge.
Clicks from China
Right from the start, Google Books' Ghent programme was successful. Every month thousands of volumes got a new life online. Every day thousands of internet users opened the digitized collection. By December 2011, some 113,000 titles from Ghent (181,000 volumes) had already been uploaded onto Google Books, whose pages were opened approximately 400,000 times a day. On peak days, such as 23 October 2012, Google Books got 1.7 million hits (expressed in number of pages clicked on). Among the big hits from Ghent there were splendid illustrated botanical books, like British entomology, being illustrations and descriptions of the genera of insects found in Great Britain and Ireland, which was published in sixteen volumes by Richard Taylor, London, between 1824 and 1839; travel stories, like the two-part Travels through Sweden, Finland and Lapland to the North Cape, in the years 1798 and 1799 (1802), by the adventurer Giuseppe Acerbi, of which the Ghent copies were obtained from the estate of Virginio Armellini; and anthropological studies, like Die Völker des Erdballs nach ihrer Abstammung und Verwandtschaft (1847), of which Google had also scanned the nineteenth-century, hand-written dedication, “à la bibliothèque de l’Université de Gand, hommage.” These titles not only became globally viewable, they were also fully searchable.
From the moment Van Peteghem arrived on the scene, making the collection viewable was a priority. Thanks to Google Books that priority made a quantum leap.
Over the whole of 2013, the last year before it closed for restoration, the Book Tower counted around 230,000 physical incoming visits. Google Books got close to that figure every couple of days. In 2011, the last year in which Google indicated the geographical origin of its digital visitors, 22.3 percent were from China, 19.8 percent from the United States, 16.5 percent from France, 4.6 percent from Germany, 4.2 percent from Belgium, 3.5 percent from the Netherlands, and smaller numbers from other countries.
The official shipments came to an end in 2015. However, when the opportunity arises, more material is digitized. For example, the library collection of the Royal Academy for Dutch Language and Literature, which was recently transferred to the Book Tower, contained a collection of old editions that were suitable for Google Books. Some 1,700 volumes were digitized and made available.
At the moment there is a test project to upload the collection of auction catalogues, which are extremely valuable to international art history research. The Book Tower has over eleven thousand, often annotated copies, the oldest of which date back to the early seventeenth century. The first shipment will leave in August 2019.
The most recent figures from 2018 show that an average of 50,000 books from Ghent are consulted every day, always with one or more page views.
Two books from the Ghent University Library that have been digitized by Google: Christian and Politick Reasons wherefore England and the Low-Countries may not have Warres with each other and the Saxon court's state calendar from 1735.
Google went to Ghent to add non-English-language resources to its programme. It was a success. Around one third of the scanned books are in French, a quarter are in Dutch and just under 20 percent were written in Latin. Only 2 percent are in English. Almost half of the titles that Google uploaded from Ghent date back to the nineteenth century, a quarter were published in the eighteenth century and another quarter in the seventeenth century. Only a few thousand titles are from the sixteenth century.
The remarkable thing about Google Books is that it is not the beautiful nineteenth-century botanical books that are consulted most often, or the wonderful travel stories. The most popular title of the year is the treatise Christian and Politick Reasons wherefore England and the Low-Countries may not have Warres with each other. The pamphlet, from 1652, was obtained by chief librarian Vander Haeghen on 28 November 1869 at the sale of the Amsterdam sugar baron Isaac Meulman's pamphlet collection. The subject matter of the eight-page booklet situates it in the wars between England and the Dutch Republic in the mid-seventeenth century. In 2018 it was consulted more than twenty-two thousand times worldwide. It was closely followed by Θεοδωρίτου ἐπισκόπου Κύρου και εὐαγρίου σχολαστικού ἐκκλησιαστική ἱστορία, a Greco-Latin history of the early Church, published in Paris in 1673, which was opened almost twenty thousand times in Google Books.
Both titles are rather rare, but not unique. There does not seem to be any immediate explanation for their success, and what links them more than anything else is their obscurity. The same applies to most of the other titles from the top ten. Number three is the Essai philosophique sur les probabilités (1829), by Pierre-Simon Laplace. Number four is a source publication, the Calendars of the Proceedings in Chancery, in the Reign of Queen Elizabeth (1829), to which thirteen thousand people surfed. Number five is world heritage, an 1843 edition of Shakespeare's Hamlet, from Charles Knight & Company. It was consulted eleven thousand times in 2018, half as often as the pamphlet from 1652. The top ten of the most popular Ghent titles in 2018 concludes ingloriously with the third volume of the Calendars of the Proceedings in Chancery for 1832. It was clicked on 7,642 times in Google Books.
A double page from British entomology, being illustrations and descriptions of the genera of insects found in Great Britain and Ireland
It is difficult to explain why these particular titles were the most frequently consulted. It is possible that there was a research programme somewhere which made published case law from sixteenth-century England so popular, but apart from that the list is very random. Number seven, for example, is the state calendar of the Saxon Court from 1735. Number eight is the Contextio gemmarum sive annales, a seventeenth-century edition annotated by one of the Church Fathers, which includes both the Hebrew and Latin versions. This type of bedtime reading was retrieved more than eight thousand times.
What has become clear from over ten years of Google Books in Flanders is that, worldwide, Latin and Greek are still better known than Dutch, or at least more sought after. Not a single Dutch-language book made it to the top ten of 2018. But there were two classical titles, just as there were two German, one French and five English. Bearing in mind that only 2 percent of the scanned books were in English (as opposed to 24 percent in Dutch), the English language certainly remains very dominant. So, the French chief librarian Jeanneney was not wrong to be concerned.
The second conclusion is more paradoxical. The scale of Google Books makes it possible to illustrate how rare and obscure titles can draw a large audience globally. If you could bring together everyone who was researching the relations between England and the Dutch Republic in the seventeenth century, as Google does, you could apparently fill a good-sized festival venue. So, the long tail really does exist, even in the world of old editions.
Perhaps we should not be surprised about the results of Google Books. Perhaps it is only normal (evidently it is) that there are twenty thousand people living on this globe who want to study church history in the original Greek. It is still an uplifting thought, for example, that a theologian somewhere in India can examine a text fragment from Theodoretus Cyrensis in a seventeenth-century edition that is kept thousands of kilometres away in the Ghent Book Tower; or that a man or woman from China can sift through what exactly went on at the Saxon court in 1735. Flemish heritage is not being spread by Google Books, but world heritage that is located in Flanders is. That is good too.