On 8 July 2015, Luisa Calè and Ana Parejo Vadillo engaged the Birkbeck Centre for Nineteenth-Century Studies in a crowdsourced Twitter interview experiment with members of the AHRC-funded Lost Visions project team: Julia Thomas (Principal Investigator), Nicky Lloyd (Postdoctoral Research Associate; now lecturer in Digital Humanities at Bath Spa University), and Ian Harvey (Research Associate; now researcher on the WISERD project, Cardiff University).1 The Illustration Archive was created on the Lost Visions project by specialists in the humanities in collaboration with computer scientists from Cardiff University. Julia Thomas has defined illustration as a ‘dark art’. The aim of the project is to retrieve it from the paper archive of the nineteenth century. How digital platforms transform books as leaf-through devices, what new lives images acquire when they are disbound from books and their paper supports, and what skill sets and conversations across disciplinary boundaries shape their new possibilities were among the questions we had for the Lost Visions team. Through the Twittersphere we targeted digital humanities groups and book history centres, and used hashtags that would link up our Twitter interview with complementary discussions on visual culture, the materiality of the digital, and the new disciplinary crossings enabled by the digital archive. You can read the Twitter interview on Storify here.2 While Twitter offered us an aphoristic medium, we subsequently asked Julia Thomas to expand on some of the questions raised during the day. Thomas chose to discuss in more detail the models, digital disciplines, communities, and crowdsourcing underpinning Lost Visions, the digital transformations of its corpus, and the poetics of the archive, returning to questions by the Birkbeck Centre for Nineteenth-Century Studies (@BirkbeckC19), Luisa Calè (@lcale2), Ana Parejo Vadillo (@aipv2010), Michael Goodman (@mikeygoodman1), and Alexis Wolf (@Ms_Alexis_Wolf).


There are several illustration archives that have been influential for the construction of both The Illustration Archive and the earlier database that we developed, The Database of Mid-Victorian Illustration (DMVI). DMVI was originally launched in 2007, which, in computational terms, is aeons ago (it was updated and new features were added in 2011). Because we had a closed corpus of illustrations in DMVI (a mere 896 images), we were able to keyword all the pictures in-house. We were inspired, in part, by the exhaustive keywording system in the William Blake Archive, which allows users to search the archive using these words. Just scrolling through the list of terms in this archive gives a tremendous insight into Blake’s iconography. I am also an admirer of the Rossetti Archive, which was never afraid of experimentation. The DMVI team took part in one of the first Networked Infrastructure for Nineteenth-Century Electronic Scholarship (NINES) summer schools at the University of Virginia, where our eyes were opened to all sorts of possibilities for the digital archive. I am delighted that the iconographic system we constructed on DMVI has played a small part in the development of other digital archives, including Yellow Nineties Online and Illustrating Scott.

The Illustration Archive presented a different set of issues to DMVI, primarily because we had over a million images that needed tagging, so we could not tag them ourselves. For this project, we decided to harness the goodwill of ‘the crowd’. We looked at lots of different crowdsourcing projects, including those on Zooniverse. The challenge was getting the questions we asked the user right, so that the archive could then be searched using the information provided. One of the best crowdsourcing projects in this respect is Your Paintings. I have spent many happy hours tagging these paintings and collecting coloured paintbrushes (go online to see what I mean!).

Digital disciplines

Having worked on several digital projects with computer scientists, I have come to think that the key to success is learning to speak each other’s language, or at least trying to. The best humanities projects are those where the research questions of the humanities take the lead, but where those very questions challenge and push how things are done computationally. For The Illustration Archive, we had to explore different ways of identifying the similarities between images because content-based image retrieval falls short when it comes to a data set of images that are radically diverse: we have black and white and colour images, images from different types of book (literature, history, geography, science, philosophy), from different periods (roughly the eighteenth to the mid-twentieth century), produced by different techniques (etching, engraving, photography, lithography, etc.), and of varying image quality. Computer vision tools, such as those developed for facial recognition, do not work successfully here.
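As an illustration only, one way of comparing images that does not depend on how they look at the pixel level is to compare the words attached to them. The short Python sketch below measures the overlap between two sets of tags (the Jaccard index). It is not a description of the method used on Lost Visions, which the text above does not specify; the function name and the sample tags are hypothetical.

```python
def tag_similarity(tags_a, tags_b):
    """Return the Jaccard index of two tag collections.

    0.0 means the images share no tags; 1.0 means their tag sets are identical.
    """
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        # Two untagged images give no evidence of similarity.
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical tags for an etching and a photograph of a similar scene.
etching_tags = {"horse", "carriage", "street"}
photograph_tags = {"horse", "street", "crowd"}

print(tag_similarity(etching_tags, photograph_tags))
```

Because the comparison works on tags rather than on colour or texture, an etching and a photograph can be scored as similar despite being produced by different techniques, although the result is only as good as the tags themselves.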

There are always different viewpoints between disciplines, but this is all part of the excitement of working on these projects. One issue that arose when we were developing The Illustration Archive is that humanities scholars have very different ways of searching digital archives. We are not necessarily disconcerted by having several hundred search results and we will (more or less) happily wade through them. As a computer scientist, Ian Harvey was astonished that anyone would look past the first page of Google’s results!

The term ‘philology’ is, of course, problematic when it comes to images because it is specifically about the structure, development, and relationships of (written) language. Images are different. This was precisely the tension we came up against when developing DMVI and The Illustration Archive. We need to take account of the difference of the visual (iconographic and stylistic features, how images look), but, by necessity, this difference is negotiated through the medium of language. Tagging is, after all, a textual act.

The Illustration Archive allows us to come to a greater understanding of how illustrations make their meanings, how they signify. This signification is relational: it takes place in the interaction between images and texts, and this interaction can be explored in the archive because we have access to the texts as well as the pictures. Illustration is relational in another sense, too: the pictures refer to other illustrations, whether consciously or unconsciously; they are interpictorial. The Illustration Archive makes visible this interpictoriality in a way that has not previously been possible.

Digital communities and crowdsourcing

The workshops we held for librarians and schoolteachers really helped to shape The Illustration Archive. Claire Horrocks, who was the advisory editor for the Punch Historical Archive, came to one of the workshops and devised a feedback worksheet that she gave to her students. Their comments revealed how important it is to blind test digital archives on potential users before they go live.

Our engagement work on the project has fed into another outreach activity: the REimagine competition (September–December 2015). We are inviting entrants to recreate an illustration from the archive in art, text, craft, or multimedia.3 We want to show how these historic illustrations are relevant today. So please get knitting, sewing, baking, colouring, and writing!

In a fundamental way, crowdsourcing made us confront the very question of what an ‘illustration’ is. We wanted a system that captured as much information as possible about each image, but this information also had to be relevant to those searching the archive for illustrations. A crowdsourced tagging system needs to take account of two types of user: the tagger and the searcher. This is always a juggling act, particularly when it comes to meeting the requirements of experts and specialists as well as the general user. We have a free text ‘additional information’ box where experts can tell us what they know about the pictures.

The tagging question that was most difficult to devise was the seemingly straightforward one in which the user is asked to identify the illustration ‘type’ from a list that includes an advertisement, a portrait, a decoration, a photograph, and so on. This is not a classificatory list; we simply set out to capture information that we would not necessarily get from the other tagging questions. But it does expose the whole notion of how illustrations can be classified and defined. Unlike fine art, illustration has no established list of ‘types’. Even the Art & Architecture Thesaurus seems to struggle when it comes to illustration. There was the issue, for example, of whether the user would necessarily understand or be able to identify what a ‘literary’ illustration is. ‘Literary’ is highly problematic in this context because it is a category that is determined largely by the bibliographic information rather than the iconographic features of the image. Without the title of the book, it is not always easy to distinguish a ‘literary’ illustration from one that appears in, say, a travel book. We would also have liked to include different techniques of reproduction in our list (etching, engraving, lithography), but some of these techniques are notoriously difficult to identify, especially in a scanned image.
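The relationship between the tagger and the searcher can be pictured with a small Python sketch. It assumes a simplified tagging record (a ‘type’ chosen from a fixed list, some free keywords, and the free text ‘additional information’ box) and a basic index that lets a searcher look up images by any of those words. All names, the sample data, and the four-item type list are hypothetical; this is not the archive's actual data model.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical subset of the 'type' list: only these four are named in the text.
ILLUSTRATION_TYPES = {"advertisement", "portrait", "decoration", "photograph"}


@dataclass
class TagSubmission:
    """What one tagger tells us about one image."""
    image_id: str
    illustration_type: str
    keywords: list = field(default_factory=list)
    additional_information: str = ""  # free text for experts; not indexed here

    def __post_init__(self):
        if self.illustration_type not in ILLUSTRATION_TYPES:
            raise ValueError(f"unknown illustration type: {self.illustration_type!r}")


def build_index(submissions):
    """Map each type and keyword (lower-cased) to the set of images carrying it."""
    index = defaultdict(set)
    for sub in submissions:
        index[sub.illustration_type].add(sub.image_id)
        for word in sub.keywords:
            index[word.strip().lower()].add(sub.image_id)
    return index


def search(index, term):
    """Return the sorted image ids tagged with the given term."""
    return sorted(index.get(term.strip().lower(), set()))


# Hypothetical submissions from two taggers.
submissions = [
    TagSubmission("img-001", "portrait", ["Woman", "hat"]),
    TagSubmission("img-002", "advertisement", ["hat", "shop"],
                  additional_information="Possibly reused from a periodical."),
]
index = build_index(submissions)

print(search(index, "hat"))
print(search(index, "portrait"))
```

Keeping the expert's free text separate from the indexed keywords is one possible design choice: it records specialist knowledge without requiring the general tagger or searcher to use specialist vocabulary.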

Transformations: books, material texts, digital objects

That’s a big question. In some ways, The Illustration Archive replicates leafing through a book for illustrations in its browsing/random image view. Digital archives have developed wonderful software precisely for replicating the materiality of the book. Yellow Nineties Online, for example, has a FlipBook feature that enables the user to turn the pages of the book, including the tissue interleaves. This is a terrific resource in the case of books that are difficult to access, rare, or fragile, and The Yellow Book is all of these.

In some cases, though, attempts to reproduce the format of the book in a digital environment can be problematic. Such interfaces emphasize the materiality of the book, but they erase the specificity of the digital. I want to be able to see where and how the illustration is situated on the page and the text that surrounds it, but I also want the illustrations to be free, albeit momentarily, from the confines of the book: leafing through a book allows the reader to see only one or two illustrations at a time; a digital display allows the user to see many simultaneously and to trace the connections between them.


The digital helps us to understand illustration in new ways: it makes accessible images that have otherwise been forgotten; it enables us to look for commonalities and differences between illustrations and to embed them in the values of their historical moment. But this is not a one-way process: illustration studies also has implications for the digital. After all, digital archives are themselves constituted by words and images and the interaction between them.

Hide and seek

In some ways, I think there is a risk of losing the original dynamic between word and image that defines illustration. We are fortunate to have the text as well as the images in The Illustration Archive, so the user can view the full page and even the whole book and see the images in situ. In another sense, however, the original context of illustration is always already lost. In DMVI, we included the illustrations divorced from the text because the collection of periodical illustrations that we were using (from the School of Art Museum and Gallery in Aberystwyth) had been cut out and pasted onto card by a Victorian collector. The very fact that these images had been cut out suggests the extent to which illustrations in this period were mobile rather than fixed to a specific text: woodblocks were commonly sold on to other engraving and publishing companies. A picture that appears in one book can turn up again in an entirely different context. There are some lovely examples of these reused illustrations in The Illustration Archive.

Archive poetics

I am drawn to this idea of The Illustration Archive as a poetics of illustration. In a sense, it is a distinctly visual poetics, where the difference of illustration is displayed and where the interpictorial allusions between images can be traced. But it is also bimedial. The digital archive is a space where the relation between word and image that defines illustration is remediated in a new dynamics between the picture and the other texts that surround it: tags, descriptions, captions, and iconographic and bibliographic metadata.