Rachel Robinson is a spring 2016 graduate of Southwestern University. Here, she guest-blogs about her experience as a student worker for the Department of Research and Digital Scholarship in the Smith Library Center at Southwestern University.
As a student worker in the Department of Research and Digital Scholarship in the Smith Library Center at Southwestern University this past year, I have spent a lot of time learning new ways to engage in academic work in digital formats. Most of my responsibilities during the academic year dealt with showcasing digitized archival items from Special Collections on a new website. Over the summer, I was tasked with two projects that required different skills and served different purposes from my previous work. First, taking up the bulk of the summer, I converted PDFs of early 20th-century articles from the Anti-Slavery Reporter into searchable plain text files using Optical Character Recognition (OCR) software. Second, I created digital copies of typed transcriptions of correspondence to Leonard Woolf, Virginia Woolf’s husband, regarding his work in anti-slavery activism, also from the early 20th century. Both of these projects resulted in the creation of digital humanities datasets, which can be used in many different ways for further scholarship.

Cover of the October 1909 issue of the Anti-Slavery Reporter
The process for both projects was rewarding but more time-consuming than I had expected. Running OCR, especially, ended up stretching out far longer than I had originally planned due to the level of detailed attention that the process required. While the software converts the PDF to plain text, it is a highly imperfect conversion that gives wildly differing results. Some plain text files were mostly correct, requiring only a few minor edits, while others were almost completely illegible, all the text being symbols and signs rather than words. The longest documents, between fifteen and twenty pages, tended to alternate between correct pages and illegible ones. Often, the shortest documents (one or two pages) would also be illegible and require a complete transcription. Documents that were between three and six pages tended to be the easiest to clean up. Regardless of the quality of the conversion, I had to read through every single plain text file and compare it to its corresponding PDF in order to check for errors. Small mistakes could easily slip through; for example, “be” would often convert as “he” (and vice versa). While this step significantly increased the time spent on this project, it also meant that I became very familiar with the subject matter of these PDFs. The same thing happened with the letter transcription.
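A review pass like the one described above could be partly triaged with a short script. The following is a minimal sketch in Python, not the workflow actually used for this project. It assumes the OCR output is plain text with pages separated by form-feed characters; the 0.6 threshold and all function names are hypothetical choices made for illustration. It flags pages that are mostly symbols, which are candidates for complete transcription, and lists every “be” and “he” so each can be checked against the scan.

```python
import re

# Hypothetical cutoff: pages where fewer than 60% of tokens look like
# ordinary words are flagged as probably illegible.
WORD_RATIO_THRESHOLD = 0.6

# Words the OCR software tended to swap, per the description above.
CONFUSABLE = {"be", "he"}

# A "word-like" token: letters (with optional internal apostrophe or hyphen),
# optionally wrapped in common punctuation.
WORD_RE = re.compile(r"[\"“(]*[A-Za-z]+(?:['’-][A-Za-z]+)*[.,;:!?\"”)]*")


def word_ratio(page: str) -> float:
    """Return the fraction of whitespace-separated tokens that look like words."""
    tokens = page.split()
    if not tokens:
        return 0.0
    wordlike = sum(1 for token in tokens if WORD_RE.fullmatch(token))
    return wordlike / len(tokens)


def pages_needing_transcription(text: str, threshold: float = WORD_RATIO_THRESHOLD) -> list[int]:
    """Return 1-based page numbers whose word ratio falls below the threshold.

    Assumes pages are separated by form feeds, which may not match
    the output of every OCR tool.
    """
    pages = text.split("\f")
    return [n for n, page in enumerate(pages, start=1) if word_ratio(page) < threshold]


def confusable_words(page: str) -> list[tuple[int, str]]:
    """Return (line number, word) pairs for every 'be' or 'he' on the page."""
    hits = []
    for n, line in enumerate(page.splitlines(), start=1):
        for word in re.findall(r"[A-Za-z]+", line):
            if word.lower() in CONFUSABLE:
                hits.append((n, word))
    return hits


if __name__ == "__main__":
    sample = "The Society met in London.\f#@ ~~ %% ^^ 1l| ]]\fHe will be there."
    print(pages_needing_transcription(sample))
    print(confusable_words("He will be there.\nIt may he so."))
```

A script like this only narrows where to look. It cannot tell whether a given “he” should have been “be”, so every flagged word still has to be compared with the PDF by eye.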
Both of these groups of documents revealed to me the kind of discourse being used by British anti-slavery activists in the early 20th century. Although a great deal of outside research would be required for me to better understand the context of the Anti-Slavery Society, and the extent of Leonard Woolf’s engagement with anti-slavery politics, I can make preliminary observations based on the considerable amount of text I have read and edited. The strength of conviction that the men of the Society had in their mission is evident in the amount of time, effort, and political maneuvering described in the Reporter. Topics were revisited in each issue of the journal, most notably the “Congo Question”; this allowed me to trace the political developments in the Congo, and to see how the Anti-Slavery Society reacted to them, from 1906 to 1912. It was disheartening to notice, again and again, that although the Society was vehemently against the enslavement of indigenous peoples in European colonies, it did not question the presence of European nations in Africa and other colonized areas, nor did it question the perceived superiority of white European society. The Woolf correspondence, while not directly related to the Society, gave insight into the personal, everyday lives of these anti-slavery activists in Great Britain: whom they disliked (e.g., Sir Arthur Steel-Maitland), where they lived, where they ate meals, etc. The letters often included newspaper clippings, revealing a vital method by which these men exchanged information related to their purpose. These letters remind me that communication was much slower and more deliberate a century ago.

Scanned image of an ASRAF article before Optical Character Recognition (OCR)

Text file of the article post-OCR
My work over the summer has helped me tremendously in my post-graduation job search. I am currently working as an intern at a fine art gallery in Houston, TX. My project is to create the database for the entire body of work of a deceased artist, whose estate has come into the hands of the gallery. The process for this includes transcribing the entirety of the hard-copy inventories from the estate’s owner, conducting research on the artist’s life (especially in digital archives), and scanning slides and editing images. In my interview, my work in the RADS department was of particular interest to my employers, one of whom noted that I “seem[ed] to like systems.” The focus of my student worker position on digital archives demonstrated to my employers not only that I would be able to handle the detail-oriented work of this digitization project, but also that I would truly understand its value. With the creation of this database, this artist’s work can be viewed, exhibited, and sold internationally. The vast increase in accessibility to knowledge that has been facilitated by digitization initiatives is something I am proud to have been a part of at Southwestern University. I am excited for all of the scholarship, education, and understanding that will arise as more and more students, interns, and employees engage in digital projects.
-Rachel Robinson