
Digitizing our Collection

Digitizing the documents was a large undertaking that, in many ways, involved many more steps and much more work than our group first expected.  The instructions we were given for each step of the process were very helpful, but each of the three group members still had to make decisions independently along the way.  From planning the file names to photographing the documents and retouching those images in Photoshop, we learned a great deal about the enjoyable and positive aspects of digitization as well as the monotony, the physical stress and effort, and especially the problems that arise from it.  Even though there are many good reasons to preserve these documents through digitization, handling the documents first-hand made us realize the significance of the emphasis that theorists such as Benjamin and Derrida place on the loss and destruction that accompany the creation of an archive: in the process of digitization, numerous aspects of the document and its history disappear.

The Digitizing Process

The first major step of the overall image-capturing project was to go to Special Collections in the MSU Library and photograph the original documents.  Before we could do this, however, both the entire class and the image capturing team had to decide what standards we would follow.  For example, as a class we experimented with the camera equipment, taking test pictures to better inform our decisions concerning image size, aspect ratio, f-stop, ISO, and shutter speed.  Following standard archiving procedures, the class decided that we would shoot in RAW format and aim for the highest quality and the largest file sizes possible, but that the final decisions would fall to the image capturing team: Miranda, Ryan, Matt, and Kylene.  The specifications we settled on were the RAW file type, an ISO of 80, an aspect ratio of 3:2, a shutter speed of 0”, and an f-stop of f/10.

Once the class had addressed some of the larger overarching questions, the image capturing team had to coordinate a meeting time to go to Special Collections for an introduction from the librarians on how to handle the documents, how to set up the image capturing station, and how to work the camera equipment.  During this introduction, the librarians explained the fragility of the documents and showed us the correct way to remove the documents from their manila envelopes and Mylar cases, how to carefully turn the pages and situate the foam positioning blocks, and how to gently re-insert the documents into their Mylar cases without damaging any of the edges (See Figure Below).  Two other very important procedures that we learned during this introduction were how to set the white balance on the camera and how to handle and display the correct sides of the ColorChecker, which was used for color correcting the photos later in the process.  Our image capturing setup consisted of a table where foam blocks were used to prop up the documents, a stationary tabletop tripod that allowed the camera to be positioned directly above the documents, a Nikon camera, a ColorChecker block, lead-filled shoelaces used to carefully hold the document pages in place, and two standing lights situated on either side of the entire setup to provide ample light (See Figures Below).  For the shorter members of the team, the library also provided a step stool so that we could see the camera display while taking the photos.  Two boxes holding the (as we would find out later, roughly) one hundred documents were also placed on the table, and we used a red bookmarking tab to indicate which documents had been photographed.

Figure 2.2

One of the first procedural impasses we ran into during the first image capturing session was that our original agreed-upon camera settings needed to be adjusted in order to produce the best photos.  For this reason, we decided to change our ISO to 100 and our f-stop to f/8, which gave us brighter and clearer images overall.  Another issue, which Kylene encountered during a session, was a hardware malfunction: one of the bulbs in the lights burned out and had to be replaced by a librarian.  Once this occurred, Kylene had to reset the white balance on the camera and wait several minutes for the new light bulb to warm up and reach the same brightness as the other one.  One decision that Kylene made during her session was to flip the document around so that it was upside down from the photographer's point of view.  By doing this, we were able to cut out one step in the editing process, since we no longer had to flip all of the images so that they were correctly oriented.  While photographing several documents, however, we realized that although we had flipped the document itself, we were forgetting to flip the call number tag as well.  These photos had to be discarded, and we were forced to start over to ensure that both the document and the tag were flipped.  Another issue during the digitization phase was that we did not all stick to the number of documents we were supposed to photograph.  For example, although we were each supposed to photograph around 30 documents, Ryan did several more, leaving another team member only a handful of documents.  Together we then worked on color correction and on converting the ARW files to TIF and JPG files.

The workflow for the actual image capturing consisted of removing a manila folder with the document from the larger file box, carefully removing the protective Mylar and then the document from the folder, and then laying the document in the middle of the foam platform.  We also removed the call number tag included in the folder and made sure to include this tag in each photograph so that the document could be easily identified.  The call number tag also served as an additional safeguard in case any documents got mixed up during the process.  Next, we made sure that the ColorChecker was in the bottom right corner of the foam.  Once both of the lights were turned on and the white balance was set on the camera for the entire session, we secured the camera to the tabletop tripod above the document and photographed each page of the document, making sure to focus the image by pressing the shutter button halfway down before taking the photo.  After photographing an entire document, we recorded our initials next to the document in the Excel spreadsheet and also kept a running list of the call numbers in a separate place so that we would know, during the file naming process, what order the images had been taken in.  One individual decision that the image capturing team faced early on came when we found folders that held duplicate copies of the same document (See Figure Below).  Rather than digitizing all of the duplicate pamphlets, we decided to choose the pamphlet that was in the best condition and easiest to read.  After all of the documents had been photographed, we used the USB cable for the camera to download the images onto our individual computers in order to rename the individual image files.

 

What is Lost in Digitization

Through the process of digitizing the documents, both the class and the digitization team have asked themselves fundamental questions about what is lost when material documents are turned into digital images and about how the original events are transformed through the act of, as Benjamin calls it, technological reproduction.  For instance, the documents within our collection were originally used as a means of reproducing news about violent crimes and executions for a broader public.  Through their retelling, reprinting, and mass distribution, these pamphlets served as a public spectacle and display of political and judicial power, especially as they included poems, songs, and images and were cheaply made and sold for profit.  Along this line of reasoning, the reproduction of these events in the form of informational and sometimes even moral pamphlets also points to the commercial nature of the production and circulation of these documents.  As soon as the crimes and executions were reproduced in any fashion, they began losing their claim to authorship and authenticity; thus, the digitization process is only one more contributor to the loss of the aura of the crime or execution.

Throughout this process, the image capturing team had the privilege of engaging physically with the documents, and we learned quickly that one of the greatest losses during this process was in the aura of the original documents.  By considering the approaches of Derrida, Foucault, and Benjamin to reproduction and authenticity, we were able to face many of our digitization decisions and processes from a more thoughtful and educated perspective.  Although there are obvious advantages to digitizing the pamphlets (such as opening access to a wider viewing public), by displaying them as images on a website the artifacts are further removed from their original spatio-temporal setting and context and continue to lose what Benjamin refers to as “aura.”  Although Special Collections at the Library is far removed from these documents’ original context, there is a reverence for and interaction with the artifacts that is lost through digitization: the materiality and fragility of the documents are lost.  Throughout the image capturing process, the digitization team experienced the fragility of the documents and had to engage carefully and respectfully with the pamphlets in order to preserve their material integrity.  Certain pamphlets required greater care than others because they were already deteriorating (See Figures Below), and some were smaller or larger and more delicately bound (See Figure Below), yet many of these things are obscured if not altogether lost through digitization.  Furthermore, working with the documents revealed how the linen rag paper was made from worn out linens that people threw out and recycled to make cheap pulp paper.  Holding a special light up to the paper made visible not only the pulpiness of the paper, but also the horizontal and vertical lines where the drying screens made slight impressions during the papermaking process (See Figure Below).

Although many of these things could be recovered through careful documentation and presentation, there is no substitute for handling the physical artifacts and experiencing them in the way that the image capturing team was able to.  Therefore, our team acknowledges that although the physical characteristics of the documents are greatly minimized through digitization, there is an important trade-off in the increased access to and distribution of the documents.

Naming Image Files

The file naming process was not very difficult.  The format we chose for the file names (the document’s call number, the page number, and the date the photo was taken) was simple.  However, after working with the other group members’ files, we realized that we had each made independent decisions on how to name the files even within the confines of the specified format.  For example, the call numbers of the documents occasionally included spaces.  While some team members omitted the spaces, others kept them.  Also, the Excel sheet added some information to the call number, such as “XXfolio.”  Hence, we had to decide whether to keep that extra information in the file name, and each group member independently decided to keep it, because it was easier to copy and paste the name from the Excel document.  Another difference between our file names was capital versus lowercase letters: while Ryan used capital letters, Miranda and Kylene kept the lowercase file names from the Excel sheet.  Through these individual decisions, our team came to better understand how even a seemingly easy process can entail numerous unplanned, complicated, and crucial moments of choice.
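The naming differences described above (spaces kept or dropped, capital or lowercase letters) are the kind of thing a single shared function can settle in advance.  The following Python sketch is only an illustration and not a script our team used.  The underscore separators, the zero-padded page number, the YYYYMMDD date layout, the function name, and the sample call number in the usage note are all assumptions made for the example; only the three ingredients (call number, page number, date taken) come from our actual format.

```python
import re
from datetime import date


def build_filename(call_number: str, page: int, taken: date, ext: str = "tif") -> str:
    """Build one normalized file name (assumed layout: callnumber_pNNN_YYYYMMDD.ext)."""
    # Remove every whitespace character so that names typed with and
    # without spaces in the call number come out identical.
    compact = re.sub(r"\s+", "", call_number)
    # Settle on one letter case for everyone; lowercase is an arbitrary pick here.
    compact = compact.lower()
    # Accept "TIF", ".tif", or "tif" and always emit a lowercase extension.
    clean_ext = ext.lower().lstrip(".")
    return f"{compact}_p{page:03d}_{taken:%Y%m%d}.{clean_ext}"
```

With a made-up call number and date, `build_filename("XX Folio 12", 3, date(2014, 3, 5), "TIF")` and `build_filename("xxfolio12", 3, date(2014, 3, 5))` both give `xxfolio12_p003_20140305.tif`, so two people typing the call number differently would still end up with the same file name.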

Color Correcting Process

In contrast, the color correcting process presented us with a handful of unexpected decisions right off the bat.  Color correction was done in Photoshop, a tool some team members had not previously encountered.  Particularly tricky was that our directions did not exactly fit the version of Photoshop that the library had to offer, so we had to find our own way around the various tools.  One major problem concerned the rotate tool, which we were unable to find.  We discovered that the straighten tool was easier to locate in the program and easier to use for rotating the image, so we decided to use it instead.  The eyedropper tool was a little confusing to use for setting the correct white balance in the photo because it moved with every shift of the mouse, making it impossible to use anything but the tab option to actually change the white balance.  After learning how to use these tools, however, the process went quickly, especially as we learned our way around Photoshop.

Like the naming process, color correction also entailed a number of small decisions that we had to reach independently.  The guide we developed from the U.S. NARA guidelines for digitization specified that the red, green, and blue values should be around 200 for the correct white balance and color to show (Puglia et al.).  However, we frequently could not obtain a value of exactly 200 for each color.  We had to make decisions such as whether values like 200, 200, and 202 were better or worse than values like 199, 200, and 201.  Since the directions seemed to prefer as many identical values as possible, Miranda, for instance, chose a color balance with as many matching numbers as possible, even if those numbers were not exactly 200, such as 199, 199, 201.
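The question of whether one near-200 reading is "better" than another can be made explicit by writing the rule down.  The sketch below is one possible rule, not the NARA guideline and not what our team actually applied: it prefers the reading whose three channels are closest together, then the one whose worst channel is closest to 200, then the one with the smallest total distance from 200.  The function names and the ordering of these three criteria are assumptions chosen for illustration.

```python
from typing import NamedTuple

TARGET = 200  # per-channel value the team's guide aimed for


class PatchScore(NamedTuple):
    spread: int     # max channel minus min channel; 0 would be perfectly neutral
    max_off: int    # largest single-channel distance from the target
    total_off: int  # summed distance from the target over all three channels


def score_patch(rgb, target=TARGET):
    """Score one (R, G, B) reading; lower values are better on every field."""
    offsets = [abs(channel - target) for channel in rgb]
    return PatchScore(max(rgb) - min(rgb), max(offsets), sum(offsets))


def rank_readings(readings, target=TARGET):
    """Sort readings best-first by (spread, max_off, total_off)."""
    return sorted(readings, key=lambda rgb: score_patch(rgb, target))
```

Under this particular rule, the three readings mentioned above all have the same spread of 2, so the tie is broken by distance from 200: `rank_readings([(200, 200, 202), (199, 199, 201), (199, 200, 201)])` puts (199, 200, 201) first, then (199, 199, 201), then (200, 200, 202).  A team that valued matching channels above all else would need a different rule, which is exactly why agreeing on one beforehand matters.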

Another unexpected decision that we had to make resulted from using the straighten tool.  First, we had to decide whether to straighten the image based on the bottom or the top of the document.  Another choice was which lines to follow at the top of the document.  Saving the documents as TIF and JPG files had its own issues as well.  We had to copy the original name of each file before changing the folder that the document would go in, then choose the correct TIF or JPG version of the file, and finally replace the new version’s name with the old one by pasting it in.  We also realized that, although renaming was a simple task, doing it after conversion led to a few human errors in which TIF files were accidentally mixed into the JPG folder and vice versa.  These unexpected problems and decisions made the digitization process much longer than the group had originally allotted for the task.  Thankfully, it did not cut into the amount of time we could work on other aspects of the project.
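Mix-ups like TIF files landing in the JPG folder are easy to catch with a short check run after each conversion session.  The following Python sketch is an illustration only and was not part of our workflow; the function name and the folder paths in the usage note are hypothetical.

```python
from pathlib import Path


def find_misfiled(folder, allowed_exts):
    """Return the sorted names of files in `folder` whose extension is not allowed."""
    # Normalize the allowed extensions so "JPG", ".jpg", and "jpg" are all equivalent.
    allowed = {ext.lower().lstrip(".") for ext in allowed_exts}
    return sorted(
        path.name
        for path in Path(folder).iterdir()
        # Only look at regular files, and compare extensions case-insensitively.
        if path.is_file() and path.suffix.lower().lstrip(".") not in allowed
    )
```

For example, `find_misfiled("images/jpg", {"jpg", "jpeg"})` would list any TIF files that had strayed into a JPG folder, and `find_misfiled("images/tif", {"tif", "tiff"})` would do the reverse.  An empty list means the folder contains only the expected file type.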

Conclusion

Throughout the image capturing process, we learned that many difficult decisions have to be made in order to produce a thoughtful and engaging digital project.  We also quickly learned that human and technological errors can compound to lengthen and complicate the process in ways we never anticipated.  Through the image capturing process we learned the importance of standardizing our approaches as well as double-checking not only our own work, but also the work of the others involved.  We also realized that in a project like ours, difficult decisions concerning aura, loss, and retention have to be made and that certain trade-offs occur, which must be accounted for.  Furthermore, in the image correcting process we learned how to confront the challenges of new and sometimes difficult technology as well as how to deal with technological errors, glitches, and potential data loss.  Ultimately, the process of digitization revealed that digital humanities projects require an immense amount of patience, brainstorming, and communication, along with a willingness to acknowledge and wrestle with difficult questions and concerns, in order to produce a meaningful final product.

Download Our Data

Name | Size | Type | Download
RAW Images | 8.3 GB | ARW | Download RAW
TIF Images | 13.3 GB | TIF | Download TIF
JPG Images | 3.7 GB | JPG | Download JPG
JPG Cropped Images | 2.7 GB | JPG | Download JPG (small)