Title Essays, Linked Data, and the Ethnic Press in Chronicling America
The 16 million digitized newspaper pages in Chronicling America (ChronAm) contain an almost inconceivable amount of information on virtually every aspect of the country’s history published from 1690 to 1963. Among the many highlights are the official tally of George Washington’s election as the first president of the United States, the reprinting of the Emancipation Proclamation, the establishment of the eight-hour workday, celebrations of immigrant arrivals, and Sojourner Truth’s “Ain't I A Woman?” speech. One could also look for curious or silly events like the 1949 reports of flying saucers or the dumbfounding case of a man killed nowhere.
The list of interesting articles and topics seems endless, especially considering that the number of pages in this searchable database continues to grow as more institutions contribute from their collections. However, throughout my work on ethnic newspapers as the summer intern with the Division of Preservation and Access of the NEH, I have also come to understand how important it is to think of the smaller elements that fuel and enhance ChronAm, like the metadata and title essays that accompany each newspaper title.
By this I do not mean that the number of pages or titles is small: there are over 16 million pages and 3,185 titles. Nor is the program small. The National Digital Newspaper Program (NDNP), a partnership between the National Endowment for the Humanities (NEH) and the Library of Congress (LC), brings together institutions from 48 states and two territories. Even the effort it takes for NEH grant recipients to select, inspect, and digitize the required 100,000 pages of historical newspapers for ChronAm is an undertaking that requires experts from different fields and institutions working collaboratively over the two-year award period of NDNP. Along with the images and data of the newspapers, recipients also send a microfilmed copy of the newspapers, their title selection list, a survey of other local collections of newspapers, and 500-word essays for each title.
In collaboration with the NDNP program coordinator, the librarians and specialists are tasked with drafting these title essays in which they summarize the history, scope, and general content of the newspaper as concisely as possible. This is challenging for all of the titles since newspapers of record like the Alexandria Gazette have thousands of issues, while local or ethnic newspapers like the wonderfully titled Golden Rule may have few extant issues. A good title essay provides a snapshot that invites the reader to dip into thousands of pages or hints at what is missing from scattered numbers. This is why, out of all of the deliverables, the title essays were, for me, the clearest indication of how “smaller” parts of ChronAm reveal vital information about the more visible elements of this digital resource.
My main project for my NEH Internship was to create structured data about the editors and the publishers described in the title essays of ethnic publications. Specifically, I matched people’s names to their Library of Congress Name Authority File (LCNAF) records when available and added each record’s uniform resource identifier (URI). These name authority files are the most centralized data point that we have, as they at once link, disambiguate, and standardize the names of people, organizations, places, events, and titles. These URIs are basic building blocks in both digital bibliographical and prosopographical records as they exist now and as they increasingly come to exist as linked open data. Long marginalized and excluded from history, many editors who were women and people of color still do not have LCNAF records linking their names to their contributions. My task has therefore been to provide primary source evidence for each editor: a stable web address to the image of a ChronAm page, in the associated newspaper title, that identifies the person’s role. Finally, I recorded the ethnic group to which these people belonged following the categories used in ChronAm, information from the title essay, and content from the newspapers.
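To make the shape of this structured data concrete, here is a minimal sketch in Python of what one row of such a dataset might look like. It is an illustration only, not the format used in the project: the class name, field names, people, titles, and identifiers below are all made-up placeholders, and the URLs simply imitate the general look of LC name authority and ChronAm page addresses.

```python
from dataclasses import dataclass, asdict
from typing import Optional
import json


@dataclass
class EditorRecord:
    """One person named in a title essay (hypothetical schema)."""
    name: str                 # name as standardized for matching
    role: str                 # e.g. "editor" or "publisher"
    newspaper_title: str      # the associated ChronAm title
    ethnic_category: str      # category recorded for the person
    evidence_url: str         # stable link to a page image showing the role
    lcnaf_uri: Optional[str] = None  # None when no authority record exists

    def has_authority(self) -> bool:
        """True when the person was matched to a name authority record."""
        return self.lcnaf_uri is not None


# Placeholder data: none of these people, titles, or identifiers are real.
records = [
    EditorRecord(
        name="Example, Jane",
        role="editor",
        newspaper_title="Example Gazette",
        ethnic_category="Latin American",
        evidence_url="https://chroniclingamerica.loc.gov/lccn/sn00000000/1900-01-01/ed-1/seq-1/",
        lcnaf_uri="http://id.loc.gov/authorities/names/n00000000",
    ),
    EditorRecord(
        name="Placeholder, John",
        role="publisher",
        newspaper_title="Sample Herald",
        ethnic_category="African American",
        evidence_url="https://chroniclingamerica.loc.gov/lccn/sn00000001/1901-01-01/ed-1/seq-1/",
    ),
]

# People without an authority record are the candidates for new LCNAF entries.
missing = [r.name for r in records if not r.has_authority()]

if __name__ == "__main__":
    print(json.dumps([asdict(r) for r in records], indent=2))
    print("No authority record yet:", missing)
```

Keeping the authority URI optional reflects the situation described above: the evidence link and role can be recorded even when a person has no LCNAF record yet.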
From my own PhD research, I know how complex and imprecise the identification of an ethnic newspaper can be. Of the 19th-century New York-based newspapers that I research, for example, one was published in Spanish by Cuban and Puerto Rican editors, while the other was edited by a Brazilian publisher in Portuguese. Written by foreigners and addressed mainly to immigrant communities, these newspapers could have easily been grouped as ethnic, both at that time and now using more recent definitions of the term. If these newspapers were in ChronAm, which ethnic category would be used to describe them? Given their ties to Cuba, Puerto Rico, and Brazil, they could be considered “Latin American.” But the Spanish-language paper might also have been placed under the “Spanish” category, since Cuba and Puerto Rico were colonies of Spain during the run of the newspaper, while the paper in Portuguese would have no such linguistic categorical home. The Puerto Rican newspapers that do exist in ChronAm have no “ethnic” designation, further complicating the problem of categorization.
Take another example from a newspaper that is in ChronAm: the Branding Iron was printed in English by a white settler in lands of the Choctaw people in Oklahoma, called “Indian Territory” at the time, and folded after four issues due to complaints from Choctaw leaders. It is described with the ethnic category “Indians of North America.” A converse example, the Free Press of Charleston, South Carolina, was edited by an Irish immigrant but mainly dealt with events affecting African Americans and is labeled as an “African American” publication. Another case is El Pelayo from New Orleans, Louisiana, which is categorized as a “Latin American” newspaper but was published to advocate for the Spanish population of the city and was edited by an immigrant from Spain.
The Branding Iron (1884)
The Free Press (1868-186?)
El Pelayo (1851-1852)
Title essays can draw attention to the complex ethnic and racial ecosystems in which newspapers are produced, read, and cataloged, but again, as prose, the “data” the title essays produce is unstructured. In other words, it helps us to comprehend these ecosystems, but it is not actionable. The title essays do not allow us to make connections and see networks in the way that structured data can. When the name authority files of the identified editors are standardized and added to the bibliographic records of their associated newspapers, editor and publication will be linked and searchable by people and computer software alike. Furthermore, by enhancing the LCNAF to include editors who have, in many instances, been elided from the historic record, we hope to establish these editors and their important contributions to print cultures in a permanent way.
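As a rough illustration of what “actionable” could mean here, the sketch below shows how, once editors are identified by URI rather than by a name in prose, a few lines of Python can surface connections between people and titles. The rows, titles, and identifiers are invented placeholders, and the helper function is hypothetical; this is one possible approach, not a description of how ChronAm or the LCNAF process the data.

```python
from collections import defaultdict

# Each row pairs an editor's authority URI with a newspaper title.
# All URIs and titles are placeholders, not real records.
rows = [
    ("http://id.loc.gov/authorities/names/n00000001", "Example Gazette"),
    ("http://id.loc.gov/authorities/names/n00000001", "Sample Herald"),
    ("http://id.loc.gov/authorities/names/n00000002", "Sample Herald"),
    ("http://id.loc.gov/authorities/names/n00000003", "Placeholder Times"),
]

# Build lookups in both directions: editor to titles, title to editors.
titles_by_editor = defaultdict(set)
editors_by_title = defaultdict(set)
for uri, title in rows:
    titles_by_editor[uri].add(title)
    editors_by_title[title].add(uri)


def linked_editors(uri):
    """Return the other editors who share at least one title with `uri`."""
    linked = set()
    for title in titles_by_editor.get(uri, set()):
        linked |= editors_by_title[title]
    linked.discard(uri)  # an editor is not linked to themselves
    return linked


if __name__ == "__main__":
    first = "http://id.loc.gov/authorities/names/n00000001"
    print(sorted(titles_by_editor[first]))
    print(sorted(linked_editors(first)))
```

Because the URI disambiguates the person, two spellings of the same name in different title essays would resolve to one node in such a network, which prose alone cannot guarantee.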