Digital Documerica broadens knowledge of and engagement with the Documerica project through a set of tools and resources for exploring the photographic collection. The digital humanities project was created by Taylor Arnold and Lauren Tilton of the Distant Viewing Lab at the University of Richmond. We provide several search features, including a map, exploration by topic clusters, and open search. The authoritative archive for Documerica is the National Archives and Records Administration (NARA), which holds the physical prints, extensive archival material, and copyright information.
If you engage with Digital Documerica for your work, please cite as:
T. Arnold and L. Tilton (2025) "Digital Documerica: Exploring Environmental Photography from the 1970s." URL: https://digitaldocumerica.org
Americans' concerns for the environment grew significantly following World War II. Post-war affluence, marked by forests cleared for segregated suburbs, land contaminated by nuclear testing, waterways polluted by chemical plants, and litter left by pedestrians, raised serious questions. These concerns fueled a series of best sellers, chart toppers, and box office hits that circulated environmental anxieties in print, over the airwaves, and on screens. Rachel Carson’s best-seller Silent Spring (1962) vividly depicted the detrimental effects of pesticides on the land, while CBS News broadcast coverage of the first Earth Day on April 22, 1970, a significant event in which 20 million people went outside to hike, sing, and protest (CBS News: Earth Day 1970 Flashback). Along with the civil rights and anti-war movements, the environmental movement proved impossible for the nation’s leaders to ignore.
President Nixon supported a cascade of laws in 1970, including the establishment of a new federal agency, the Environmental Protection Agency (EPA). The legislation charged the EPA with setting the nation’s environmental standards, enforcing regulations, and researching new rules to improve the country's health. Building on legislation passed by the Johnson Administration, the Clean Air Act of 1970 and the Clean Water Act of 1972 set the agenda for the EPA, alongside new regulations on pesticides, radiation, and waste. How to enforce the laws and track progress and setbacks became key questions.
In 1972, with the administration’s support, Gifford Hampshire launched DOCUMERICA, a photography unit charged with documenting the current state of the nation’s environment and, subsequently, the improvements and challenges of implementing the new laws. The photo editor turned government relations expert drew inspiration from the Farm Security Administration (FSA) photographers of the Great Depression and World War II, whose aims included developing a portrait of the nation amid significant turbulence. Their ranks included Walker Evans, Dorothea Lange, and Gordon Parks (see: Photogrammar.org). Arthur Rothstein and fellow FSA photographer John Vachon lent their support and insights to the new unit modeled on their achievements. Photographers were charged with documenting the environmental state of the nation, producing a kind of visual evidence to augment the data, statistics, and research reports, and with supporting the enforcement of regulations. Deeply concerned about the future, they also set out to demonstrate the power of photography for environmental advocacy.
Hampshire hired over 100 photographers to develop the visual baseline using Kodachrome and Ektachrome, color slide films from Kodak. He hired experienced photojournalists like Jack Corn, Danny Lyon, and Charles O’Rear. Guided by the EPA’s legal mandate, they set out to document pollution, waste, and contamination. This was not always easy, since the culprit, such as a toxic gas released into the air, may not be visible. Phase 1, from 1971 to 1973, focused on establishing the initial baseline. Phase 2 ran until 1977 with two priorities: rephotographing any significant changes documented in Phase 1 and expanding the baseline to cover the latest policy developments around land use, energy, and lifestyles that adversely impacted the environment. Subjects included crowded highways, contaminated water, and accumulated garbage. Inspired by biologist Barry Commoner’s four laws of ecology, Documerica and the EPA adopted the unofficial slogan "everything is connected to everything else." Photography had the potential to turn those words into powerful images.
The EPA produced nearly 16,000 photographs. The images circulated in government reports, newspapers, and exhibitions. Yet their reach quickly diminished when Documerica was shuttered in 1977 due to shifting priorities. While the EPA’s workforce and budget continued to expand, the amount of work exceeded these significant investments, and priorities moved to other forms of documentation and enforcement. Any hopes of reviving Documerica receded with the 1980 election of Ronald Reagan, whose administration sought to reduce the size of the federal government. In 1981, the National Archives in Bethesda acquired the collection and later digitized the slides. The hopes of Documerica are still to be realized.
As you explore Documerica, here are some questions to consider:
Which photographs catch your attention and why?
Which photographer's (or photographers') work stands out to you, and why?
Pick a cluster. What can we learn about the topic?
How do you feel when you look at these photographs?
What have you learned about the environment in 1970s America by looking at the pictures and reading the captions?
What additional information would you like to know, and where might you go to find it?
The search bar on Digital Documerica allows for querying both structured archival metadata and automatically created tags generated with AI models. The search queries the archival title, location, photographer, and automatically generated image captions. By default, we only return results that include all of the terms in the query. The words do not need to appear in the same order, and the algorithm ignores capitalization. For example, if we type the word rain, the search will return images that have "rain" in their caption as well as any photographs taken by Belinda Rain. By default, we also return results for words that start with your query term, so the example query rain will match captions that contain the word raining or raindrop, among others.
To force the search to only return words in the given order or in a specific form, the search term(s) can be enclosed in quotes. If we search for the string "blue bird", the results will only include records where blue and bird appear together in that order. Similarly, "bird " (with a space before the final quote) will only find examples where bird is followed by a space.
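As a rough sketch of the filtering behavior just described, the short Python function below checks whether a record's combined text matches a query using case-insensitive prefix matching for unquoted terms and exact substring matching for quoted phrases. It is an illustrative approximation only; the function name and the way the record fields are combined are assumptions, not the site's actual implementation.

import re

def matches(query, text):
    """Illustrative filter: every unquoted term must be a case-insensitive
    prefix of some word in the text; quoted phrases must appear verbatim
    (including word order and any trailing space)."""
    text_lower = text.lower()
    words = re.findall(r"\w+", text_lower)
    # Pull out quoted phrases first, then split the rest into terms.
    phrases = re.findall(r'"([^"]*)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    terms = remainder.lower().split()
    # Every quoted phrase must occur verbatim (case-insensitive).
    if not all(p.lower() in text_lower for p in phrases):
        return False
    # Every remaining term must be a prefix of at least one word.
    return all(any(w.startswith(t) for w in words) for t in terms)

# Example record combining caption and photographer fields.
record = "Pollution Along the Shore, photographer: Belinda Rain"
print(matches('rain', record))          # True (prefix match on "Rain")
print(matches('"blue bird"', record))   # False (phrase not present)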
In addition to filtering, the query also provides an ordering of the search results. We use a multimodal model called SigLIP to estimate how likely the search query is to be a reasonable caption for each result. The returned results are ordered from the most to the least likely to be a caption for the given search. If no results are found, the entire collection is returned ordered by the same logic: the likelihood of the query being the caption for each image. This allows for finding possible matches even if the specific words in the query are not used in the archival or AI-generated captions. For example, consider searching for the word feline. Although there are images whose captions use the specific names of felines ('cat', 'leopard', and 'tiger'), no caption uses the term 'feline' directly. However, sorting the entire collection by the term 'feline' with the SigLIP model provides an alternative way of finding the images of felines in the archive. It is possible to only sort (and not filter) the results by adding the special tag :sort to the query string. This can be useful in cases where there are a small number of exact matches but many more close matches.
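The sketch below illustrates this ordering step: it embeds the query text with a SigLIP checkpoint from Hugging Face and sorts precomputed image embeddings by cosine similarity. The checkpoint name, the embeddings file, and the helper function are assumptions for illustration, not the project's actual code.

import numpy as np
import torch
from transformers import AutoModel, AutoProcessor

# Assumed checkpoint; the project may use a different SigLIP variant.
CKPT = "google/siglip-base-patch16-224"
model = AutoModel.from_pretrained(CKPT)
processor = AutoProcessor.from_pretrained(CKPT)

# Hypothetical file of precomputed, L2-normalized image embeddings,
# one row per photograph in the collection.
image_embeds = np.load("documerica_siglip_image_embeddings.npy")

def rank_by_query(query: str) -> np.ndarray:
    """Return photograph indices ordered from best to worst match."""
    inputs = processor(text=[query], padding="max_length", return_tensors="pt")
    with torch.no_grad():
        text_embed = model.get_text_features(**inputs).numpy()[0]
    text_embed = text_embed / np.linalg.norm(text_embed)
    scores = image_embeds @ text_embed        # cosine similarity per image
    return np.argsort(-scores)                # highest score first

order = rank_by_query("feline")  # surfaces cats, leopards, and tigers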
The automatically generated captions were created with OpenAI's GPT-4-turbo model. We generated the captions by asking the model to "Provide a detailed plain-text description of the objects, activities, people, background and/or composition of this photograph." The captions can be found at the bottom of each of the individual photograph pages. While these are far from perfect, making frequent minor errors and occasional major ones, we find that the resulting captions allow for finding images related to topics featured in the images but not mentioned directly in the (often short) photographic captions from the 1970s. To learn more about this approach and to download the entire set of captions, see our recent article on using multimodal AI models for search and discovery linked below.
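For readers curious how such a caption request looks in practice, here is a minimal sketch using the OpenAI Python client with the prompt quoted above. The image URL, function name, and lack of batching or error handling are illustrative assumptions rather than the project's production pipeline.

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

PROMPT = ("Provide a detailed plain-text description of the objects, "
          "activities, people, background and/or composition of this photograph.")

def caption_image(image_url: str) -> str:
    """Ask the model for a plain-text description of one photograph."""
    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": PROMPT},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    )
    return response.choices[0].message.content

# Hypothetical URL; the digitized slides are hosted by NARA.
print(caption_image("https://example.org/documerica/552000.jpg"))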
To search for a specific location, start the search string with the marker location:. Similarly, we can specify the desired photographer with photographer: and the cluster with cluster:. These search terms must match exactly; we recommend using the links on the map, cluster, and photographer pages (or on the individual photograph pages) to query by specific fields. When specifying a location, photographer, or cluster with these special tags, the tags are ignored in the SigLIP-based sorting of the results described above.
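As a rough illustration of how a query string with these special tags might be split apart, the sketch below separates the free-text terms from the location:, photographer:, and cluster: fields and detects the :sort tag. The exact parsing rules used by the site are not documented here, so the quoting behavior and function name are assumptions.

import re

def parse_query(raw: str) -> dict:
    """Split a query into free-text terms and the special fields
    described above; an illustrative approximation only."""
    fields = {"location": None, "photographer": None, "cluster": None}
    sort_only = ":sort" in raw
    raw = raw.replace(":sort", " ")
    for name in fields:
        m = re.search(rf'{name}:"([^"]+)"|{name}:(\S+)', raw)
        if m:
            fields[name] = m.group(1) or m.group(2)
            raw = raw[:m.start()] + raw[m.end():]
    return {"text": raw.strip(), "sort_only": sort_only, **fields}

print(parse_query('rain location:"Cleveland, Ohio" :sort'))
# {'text': 'rain', 'sort_only': True, 'location': 'Cleveland, Ohio',
#  'photographer': None, 'cluster': None}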
Please see the following papers for more details about the design and algorithms underlying the project. All publications are freely available.
T. Arnold and L. Tilton (2024) "Explainable Search and Discovery of Visual Cultural Heritage Collections with Multimodal Large Language Models." Proceedings of the Computational Humanities Research Conference. [pdf] [data]
T. Arnold and L. Tilton (2024) "Automated Image Color Mapping for a Historic Photographic Collection." Proceedings of the Computational Humanities Research Conference. (Best Short Paper, CHR 2024) [pdf] [data]
The links above include reproducible code and a downloadable version of the dataset.
We have associated each of the digitized photographs in the Documerica collection with one of 52 clusters. The clusters attempt to group together photographs by their primary subject matter and composition. These clusters were automatically generated using the AI-created captions described in the previous section. Specifically, we used a generative model to summarize 50 themes present in the collection of captions. Then, we assigned each photograph to the cluster name to which its SigLIP embedding is closest; in other words, we asked which of the cluster names would be the most likely caption for each photograph. These clusters should not be treated as authoritative labels for the images. Rather, they are a potentially useful tool for understanding the scope of the collection that must be augmented with a close analysis of individual images.
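The assignment step can be pictured with a few lines of Python: given SigLIP embeddings for every photograph and for each generated cluster name, each image receives the cluster name with the highest cosine similarity. The file names and array shapes below are hypothetical placeholders, not the project's actual data files.

import numpy as np

# Hypothetical inputs: L2-normalized SigLIP embeddings for every
# photograph and for each generated cluster name.
image_embeds = np.load("documerica_siglip_image_embeddings.npy")    # (n_images, d)
cluster_embeds = np.load("documerica_siglip_cluster_names.npy")     # (n_clusters, d)
cluster_names = open("cluster_names.txt").read().splitlines()

# Each photograph gets the cluster name most likely to be its caption,
# approximated here by the highest cosine similarity.
similarity = image_embeds @ cluster_embeds.T        # (n_images, n_clusters)
assignments = similarity.argmax(axis=1)
labels = [cluster_names[i] for i in assignments]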
There are important ongoing discussions and concerns regarding the environmental impacts of generative artificial intelligence models. Particularly given that the goal of Documerica was to highlight human-caused environmental issues, we have aimed to minimize the environmental impact of the AI models used in the project. We ran the AI models a single time through the entire collection, and the results are now stored locally on our server. The process of running all of the models used one GPU for approximately 20 hours. This required roughly 6.7 kilowatt hours, creating around 3.4 kilograms (1.7 cubic meters) of carbon emissions, or approximately the amount of emissions generated by driving a standard-sized car a distance of 24 kilometers (15 miles) [source].
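The figures above can be checked with back-of-the-envelope arithmetic. The average GPU power draw, grid emission factor, and per-kilometer car emissions below are assumed values chosen to be roughly consistent with the cited source, not numbers reported by the project.

gpu_hours = 20            # one GPU, approximate runtime
avg_power_kw = 0.335      # assumed average draw of roughly 335 W
energy_kwh = gpu_hours * avg_power_kw            # ~6.7 kWh

grid_kg_co2_per_kwh = 0.5                        # assumed grid emission factor
emissions_kg = energy_kwh * grid_kg_co2_per_kwh  # ~3.4 kg

co2_density_kg_per_m3 = 1.98                     # CO2 at roughly standard conditions
volume_m3 = emissions_kg / co2_density_kg_per_m3 # ~1.7 cubic meters

car_kg_co2_per_km = 0.14                         # assumed passenger-car figure
distance_km = emissions_kg / car_kg_co2_per_km   # ~24 km (about 15 miles)

print(energy_kwh, emissions_kg, volume_m3, distance_km)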
Digital Documerica is funded in part by a grant from the Mellon Foundation. Funding has also been provided by the University of Richmond for continued work and development.
Names of the photographers for all images in your current search query. The numbers show the total number of images from a photographer that are part of your results. Click on a photographer to further filter the search, or clear the search bar above to see all photographers.
A map with a circle for each of the locations corresponding to images in your current search query. Larger circles correspond to more photographs. Zoom in and out to see further detail. Hovering over a circle shows the number of photographs at a given location in your current search. Click on a circle to further filter the search, or clear the search bar above to see all locations.