Discovery Files

Digitizing the Past to Protect and Preserve History

Using supercomputers to help protect and preserve ancient sites and artifacts

When Adam Rabinowitz was 15 years old, his aunt, an archaeologist, invited him to join her on a dig in Sicily.

More than two decades later, Rabinowitz, now the assistant director at the Institute of Classical Archaeology at The University of Texas at Austin, is still travelling around the world getting dirt under his nails. And though much remains the same about archaeology since he first picked up a trowel, a lot has changed.

In previous eras, researchers logged their data in notebooks, which were preserved along with photographs, maps and objects, in a physical archive. Rabinowitz can still access the notebooks and negatives of people who conducted research more than a hundred years ago at the same sites he is exploring. Today, archaeologists are more likely to take thousands of digital photos, make notes in a database on a laptop or a tablet, and record careful, geographically referenced information that only a computer can interpret.

"The development of digital technologies has exponentially magnified the amount of data we're collecting, simply because we have the tools now to collect a lot more information much more easily than we did in the past," Rabinowitz said.

Digging in the digital age

However, the ability to manage technology often lags behind the capability of the technology itself, as Rabinowitz knows personally.

"Digs that I've participated in have produced information that is now digitally gone because the platforms and the storage mechanisms became obsolete, and that's in the space of ten years," he said. "When we look down the road and ask, 'What will we leave for people 25 years from now, 100 years from now?' we're faced with a huge issue that people are just starting to confront."

Over the course of 16 years, researchers have built up a rich dataset from their work in the urban center and agricultural territory of Chersonesos, a Greek colony on the Crimean peninsula that thrived through the Byzantine age. Thanks to support from the Packard Humanities Institute, the Institute of Classical Archaeology was able to use increasingly sophisticated digital methodologies to document its excavations. But by 2008, some of the systems that organized the digital data sat on a single portable server that the team carried back and forth to Ukraine and that, say the researchers, "could have blown up at any time."

The situation led the team to think carefully about what would happen to this complex relational dataset as technologies changed. They turned to the National Science Foundation-supported Texas Advanced Computing Center, one of the leading academic computing centers in the nation, to preserve their data in ways that would make it possible for future researchers to harness the richness of digital information to develop a greater understanding of the past.

Creating digital archives

Working with Maria Esteva and David Walling, the computing center's digital archivist and data applications expert, respectively, Rabinowitz developed a state-of-the-art data management system and framework for the long-term preservation and reuse of data from the Chersonesos project.

To illustrate the power of digital approaches to the contextual analysis of archaeological data, Rabinowitz points to an interactive map his team created of a Byzantine residential block at Chersonesos, excavated between 2001 and 2006. The block was pillaged and burned in the middle of the 13th century, and was left as a snapshot of life at that moment. A padlock found at the site, dating to the late 12th or early 13th century C.E., serves as an example of why context is key.

It is standard practice to document an object by photographing it after it is pulled from the ground. Spatial and contextual information, however, adds another dimension: On the digital map, you can see exactly where the lock was found smashed into two pieces, perhaps by a looter's ax.

The excavation database and map also work together to display information about other items found nearby, including an iron bucket and the skeleton of a woman in her fifties left lying in the street under the collapsed rubble of a roof. Together, the objects paint a much more vivid picture of this moment of destruction than they would individually. But the relationships between digital files that make this contextual view possible are also very vulnerable to changes in software and hardware, and can only be preserved if they are independent of the programs used to display them.

Preserving the past

So the collaboration between the two groups has focused on creating a storage system that allows users to extract "metadata" (data about data) automatically from each individual file. When digital files are ingested into the system, it uses information from file names and database records to create documents that describe the type, format and contextual associations of each file. When files or database records are added or changed, the metadata documents for each file are automatically updated to reflect the changes. That will allow future scholars to make sense of original digital data, like image files, in relation to other objects in the dataset, even when the software used today is obsolete.
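
To make the idea concrete, here is a minimal sketch in Python of how a metadata document might be derived from a file name and a database record. It is not the team's actual system. The file-naming convention, the field names, the `build_metadata` function and the sample identifiers are all hypothetical, invented only for illustration.

```python
import json
import re
from pathlib import PurePosixPath

# ASSUMPTION: a hypothetical naming convention of the form
#   <site>_<year>_<context>_<object>.<extension>
# The article does not describe the project's real file names.
FILENAME_PATTERN = re.compile(
    r"^(?P<site>[A-Za-z]+)_(?P<year>\d{4})_"
    r"(?P<context>[A-Za-z0-9]+)_(?P<object_id>[A-Za-z0-9]+)$"
)

# Map file extensions to a broad type and a named format.
FORMATS = {
    ".jpg": ("image", "JPEG"),
    ".tif": ("image", "TIFF"),
    ".pdf": ("document", "PDF"),
}


def build_metadata(path, records):
    """Describe one file using its name plus a matching database record.

    `records` stands in for the excavation database: a dict keyed by
    object identifier (a hypothetical structure).
    """
    p = PurePosixPath(path)
    match = FILENAME_PATTERN.match(p.stem)
    if match is None:
        raise ValueError(f"file name does not follow the assumed convention: {p.name}")

    fields = match.groupdict()
    file_type, file_format = FORMATS.get(p.suffix.lower(), ("unknown", "unknown"))
    record = records.get(fields["object_id"], {})

    # The result is plain data, so it can be saved as JSON or XML and
    # read without the software that originally displayed the file.
    return {
        "file": p.name,
        "type": file_type,
        "format": file_format,
        "site": fields["site"],
        "year": int(fields["year"]),
        "context": fields["context"],
        "object_id": fields["object_id"],
        "description": record.get("description"),
        "found_with": sorted(record.get("found_with", [])),
    }


if __name__ == "__main__":
    # Invented sample data; the identifiers are not real catalogue numbers.
    sample_records = {
        "obj0412": {"description": "iron padlock", "found_with": ["obj0413"]},
        "obj0413": {"description": "iron bucket", "found_with": ["obj0412"]},
    }
    doc = build_metadata("photos/chersonesos_2005_ctx117_obj0412.jpg", sample_records)
    print(json.dumps(doc, indent=2))
```

Because each metadata document is rebuilt from the file name and the current database record, rerunning a function like this whenever a file or record changes keeps the description in step with the data, which is the behavior the article describes.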

"We're preserving the data gathered at the site, and we're also documenting the documentation process itself," said Esteva.

Three years of collaboration were required to complete the data archiving system. Now that the infrastructure is complete, the team is working to make the primary archaeological data available to other archaeologists. The methodology can also be generalized to other research topics in the humanities and social sciences where scholars are struggling with the long-term preservation of complex digital data.

"We have to take care of the research data collections so they can be reused in the future to answer new questions and to make discoveries," said Esteva.

-- Aaron Dubrow, Texas Advanced Computing Center, aarondubrow@tacc.utexas.edu

This Behind the Scenes article was provided to LiveScience in partnership with the National Science Foundation.