What is data rescue?

DataRescue is a one-day, hack-a-thon-style event open to volunteers of all backgrounds and technical abilities. Following a workflow developed at the initial DataRescuePhilly event at UPenn, we will create trustworthy copies of federal climate change and environmental data.

While the Internet Archive has been able to archive many government websites, it is unable to archive other types of information, particularly datasets. DataRescue events are a key piece in ensuring that these datasets are copied. The Internet Archive, datarefuge.org and a consortium of research libraries hold these copies and keep them available for public access.

#DataRescueDenton is an opportunity for programmers, scientists, archivists, activists, and volunteers of all kinds to identify, back up, and help preserve publicly accessible federal data resources in the public interest, in the event they are removed from public view and use. We will hear from experts on the challenges of preserving born-digital, at-risk data and related issues, followed by breakout sessions for the hack-a-thon.

Please follow us on social media, sign up for the event here, or email datarescuedenton@gmail.com for more info.


The U.S. federal government hosts a multitude of scientific datasets, mainly through executive branch agencies like the EPA, NASA, NOAA, the Department of Energy, and the Department of the Interior. Recently, many in the federal government, at a variety of levels, have communicated their antagonism toward the scientists, researchers, and civil servants who study and inform the public about a wide variety of topics. The federal government has already moved to expunge information related to climate change from whitehouse.gov and introduced legislation to abolish the EPA. We seek to protect vital public scientific and data resources from suppression by archiving them in multiple, redundant repositories so they can still be accessed and reused if the originals disappear.



#DataRescueDenton is open to volunteers from all backgrounds and technical abilities. For those who want to participate in the archiving itself, there are four main tracks with varying degrees of technical know-how required – Seeders, Harvesters, Checkers, & Describers. If you’d rather write about the event, the Storytelling track may be best for you. If you’re not sure where you fit into all of this or if you’d rather just come to listen and learn, there’s plenty of room for all!


  • Seeding and Sorting (to feed the End of Term Archive): This is the widest path and requires a variety of skill levels. Seeders and Sorters canvass the resources of a given government agency, identifying important URLs. They identify whether those URLs can be crawled by the Internet Archive’s webcrawler.
  • Harvesters take the uncrawlable data and try to figure out how to actually capture it based on the recommendations of the Researchers. This is a complex task which can require substantial technical expertise, and which requires different techniques for different tasks. Consider this path if you have hacking and tech skills.
  • Checkers & Baggers inspect harvested datasets to confirm that they are complete and include all the information needed to be useful for scientists. Checkers need to have an in-depth understanding of harvesting goals and potential content variations for datasets. Once the data has been validated, Checkers sign off on it and put it in a BagIt file.
  • Describers then compare the data against the original source, add more detailed provenance metadata, and put it into the DataRefuge.org repository. Consider this path if you have experience working with scientific data (particularly environmental or climate data) or creating metadata. Trained librarians and scientists will be very helpful on this path.
  • Storytellers help record and publicize stories about the importance of climate and environmental data to our everyday lives and share the work on social media. Consider this path if you have skills in writing and social media, arts, blogging, photography, journalism, and media. We invite you to write field notes or a story about which local communities, organizations, and institutions currently use specific datasets, and how.
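
For volunteers curious about what the Checkers & Baggers step can look like in practice, here is a rough illustration only, not the event's actual tooling. A BagIt bag records a checksum for each payload file in a manifest, so anyone can later confirm that a copy is complete and unchanged. The sketch below mimics just that manifest idea using Python's standard library; it does not produce a complete bag, and the function names (sha256_of, build_manifest, verify_manifest) and the "data" folder layout are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets do not have to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(payload_dir: Path) -> list[str]:
    """Return one 'checksum  relative/path' line per file under payload_dir.

    Paths are written relative to the folder that contains payload_dir,
    so a payload folder named 'data' yields entries like 'data/file.csv'.
    """
    lines = []
    for file_path in sorted(p for p in payload_dir.rglob("*") if p.is_file()):
        relative = file_path.relative_to(payload_dir.parent).as_posix()
        lines.append(f"{sha256_of(file_path)}  {relative}")
    return lines


def verify_manifest(bag_dir: Path, manifest_lines: list[str]) -> list[str]:
    """Return the relative paths whose files are missing or have changed."""
    problems = []
    for line in manifest_lines:
        expected, relative = line.split(maxsplit=1)
        target = bag_dir / relative
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(relative)
    return problems
```

A Checker could call build_manifest on the harvested "data" folder at sign-off, store the resulting lines alongside the files, and run verify_manifest on any later copy. An empty result means every listed file is present and matches its recorded checksum.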


Don’t have tech skills but want to help? Become a Seeder.


Are you a hacker? Harvesting is the path for you.

Checkers & Baggers

Have you worked with scientific data before? Help check our work!


Librarians and scientists will be helpful on this path.


Are you best at social media and writing? Come tell our stories!

Have questions? Contact us:


Saturday, May 20, 2017

Stoke | Denton, Texas

  • 8:00: Check-in
  • 9:00: Opening remarks & introductions
  • 9:15: Program begins
  • 12:00: Lunch
  • 1:00: Archive-a-thon & workshops
  • 4:00: Wrap-up & closing remarks
  • 5:00: Event ends


