Preserving Content from Your Institutional Repository

Between institutional repositories and hosting journals, many libraries are becoming responsible for scholarly content in new ways. While PDFs are the most common format today, the unique, local, serial content may be in a variety of formats. These items may be digitized text, born-digital text, audio, video, or images. This presentation will discuss formats that will remain accessible through time (PDF/A, txt, xml) so that content is not locked in proprietary formats. It will also discuss options for backing up items and associated metadata, including simple backups, off-site storage of files, LOCKSS, Private LOCKSS Networks, and Portico. The presenters will offer suggestions for how to ensure your local content is being preserved properly.

Carol Ann Borchert, Coordinator for Serials, University of South Florida
Carol Ann Borchert has been the Coordinator for Serials at the University of South Florida (USF) since 2004. Previously, she was in the Reference and Government Documents departments at USF, and in several areas of the James B. Duke Library at Furman University. She holds an MLS from the University of Kentucky and an M.A. in Spanish from USF.

Wendy Robertson, University of Iowa
Wendy Robertson, Digital Scholarship Librarian, has worked as a librarian at The University of Iowa Libraries since 2001. Her previous positions include Electronic Resources Systems Librarian in Enterprise Applications, Electronic Resources Management Unit Head in Technical Services, and Electronic Resources Technical Services Librarian in Serials. She holds an MLS from The University of Iowa.
Published on: Mar 4, 2016
Published in: Education, Technology
Source: www.slideshare.net


Transcripts - Preserving Content from Your Institutional Repository

  • 1. Preserving Content from Your Institutional Repository. Wendy C Robertson and Carol Ann Borchert. NASIG, Buffalo, N.Y., June 8, 2013. This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.
  • 2. The Signal: http://blogs.loc.gov/digitalpreservation/
  • 3. An institutional repository is… “a permanent, institution-wide repository of diverse, locally produced digital works (e.g., article preprints and postprints, data sets, electronic theses and dissertations, learning objects, and technical reports) that is available for public use and supports metadata harvesting.” University of Houston Libraries, Institutional Repository Task Force. Institutional Repositories. SPEC Kit 292. July 2006. p. 13
  • 4. An institutional repository is not… Most IRs currently are not preservation repositories; they do not meet all the criteria in Trustworthy Repositories Audit & Certification (TRAC) or other audits.
  • 5. 10 basic characteristics of digital preservation repositories (CRL)
      1. The repository commits to continuing maintenance of digital objects for identified community/communities.
      2. Demonstrates organizational fitness (including financial, staffing, and processes) to fulfill its commitment.
      3. Acquires and maintains requisite contractual and legal rights and fulfills responsibilities.
      4. Has an effective and efficient policy framework.
  • 6. 10 basic characteristics (cont.)
      5. Acquires and ingests digital objects based upon stated criteria that correspond to its commitments and capabilities.
      6. Maintains/ensures the integrity, authenticity and usability of digital objects it holds over time.
  • 7. 10 basic characteristics (cont.)
      7. Creates and maintains requisite metadata about actions taken on digital objects during preservation as well as about the relevant production, access support, and usage process contexts before preservation.
      8. Fulfills requisite dissemination requirements.
      9. Has a strategic program for preservation planning and action.
      10. Has technical infrastructure adequate to continuing maintenance and security of its digital objects.
  • 8. The year is 2100. Can you read your files?
  • 9. Our questions for you
      • Who has an IR?
      • What platform are you using?
      • Who’s backing it up?
      • Who’s part of a PLN?
      • Who’s having their IR journals preserved in LOCKSS or Portico?
      Image: Question mark sign by Colin_K, on Flickr
  • 10. Localized disasters
  • 11. Fire, Flood, Hurricane, Tornado
      http://chronicle.com/blogs/wiredcampus/what-katrina-can-teach-libraries-about-sandy-and-other-disasters/40986
      http://blog.al.com/spotnews/2011/10/plans_to_rebuild_birmingham_li.html
      http://www.ncsml.org/Content/About-Us/Museum-History/2008-Flood.aspx
      http://news.bbc.co.uk/onthisday/hi/dates/stories/august/1/newsid_2526000/2526839.stm
  • 12. War, Tsunami, Earthquake. © 2011 UMD Libraries
      http://savemlak.jp/wiki/saveMLAK/en?lang=en&uselang=en
      http://www.flickr.com/photos/umd_libraries/6075914283/in/set-72157627383474133
      http://www.bostonglobe.com/business/2013/02/17/the-smuggled-hard-drives-timbuktu/rCyv0QL1FdLLkw4tjv6hDO/story.html
  • 13. Disasters with warning: Moving servers out of the University of Iowa Libraries, 2008. © 2008 The University of Iowa. http://digital.lib.uiowa.edu/cdm/ref/collection/flood/id/3414
  • 14. Disasters with no warning: University of South Florida, very localized flood. http://lib.usf.edu/offtheshelf/tampa-library/the-flood-of-09dedication-in-the-face-of-disaster/
  • 15. Backups vs. preservation: “Disaster recovery strategies and backup systems are not sufficient to ensure survival and access to authentic digital resources over time. A backup is a short-term data recovery solution following loss or corruption and is fundamentally different to an electronic preservation archive.” JISC. Digital Preservation: Continued Access to Authentic Digital Assets (November 2006)
  • 16. Exit strategy: Make sure you can easily migrate all your content and metadata out of your system in a usable format.
  • 17. Test, test and test some more: Test that all files are as expected regarding structure and completeness.
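One way to picture the testing that slide 17 asks for is a small script that checks an exported batch for completeness (every expected file is present and non-empty) and for basic structure (the file starts with the signature its extension implies). This is a minimal sketch, not something the presenters showed: the function name `check_export`, the `MAGIC_BYTES` table, and the idea of an "expected names" list are all assumptions made for illustration.

```python
from pathlib import Path

# Assumed, deliberately short table of file signatures; extend it for the
# formats your own repository actually holds.
MAGIC_BYTES = {
    ".pdf": b"%PDF-",  # PDF files begin with this marker
    ".zip": b"PK",     # ZIP archives begin with "PK"
}


def check_export(export_dir, expected_names):
    """Report files that are missing, empty, or do not start with the
    signature expected for their extension."""
    export_dir = Path(export_dir)
    present = {p.name: p for p in export_dir.iterdir() if p.is_file()}
    report = {"missing": [], "empty": [], "bad_signature": []}
    for name in expected_names:
        path = present.get(name)
        if path is None:
            report["missing"].append(name)
            continue
        if path.stat().st_size == 0:
            report["empty"].append(name)
            continue
        magic = MAGIC_BYTES.get(path.suffix.lower())
        if magic is not None:
            # Read only as many bytes as the signature needs.
            with path.open("rb") as fh:
                head = fh.read(len(magic))
            if head != magic:
                report["bad_signature"].append(name)
    return report
```

A signature check like this only catches gross problems such as truncated or mislabeled files; it does not replace opening a sample of files in real viewers.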
  • 18. Persistent identifiers: Using persistent identifiers now will help if you move to a new repository in the future.
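The reason persistent identifiers help during a move can be sketched as a lookup table: citations point at the identifier, and only the table entry changes when the content moves. The class below is a toy illustration under that assumption; `IdentifierResolver`, the `id:0001` style identifiers, and the example.org URLs are hypothetical and do not describe any particular identifier service.

```python
class IdentifierResolver:
    """Toy resolver: maps stable identifiers to the URL that currently
    serves each item."""

    def __init__(self):
        self._locations = {}

    def register(self, identifier, url):
        self._locations[identifier] = url

    def resolve(self, identifier):
        return self._locations[identifier]

    def migrate(self, old_prefix, new_prefix):
        """Repoint every URL under old_prefix to new_prefix.

        The identifiers themselves never change, so links that use them
        keep working after the move. Returns the number of entries moved.
        """
        moved = 0
        for identifier, url in self._locations.items():
            if url.startswith(old_prefix):
                self._locations[identifier] = new_prefix + url[len(old_prefix):]
                moved += 1
        return moved
```

In practice this table is maintained by an identifier service rather than by the repository itself; the point is only that a migration becomes one update per item instead of broken links everywhere the item was cited.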
  • 19. Preserving the Web: You may want to archive institutional content that is not appropriate for an IR but which is appropriate for the library’s mission. http://dx.doi.org/10.7207/twr13-01
  • 20. Archive-It: Archive-It can preserve journals and other scholarly work from your institution that doesn’t go into your repository. http://archive-it.org/collections/824
  • 21. Internet Archive: “The Montana State Library (MSL) last year moved a copy of its collection of 3000 born digital state publications to the Internet Archive (IA).” (Chris Stockwell for Montana State Library, 12/29/2010)
      http://archive.org/post/340223/how-montana-state-library-uploaded-batches-of-digital-objects-to-the-internet-archive
      http://archive.org/details/MontanaStateLibrary
  • 22. IRs are a bit different… The copy of the document in the repository often is the only version you have.
  • 23. Access copy vs. preservation copy: Digitized content may have a preservation scan as well as the version which displays to the public.
  • 24. IRs have special problems… Automatically adding a cover page to brand and identify content changes the file, perhaps even removing accessibility features.
  • 25. File formats: When possible, use open file formats so content will remain accessible long into the future, but will you turn down content in other formats?
  • 26. PDF/A (ISO 19005-1:2005): PDF/A is an ISO standard “which provides a mechanism for representing electronic documents in a manner that preserves their visual appearance over time, independent of the tools and systems for creating or rendering the files.” http://www.pdfa.org/publication/pdfa-in-a-nutshell-2-0/
  • 27. U Iowa electronic theses & dissertations: 1931 PDFs and 7 XML documents. Supplemented by: 21 .avi, 1 .avp, 8 .doc, 2 .mov, 2 .mp3, 1 .mp4, 4 .mpg, 1 .mxf, 3 .NTS, 2 .pde, 6 .pdf, 4 .txt, 3 .wmv, 18 .xls, 2 .zip
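An inventory like the one on slide 27 can be produced by walking a local copy of the files and counting extensions. The sketch below assumes the files sit in an ordinary directory tree; the function names and the `PREFERRED` set are hypothetical. The set is taken from the formats named in the abstract (PDF/A, txt, xml), with the caveat that a `.pdf` extension does not prove a file is PDF/A.

```python
from collections import Counter
from pathlib import Path

# Assumption for illustration: extensions treated as preferred formats.
# A .pdf extension alone does not guarantee PDF/A conformance.
PREFERRED = {".pdf", ".txt", ".xml"}


def count_extensions(root):
    """Count files under root by lowercased extension."""
    counts = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            counts[path.suffix.lower() or "(none)"] += 1
    return counts


def flag_for_review(counts, preferred):
    """Return the extensions that are not in the preferred set, sorted."""
    return sorted(ext for ext in counts if ext not in preferred)
```

Extensions are lowercased so that `.PDF` and `.pdf` are counted together; a real format audit would identify files by content rather than by name.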
  • 28. Public preservation policy: Make your preservation and submission policy clear so that contributors understand the risks of contributing a non-open format. http://services.ideals.illinois.edu/wiki/bin/view/IDEALS/PreservationSupportPolicy
  • 29. Preservation metadata: PREMIS (PREservation Metadata Implementation Strategies). “Preservation metadata supports activities intended to ensure the long-term usability of a digital resource.” (Caplan, p. 3) http://www.loc.gov/standards/premis/understanding-premis.pdf
  • 30. Digital provenance: “Metadata can help support authenticity by documenting the digital provenance of the resource — its chain of custody and authorized change history.” Caplan, Priscilla. Understanding PREMIS. Library of Congress, ©2009. p. 3
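The "chain of custody and authorized change history" in the Caplan quotation can be pictured as an append-only log of events per object. The sketch below is only an illustration of that idea: the field names (`object_id`, `event_type`, `agent`, `outcome`, `timestamp`) are assumptions chosen for readability and are not the PREMIS data dictionary or its XML schema.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProvenanceEvent:
    """One recorded action on a digital object (illustrative fields only)."""
    object_id: str
    event_type: str
    agent: str
    outcome: str
    timestamp: str


def record_event(log, object_id, event_type, agent, outcome, when=None):
    """Append an event to the log; defaults to the current UTC time."""
    when = when or datetime.now(timezone.utc)
    event = ProvenanceEvent(object_id, event_type, agent, outcome, when.isoformat())
    log.append(event)
    return event


def history(log, object_id):
    """Return the events for one object in the order they were recorded."""
    return [e for e in log if e.object_id == object_id]


def to_json(log):
    """Serialize the whole log so it can be stored alongside the content."""
    return json.dumps([asdict(e) for e in log], indent=2)
```

Recording an event such as an automatically added cover page (slide 24) is what lets a later reader tell an authorized change from corruption.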
  • 31. Methods of preserving data
      • Refreshing data
      • Migrating data
      • Emulating software platform
      • Replicating
      • Validating data integrity
      • Metadata
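Of the methods listed on slide 31, validating data integrity is the easiest to sketch in code: record a checksum for every file once, then recompute and compare later, or compare against a replica. The example below is a minimal sketch assuming SHA-256 and a plain directory of files; the function names and the manifest layout (relative path mapped to hex digest) are hypothetical, not a method the presenters described.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large media files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_manifests(stored, current):
    """Compare an earlier manifest with a freshly built one."""
    return {
        "missing": sorted(set(stored) - set(current)),
        "unexpected": sorted(set(current) - set(stored)),
        "changed": sorted(
            name for name in stored.keys() & current.keys()
            if stored[name] != current[name]
        ),
    }
```

A "changed" entry is not always damage: a file altered on purpose, for example by adding a cover page, will also show up here, which is why change history needs to be recorded alongside the checksums.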
  • 32. Long-term preservation options
      • Global LOCKSS Network
      • Private LOCKSS Network
      • Portico
  • 33. Global LOCKSS Network
      • For e-journal content
      • Preserves the format as well as the content
      • Light archive
      • Adding journals to LOCKSS
      • Notify LOCKSS of metadata/file changes
      • Not all serials are appropriate for Global LOCKSS
  • 34. Private LOCKSS Network
      • All material from the IR
      • Need at least 7 nodes/destinations
      • Each should be a LOCKSS Alliance member
      • Set up policies and governance for the PLN
  • 35. Setting up policies for a PLN
      • How long is initial commitment?
      • How much notice to withdraw?
      • How do members remove data for withdrawn institution?
      • Does the group need a governing body or steering committee?
      • Will the PLN be a dark or light archive?
      • Do any of the members have embargoed materials?
  • 36. Examples of PLNs
  • 37. Portico
      • For e-books and e-journals
      • Source files converted to an archive format
      • Dark archive
      • Portico is responsible for future content migrations
      • Adding journals to Portico
      • Not all serials are appropriate for Portico
  • 38. Factors to consider in developing a formal preservation plan
      • Organizational & financial commitment
      • Stakeholders
      • Local backups vs. long-term preservation
      • Storage needs
      • Roles & responsibilities
      • Data ingestion
      • Policy on deletion of or embargoes for materials
      • Funding
      • Staff
  • 39. Organizational & financial commitment
      • What is the long-term financial commitment from your library or institution?
      • Do you have the support of the organization? From what level of administration?
  • 40. Stakeholders
      • Producers
      • Users
      • Owners
      • Managers
      • Funding authorities
      • Other parties?
  • 41. Local backups vs. long-term preservation
      • Definition of backups versus preservation
      • Metadata, content, software, or all of these?
      • How often and who is responsible?
      • PLN or other option for long-term preservation
  • 42. Storage needs
      • Disk space: How much space do you need? Who is responsible for maintaining disks?
      • Software: Which software will be required? Who migrates information as software needs change?
      • Equipment: What equipment will you need? Who will fund the equipment, set it up, maintain it?
  • 43. Roles & responsibilities
      • Who is implementing the plan?
      • Who is maintaining the data and how?
      • Who is providing support for accessing material and troubleshooting issues?
  • 44. Data ingestion
      • How are you getting data into the system for preservation or backup?
      • Will this be done in-house or outsourced to a third party?
      • How frequently and in what format?
  • 45. Funding vs. staffing
      • Is it easier to fund these efforts at your organization or staff them?
      • How well-staffed is your organization?
      • What kind of expertise do you have (or not have) in the library?
      • What level of commitment does your organization have to preserve digital information?
  • 46. Questions?
      Wendy Robertson, Digital Scholarship Librarian, University of Iowa Libraries. wendy-robertson@uiowa.edu, @wendycr_
      Carol Ann Borchert, Coordinator for Serials, University of South Florida Libraries. borchert@usf.edu
  • 47. Sources
      Ball, Alex. Preservation and Curation in Institutional Repositories. Digital Curation Centre, UKOLN, 2010. Version 1.3. http://www.dcc.ac.uk/sites/default/files/documents/reports/irpc-report-v1.3.pdf
      Caplan, Priscilla. Understanding PREMIS. Library of Congress, ©2009. http://www.loc.gov/standards/premis/understanding-premis.pdf
      Digital Repository Audit Method Based On Risk Assessment (DRAMBORA). Glasgow, 2009. http://www.dcc.ac.uk/resources/repository-audit-and-assessment/drambora
      JISC. Digital Preservation: Continued Access to Authentic Digital Assets (Nov. 2006). http://www.jisc.ac.uk/publications/briefingpapers/2006/pub_digipreservationbp.aspx
  • 48. Sources
      Nestor Working Group. Catalogue of Criteria for Trusted Digital Repositories. Frankfurt am Main, Dec. 2006. Urn: de:0008-2006060703
      OpenDOAR Policies Tool. http://www.opendoar.org/tools/en/policies.php
      Oettler, Alexandra. PDF/A in a Nutshell 2.0: PDF for long-term archiving. Berlin: Association for Digital Document Standards e. V., ©2013. http://www.pdfa.org/wp-content/uploads/2013/04/PDFA_in_a_Nutshell_21.pdf
      Pennock, Maureen. Web-Archiving. DPC Technology Watch Report 12-01 March 2013. DOI: http://dx.doi.org/10.7207/twr13-01
      Reference Model for an Open Archival Information System (OAIS). Recommended Practice CCSDS 650.0-M-2. Magenta Book, June 2012. http://public.ccsds.org/publications/archive/650x0m2.pdf
  • 49. Sources
      Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC). Version 1.0. Feb 2007. http://www.crl.edu/archiving-preservation/digital-archives/metrics-assessing-and-certifying/trac
      University of Houston Libraries, Institutional Repository Task Force. Institutional Repositories. SPEC Kit 292. July 2006. http://publications.arl.org/Institutional-Repositories-SPEC-Kit-292/3
      University of Illinois at Urbana-Champaign. “IDEALS Digital Preservation Support Policy.” ©2013. https://services.ideals.illinois.edu/wiki/bin/view/IDEALS/PreservationSupportPolicy
      University of Illinois at Urbana-Champaign. “Preparing Items for Deposit into IDEALS. File Format Recommendations” ©2013. https://services.ideals.illinois.edu/wiki/bin/view/IDEALS/SubmissionPrep#File_Format_Recommendations