
Preservation and Research Data at Binghamton University Libraries by Edward Corrado

Presentation given by Edward Corrado on 11/14/11 at the University at Buffalo Libraries symposium entitled "Research Data: Management, Access, Control."
Published on: Mar 4, 2016
Published in: Education      
Source: www.slideshare.net


Transcripts - Preservation and Research Data at Binghamton University Libraries by Edward Corrado

  • 1. Edward M. Corrado | ecorrado@binghamton.edu | Binghamton University Libraries | http://library.binghamton.edu/ | Presented 14 November 2011 at the Research Data: Management, Access and Control Symposium, University at Buffalo
  • 2. Binghamton University, one of four comprehensive doctoral research universities within the State University of New York, is recognized for stellar academics, an international focus, high graduation rates and overall value
      - Undergraduates: 11,706
      - Graduate students: 3,007
      - Average SAT score for 2011 incoming freshmen: 1305
      - Top 25% of high school class: 85%
      - Students of color: 33.3%
      - International students: 10%
      - #1 in 2011 as a best value among the nation’s public colleges for out-of-state students and #5 overall (Kiplinger’s Personal Finance, 2011)
      - Students come from all 50 states and 100 countries
      - Binghamton University: “The premier public university in the northeast” and “best buy” (Fiske Guide to Colleges, 2010)
  • 3. Collections profile
  • 4. Our facilities
  • 5.
      - Backups ≠ Preservation
          - “Actions required to maintain access to digital materials beyond the limits of media failure or technological change.” (Digital Preservation Coalition, 2009)
          - Backups alone are not sufficient
          - Don’t protect against obsolete file formats, software, hardware, etc.
      - Providing access ≠ Preservation
          - Digital asset management systems offer access but not [necessarily] long-term preservation
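The distinction on this slide can be illustrated with a minimal sketch, assuming Python and only its standard library. A checksum (fixity) comparison can confirm that a backup copy is bit-for-bit identical to the original, but it says nothing about whether the file format can still be opened. The `AT_RISK_EXTENSIONS` set and the function names are hypothetical and chosen for illustration; they are not part of Rosetta or any system named in the talk.

```python
import hashlib
from pathlib import Path

# Hypothetical list of formats a repository has flagged as at risk of obsolescence.
AT_RISK_EXTENSIONS = {".wpd", ".wks", ".pcx"}


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(original: Path, backup: Path) -> bool:
    """Backup-style check: are the two copies bit-for-bit identical?"""
    return sha256_of(original) == sha256_of(backup)


def format_at_risk(path: Path) -> bool:
    """Preservation-style check: is the file's format on the at-risk list?"""
    return path.suffix.lower() in AT_RISK_EXTENSIONS
```

A file can pass `backup_is_intact` and still be flagged by `format_at_risk`, which is the gap between a backup and a preservation program that the slide describes.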
  • 6.
      - While Digital preservation can support Open access and/or Open data, preservation does not and cannot always imply Openness
          - Patents and other legal issues
          - Confidential data such as the Blood Serum Collection
          - Researcher/Discipline Norms
          - Discipline-Specific Repositories such as arXiv.org and the Inter-University Consortium for Political and Social Research (ICPSR)
  • 7. JISC Beginner’s Guide to Digital Preservation elaborates:
      - Managed: Digital preservation is a Management problem.
      - Activities: The policy needs to filter down to a list of processes: tasks that can take place at specified times and in specified ways.
      - Necessary: What needs to be done. How long do you want to preserve the objects for? Discussions about the activities needed to achieve a level of preservation are necessary.
      - Continued Access: Access is the key here. Most objects in the public sphere are preserved to support access and retrieval.
      - Digital Materials: Digital materials, digital objects, call them what you will. This is the stuff you are preserving. Different objects require different processes.
  • 8.
      - Local Content as the Future of [Academic] Libraries?
          - At least in regards to Physical Collections?
          - Google Books, HathiTrust
          - To a large degree the material under the “Bell Curve” (journals, gov’t docs, etc.) is already being “managed” outside of libraries
      - The University is a collection of Niche Markets (John Meador, Jr.)
      - “The Long Tail” by Chris Anderson in Wired (October, 2004); his book: The Long Tail: Why the Future of Business is Selling More of Less. New York: Hyperion, 2006, and its Revised and Updated Edition, 2008.
  • 9. Why Libraries?
      - Libraries have been preserving information for centuries
          - Extends the role of libraries into the digital world
          - Not a new idea, a new format
          - Majority of new material is published in digital format (Scholarly Articles, Campus newsletters, Course catalogs, Web sites…)
      - Image: University of Al-Karaouine, Founded 859, Fes, Morocco. http://en.wikipedia.org/wiki/University_of_Al-Karaouine
  • 10. “Digitization for Access is Not Preservation; Without Preservation, There is No Access”
  • 11. Licensed commercial e-resources | OCLC’s CONTENTdm | Innovative’s Content Pro
  • 12.
      - Adhere to International Standards
          - “Librarians can take over the world.” (Dr. Barry Smith) But we need to use tools that have been proven, not build new ontologies
      - Capture the locally born digital objects that are replacing titles formerly found in our print archives
      - Ensure Digital Curation & Preservation
      - Provide Cross-Collection Search
      - Demonstrate proof of concept before soliciting faculty research
  • 13.
      - Experimented with various “Digital Content” systems including Content Pro, CONTENTdm, DSpace, EPrints
      - None of these have preservation “built-in”
      - Building our own was not practical
          - Staffing levels
          - Lack of programmers
          - Mission creep?
          - Sustainability?
      - Rosetta by Ex Libris
  • 14.
      - Scalable
      - Expandable
      - Flexible
      - Accessible
      - Standards-based
          - “Based on the Open Archival Information System (OAIS) model and conforming to trusted digital repository (TDR) requirements.” http://www.exlibrisgroup.com/category/RosettaOverview
  • 15. Complete preservation solution allowing collection, archiving and preservation of digital materials of any type. Rosetta ensures data integrity and provides access over time to digital materials.
      - Diagram labels: Operational Storage; Migration Action; Permanent Storage; Identify Risks; Evaluate Alternatives; Execute Preservation Actions
      - http://www.exlibrisgroup.com/category/RosettaOverview
  • 16.
      - No preservation system is useful if there is no access (especially for a University Library)
      - Rosetta does not have a public discovery layer
      - Rosetta’s Digital Publishing System is flexible so there are options
      - Primo for discovery
          - First University to use Primo with Rosetta
          - Works well with other library systems such as Aleph and Primo Central
          - One-stop shopping
  • 17. Licensed Commercial e-resources | Open Access Digital Objects | Local Print | Local Digital
  • 18.
      - Data Sets
      - Faculty Research
      - Special Collections/University Archives
      - Course Catalogs
      - University Photographs
      - Newsletters
      - Also need to be opportunistic (blood serum collection)
      - Images cc-by-nc-2.0: http://www.flickr.com/photos/bycp/
  • 19. Image cc-by-nc-sa 2.0: http://www.flickr.com/photos/yakibah/3512735385/
  • 20.
      - Systems: 1 person (~0.5 FTE)
          - Project Management
          - Systems/Technical
      - Metadata/Cataloging: 3 people (~1.0 FTE)
      - User Interface: Part of the Web Services Librarian’s time
      - Special Collections: Not directly involved with implementation, but relied on heavily for collection-level expertise
  • 21. Metadata Librarians are Project Managers
      - Decide on appropriate descriptive metadata fields
      - Create the metadata forms
      - Provide training
      - Develop and/or provide specialized terminology (such as LCSH, TGM, TGN)
      - Review submissions as appropriate
      - DO NOT typically create the metadata (student workers or other staff will create metadata)
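The review step on this slide could look something like the following minimal sketch, assuming Python. The required field names, the `Record` class, and the three subject terms are hypothetical stand-ins; a real workflow would check against the full controlled vocabularies (LCSH, TGM, TGN) and whatever fields the metadata librarians actually chose.

```python
from dataclasses import dataclass, field

# Hypothetical required fields and a tiny stand-in for a controlled vocabulary.
REQUIRED_FIELDS = ("title", "creator", "date")
APPROVED_SUBJECTS = {"Blood banks", "Universities and colleges", "Photographs"}


@dataclass
class Record:
    """A descriptive metadata record as entered by a student worker."""
    fields: dict
    subjects: list = field(default_factory=list)


def review(record: Record) -> list:
    """Return the problems a reviewer would flag; an empty list means the record passes."""
    problems = []
    for name in REQUIRED_FIELDS:
        # Treat absent and whitespace-only values the same way.
        if not record.fields.get(name, "").strip():
            problems.append(f"missing required field: {name}")
    for term in record.subjects:
        if term not in APPROVED_SUBJECTS:
            problems.append(f"subject not in controlled vocabulary: {term}")
    return problems
```

Automating the mechanical checks this way would leave the metadata librarians free to review only the records that are flagged, which fits the project-manager role the slide describes.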
  • 22.
      - In the preliminary planning stages
      - Need to demonstrate we can do what we say
      - Scholarly output
          - Articles, proceedings, etc.
          - Research data
          - Related material including grey literature, research notes, correspondence, etc.
      - Photo from http://anthro.binghamton.edu/BiomedWebsite/serum.shtml
  • 23. Please characterize your research in terms of data intensity for your analysis run (n=91)
      - Categories: Normal (working data set up to 100 Megabytes); Heavy (working data set up to 1 Terabyte); Very large (working data set up to 1000 Terabytes); Extreme (working data set over 1000 Terabytes)
      - Chart values: 2.4%, 24.4%, 73.2%
      - 308 individuals who either had an externally sponsored project since 2009 or who had submitted a proposal during that time period were asked to take the survey. By June 15, 2011, 91 respondents had completed the survey. (Conducted by Jim Wolf, retired Director of Academic Computing)
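The survey figures on this slide imply a response rate of roughly 29.5% (91 of 308). A short sketch of that arithmetic, assuming Python, is below; the variable names are illustrative, and the three percentages are taken in the order they appear on the slide without assigning them to categories.

```python
invited = 308    # individuals asked to take the survey (from the slide)
completed = 91   # respondents who completed it by June 15, 2011 (from the slide)

response_rate = completed / invited * 100
print(f"Response rate: {response_rate:.1f}%")

# The three percentages shown on the chart, in the order they appear on the slide.
shares = [2.4, 24.4, 73.2]
print(f"Chart shares sum to {sum(shares):.1f}%")
```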
  • 24. Please identify where you store data generated or gathered for your project (bar chart, scale 0 to 80)
      - Categories: desktop or laptop computer in office or lab; on instrument in lab; research group server storage; departmental server storage; ITS storage; external network storage
  • 25. Bar chart (scale 0 to 60)
      - Series: forever; 3-7 years; < 3 years
      - Categories: Local research group server; ITS storage; Library archive; Disciplinary repository (e.g., ICPSR)
  • 26. Bar chart (scale 0 to 50)
      - Series: private; proprietary; openly available to all; access granted to individuals
      - Categories: Local research group server; ITS storage; Library archive; Disciplinary repository (e.g., ICPSR)
  • 27.
      - Library provides:
          - Digital platform (Rosetta)
          - Metadata consulting
          - Metadata training
          - Ongoing preservation
      - Blood serum archive provides:
          - Subject expertise
          - Digitization
          - Metadata creation
      - Photo CC-by-2.0 via http://www.flickr.com/photos/usnavy/5804689369/in/photostream/
  • 28.
      - Bring everyone on board
      - Set priorities
      - Review metadata and digital objects often (at least in the beginning)
      - Metadata may contain confidential and/or legally protected information
          - Will metadata librarians need human subjects/IRB approval?
          - Need for separate discovery mechanisms
  • 29.
      - Enlist subject librarians to help make connections
          - A few subject librarians have identified some possible data needing preservation and are going to meet with faculty for preliminary discussions
      - Work with faculty on data management plans
          - Many granting agencies such as NSF are requiring data management plans
          - Get involved early
          - Assist with submission requirements for research
  • 30.
      - Provide preservation; offer dissemination
          - Don’t confuse preservation with open access
          - Faculty don’t always want to, or cannot, make data open
              - Dark archive if desired
          - Do not need to replace or replicate current data dissemination methods (unless researchers desire)
      - Not all research data is “Big Data”
          - Don’t let the challenges of “Big Data” scare you away from all data.
      - Photo: http://siliconangle.com/files/2011/07/Big-Data.jpg
  • 31. Rosetta at Binghamton University Libraries | 14 November 2011
      - Reference: Digital Preservation Coalition (2009). Digital Preservation Handbook. http://www.dpconline.org/advice/preservationhandbook/introduction/definitions-and-concepts