PD:CCSDS 652.0-M-1 AUDIT AND CERTIFICATION OF TRUSTWORTHY DIGITAL REPOSITORIES

From PUBLIC DOMAIN PROJECT
(Redirected from PD:CCSDS 652.0-M-1)
Jump to: navigation, search

This is the full text of the CCSDS recommended practice 652.0-M-1.

The Public Domain Project is using this recommended practice to do internal audits of the project. This text is presented here as full quote to make direct links to the supporting text of a given requirement from the audit report. This makes it possible to write a much more compact and readable report without loosing any information for the interested reader.


Recommendation for Space Data System Practices

AUDIT AND CERTIFICATION OF TRUSTWORTHY DIGITAL REPOSITORIES RECOMMENDED PRACTICE CCSDS 652.0-M-1 MAGENTA BOOK

September 2011

AUTHORITY

Issue: Recommended Practice, Issue 1
Date: September 2011
Location: Washington, DC, USA

This document has been approved for publication by the Management Council of the Consultative Committee for Space Data Systems (CCSDS) and represents the consensus technical agreement of the participating CCSDS Member Agencies. The procedure for review and authorization of CCSDS documents is detailed in the Procedures Manual for the Consultative Committee for Space Data Systems, and the record of Agency participation in the authorization of this document can be obtained from the CCSDS Secretariat at the address below.

This document is published and maintained by:

CCSDS Secretariat
Space Communications and Navigation Office, 7L70
Space Operations Mission Directorate
NASA Headquarters
Washington, DC 20546-0001, USA

STATEMENT OF INTENT

The Consultative Committee for Space Data Systems (CCSDS) is an organization officially established by the management of its members. The Committee meets periodically to address data systems problems that are common to all participants, and to formulate sound technical solutions to these problems. Inasmuch as participation in the CCSDS is completely voluntary, the results of Committee actions are termed Recommendations and are not in themselves considered binding on any Agency.

CCSDS Recommendations take two forms: Recommended Standards that are prescriptive and are the formal vehicles by which CCSDS Agencies create the standards that specify how elements of their space mission support infrastructure shall operate and interoperate with others; and Recommended Practices that are more descriptive in nature and are intended to provide general guidance about how to approach a particular problem associated with space mission support. This Recommended Practice is issued by, and represents the consensus of, the CCSDS members. Endorsement of this Recommended Practice is entirely voluntary and does not imply a commitment by any Agency or organization to implement its recommendations in a prescriptive sense.

No later than five years from its date of issuance, this Recommended Practice will be reviewed by the CCSDS to determine whether it should: (1) remain in effect without change; (2) be changed to reflect the impact of new technologies, new requirements, or new directions; or (3) be retired or canceled.

In those instances when a new version of a Recommended Practice is issued, existing CCSDS-related member Practices and implementations are not negated or deemed to be non- CCSDS compatible. It is the responsibility of each member to determine when such Practices or implementations are to be modified. Each member is, however, strongly encouraged to direct planning for its new Practices and implementations towards the later version of the Recommended Practice.

FOREWORD

This document is a technical Recommendation to use as the basis for providing audit and certification of the trustworthiness of digital repositories. It provides a detailed specification of criteria by which digital repositories shall be audited.

The OAIS Reference Model (reference [1]) contained a roadmap which included the need for a certification standard. The initial work was to be carried out outside CCSDS and then brought back into CCSDS to take into the standard.

In 2003, Research Libraries Group (RLG) and the National Archives and Records Administration (NARA) created a joint task force to specifically address digital repository certification. That task force published Trustworthy Repositories Audit & Certification: Criteria and Checklist (TRAC—reference [B3]), on which this Recommended Practice is based.

Through the process of normal evolution, it is expected that expansion, deletion, or modification of this document may occur. This Recommended Practice is therefore subject to CCSDS document management and change control procedures, which are defined in the Procedures Manual for the Consultative Committee for Space Data Systems. Current versions of CCSDS documents are maintained at the CCSDS Web site: http://www.ccsds.org/

Questions relating to the contents or status of this document should be addressed to the CCSDS Secretariat at the address indicated on page i.

DOCUMENT CONTROL

Document     Title                              Date      Status
CCSDS        Audit and Certification of         September Original issue
652.0-M-1    Trustworthy Digital Repositories,  2011
             Recommended Practice,
             Issue 1

Contents

INTRODUCTION

PURPOSE AND SCOPE

The main purpose of this document is to define a CCSDS Recommended Practice on which to base an audit and certification process for assessing the trustworthiness of digital repositories. The scope of application of this document is the entire range of digital repositories.

APPLICABILITY

This document is meant primarily for those responsible for auditing digital repositories and also for those who work in or are responsible for digital repositories seeking objective measurement of the trustworthiness of their repository. Some institutions may also choose to use these metrics during a design or redesign process for their digital repository.

RATIONALE

In 1996 the Task Force on Archiving of Digital Information (reference [B1]) declared, ‘a critical component of digital archiving infrastructure is the existence of a sufficient number of trusted organizations capable of storing, migrating, and providing access to digital collections’. The task force saw that ‘trusted’ or trustworthy organizations could not simply identify themselves. To the contrary, the task force declared, ‘a process of certification for digital archives is needed to create an overall climate of trust about the prospects of preserving digital information’.

Work in articulating responsible digital archiving infrastructure was furthered by the development of the Open Archival Information System (OAIS) Reference Model (reference [1]). Designed to create a consensus on ‘what is required for an archive to provide permanent or indefinite long-term preservation of digital information’, the OAIS addressed fundamental questions regarding the long-term preservation of digital materials that cut across domain-specific implementations. The reference model (ISO 14721) provides a common conceptual framework describing the environment, functional components, and information objects within a system responsible for the long-term preservation of digital materials. Long before it became an approved standard in 2002, many in the cultural heritage community had adopted OAIS as a model to better understand what would be needed from digital preservation systems.

Institutions began to declare themselves ‘OAIS-compliant’ to underscore the trustworthiness of their digital repositories. However, there was no established understanding of ‘OAIS- compliance’ beyond being able to apply OAIS terminology to describe their archive, despite there being a compliance section in OAIS which specifies the need to support the model of information and fulfilling the mandatory responsibilities.

Claims of trustworthiness are easy to make but are thus far difficult to justify or objectively prove. Establishing more clear criteria detailing what a trustworthy repository is and is not has become vital.

In 2002, Research Libraries Group (RLG) and Online Computer Library Center (OCLC) jointly published Trusted Digital Repositories: Attributes and Responsibilities (reference [B2]), which further articulated a framework of attributes and responsibilities for trusted, reliable, sustainable digital repositories capable of handling the range of materials held by large and small cultural heritage and research institutions. The framework was broad enough to accommodate different situations, technical architectures, and institutional responsibilities while providing a basis for the expectations of a trusted repository. The document has proven to be useful for institutions grappling with the long-term preservation of cultural heritage resources and has been used in combination with the OAIS as a digital preservation planning tool. As a framework, this document concentrated on high-level organizational and technical attributes and discussed potential models for digital repository certification. It refrained from being prescriptive about the specific nature of rapidly emerging digital repositories and archives and instead reiterated the call for certification of digital repositories, recommending the development of certification program and articulation of auditable criteria.

OAIS included a Roadmap for follow-on standards which included ‘standard(s) for accreditation of archives’. It was agreed that RLG and National Archives and Records Administration (NARA) would take this particular topic forward and the later published the TRAC (reference [B3]) document which combined ideas from OAIS (reference [1]) and Trusted Digital Repositories: Attributes and Responsibilities (TDR—reference [B2]). The current document follows on from TRAC in order to produce an ISO standard.

STRUCTURE OF THIS DOCUMENT

This document is divided into informative and normative sections and annexes. Sections 1-2 of this document are informative and give a high-level view of the rationale, the conceptual environment, some of the important design issues, and an introduction to the terminology and concepts.

  • Section 1 gives purpose and scope, rationale, a view of the overall document structure, and the acronym list, glossary, and reference list for this document.
  • Section 2 provides an overview of audit and certification criteria, ideas about evidence to support claims, and a discussion of related standards. Metrics are empirically derived and consistent measures of effectiveness. When evaluated together, metrics can be used to judge the overall suitability of a repository to be trusted to provide a preservation environment that is consistent with the goals of the OAIS. Separately, individual metrics or measures can be used to identify possible weaknesses or pending declines in repository functionality.
  • Sections 3 to 5 provide the normative metrics against which a digital repository may be judged. These sections provide metrics grouped as follows:
    • covers Organizational Infrastructure;
    • covers Digital Object Management;
    • covers Infrastructure and Security Risk Management.

Each section groups metrics into one or more subsections.

  • Security considerations are discussed in annex A.
  • Annex B provides Informative References.

DEFINITIONS

ACRONYMS AND ABBREVIATIONS

AIP Archival Information Package (defined in reference [1])
CCSDS Consultative Committee for Space Data Systems
DEDSL Data Entity Specification Language (see reference [B7])
DIP Dissemination Information Package (defined in reference [1])
FITS Flexible Image Transport System
GIS Geographic Information System
ISO International Organization for Standardization
OAIS Open Archival Information System (see reference [1])
PDI Preservation Description Information (defined in reference [1])
SIP Submission Information Package (defined in reference [1])
TEI Text Encoding Initiative
UML Unified Modeling Language
XML Extensible Markup Language

TERMINOLOGY

Digital preservation interests a range of different communities, each with a distinct vocabulary and local definitions for key terms. A glossary is included in this document, but it is important to draw attention to the usage of several key terms. In general, key terms in this document have been adopted from the OAIS Reference Model. One of the great strengths of the OAIS Reference Model has been to provide a common terminology made up of terms ‘not already overloaded with meaning so as to reduce conveying unintended meanings’ (reference [1]). Because the OAIS has become a foundational document for digital preservation, the common terms are well understood and are therefore used within this document.

The OAIS Reference Model uses ‘digital archive’ to mean the organization responsible for digital preservation. In this document, the term ‘repository’ or phrase ‘digital repository’ is used to convey the same concept in all instances except when quoting from the OAIS. It is important to understand that in all instances in this document, ‘repository’ and ‘digital repository’ are used to convey digital repositories and archives that have, or contribute to, long-term preservation responsibilities and functionality. This document uses the OAIS concept of the ‘Designated Community’. A repository may have a single, generalized ‘Designated Community’ (e.g., every citizen of a country), while other repositories may have several, distinct Designated Communities with highly specialized needs, each requiring different functionality or support from the repository; this document uses the term Designated Community to cover this second case also.

Finally, this document names criteria that, combined, evaluate the trustworthiness of digital repositories and archives.

Glossary

Unless otherwise indicated, other definitions are taken from the OAIS Reference Model (reference [1]).

Access Policy: Written statement, authorized by the repository management, that describes the approach to be taken by the repository for providing access to objects accessioned into the repository. The Access Policy may distinguish between different types of access rights, for example between system administrators, Designated Communities, and general users. Practice: Actions conducted to execute procedures. Practices are measured by logs or other evidence that record actions completed.

Preservation Implementation Plan: A written statement, authorized by the management of the repository, that describes the services to be offered by the repository for preserving objects accessioned into the repository in accordance with the Preservation Policy.

NOTE

The relationship between these terms is motivated as follows. A repository is
assumed to have an overall Repository Mission Statement, part of which will be
concerned with preservation. The Preservation Strategic Plan states how the
mission will be achieved, in general terms with goals and objectives. The
Preservation Policy then declares the range of approaches that the repository will
employ to ensure preservation (that is, to implement the Preservation Strategic
Plan), and finally the Preservation Implementation Plan translates those into
services that the repository must carry out. This is an abstract documentary
model that, in reality, can result in different documents, a different distribution of
subjects between documents, different document names, etc.

Preservation Policy: Written statement, authorized by the repository management, that describes the approach to be taken by the repository for the preservation of objects accessioned into the repository. The Preservation Policy is consistent with the Preservation Strategic Plan.

Preservation Strategic Plan: A written statement, authorized by the management of the repository, that states the goals and objectives for achieving that part of the mission of the repository concerned with preservation. Preservation Strategic Plans may include long-term and short-term plans.

Procedure: A written statement that specifies actions required to complete a service or to achieve a specific state or condition. Procedures specify how various aspects of the relevant Preservation Implementation Plans are to be fulfilled.

Provider (or Submitter): A person or system that submits a digital object to the repository. The Provider can be the Producer.

Repository Mission Statement: A written statement, authorized by the management of the repository, that, among other things, describes the commitment of the organization for the stewardship of digital objects in its custody.

NOMENCLATURE

The following conventions apply for the normative specifications in this Recommended Practice:

a) the words ‘shall’ and ‘must’ imply a binding and verifiable specification;
b) the word ‘should’ implies an optional, but desirable, specification;
c) the word ‘may’ implies an optional specification;
d) the words ‘is’, ‘are’, and ‘will’ imply statements of fact.

NOTE – These conventions do not imply constraints on diction in text that is clearly informative in nature.

CONVENTIONS

The following conventions apply:

  • The term Designated Community may include multiple Designated Communities.
  • Sub-metrics for any section are intended to help clarify and elucidate their superior item. Satisfaction of the sub-metrics provides evidence supporting a claim of compliance with the hierarchically superior items.
  • Each metric has one or more of the following informative pieces of text associated with it:
    • Supporting Text: giving an explanation of why the metric is important;
    • Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement: providing examples of the evidence which might be examined to test whether the repository satisfies the metric;
    • Discussion: clarifications about the intent of the metric.

CONFORMANCE

An archive that conforms to this Recommended Practice shall have satisfied the auditor on each of the requirements.

Conformance to these metrics, as with all other such standards, is a matter of judgment. The supporting organization and practice of auditing will lead to the creation of auditors’ guidelines, as described in the draft ISO 16919.

As described in the referenced ISO documents, the aim of the audit process is to create a process of continuous improvement. Thus the outcome of the audit will not be a simple yes/no but rather a judgment about areas that need improvement.

REFERENCES

The following documents contain provisions which, through reference in this text, constitute provisions of this Recommended Practice. At the time of publication, the editions indicated were valid. All documents are subject to revision, and users of this Recommended Practice are encouraged to investigate the possibility of applying the most recent editions of the documents indicated below. The CCSDS Secretariat maintains a register of currently valid CCSDS documents.

Reference Model for an Open Archival Information System (OAIS). Recommendation
for Space Data System Standards, CCSDS 650.0-B-1. Blue Book. Issue 1.
Washington, D.C.: CCSDS, January 2002. [Also published as ISO 14721:2003.]

NOTE – Informative references are listed in annex B.


OVERVIEW OF AUDIT AND CERTIFICATION CRITERIA

This section provides an overview of some of the key concepts that are incorporated in the design of the metrics in this Recommended Practice.

A TRUSTWORTHY DIGITAL REPOSITORY

At the very basic level, the definition of a trustworthy digital repository must start with ‘a mission to provide reliable, long-term access to managed digital resources to its Designated Community, now and into the future’ (reference [B2]). Expanding the definition has caused great discussion both within and across various groups, from the broad digital preservation community to the data archives or institutional repository communities.

A trustworthy digital repository will understand threats to and risks within its systems. Constant monitoring, planning, and maintenance, as well as conscious actions and strategy implementation will be required of repositories to carry out their mission of digital preservation. All of these present an expensive, complex undertaking that depositors, stakeholders, funders, the Designated Community, and other digital repositories will need to rely on in the greater collaborative digital preservation environment that is required to preserve the vast amounts of digital information generated now and into the future.

Communicating audit results to the public—transparency—will engender more trust, and additional objective audits, potentially leading towards certification, will promote further trust in the repository and the system that supports it. Finally, attaining trustworthy status is not a one-time accomplishment, achieved and forgotten. To retain trustworthy status, a repository will need to undertake a regular cycle of audit and/or certification.

EVIDENCE

As noted in 1.5.4 each metric has associated with it informative text under the heading Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement: providing examples of the evidence which might be examined to test whether the repository satisfies the metric. These examples are illustrative rather than prescriptive, and the lists of possible evidence are not exhaustive.

RELEVANT STANDARDS, BEST PRACTICES, AND CONTROLS

Numerous documents and standards include pieces that are applicable or related to this work. These standards are important to acknowledge and embrace as complementary audit tools. A few examples:

  • The ISO 9000 family of standards (e.g., Quality Management Systems
  • Fundamentals and Vocabulary—reference [B9]) addresses quality assurance components within an organization and system management that, while valuable, were not specifically developed to gauge the trustworthiness of organizations operating digital repositories.
  • Similarly, ISO 17799:2005 (reference [B10]) was developed specifically to address data security and information management systems. Like ISO 9000, it has some very valuable components to it but it was not designed to address the trustworthiness of digital repositories. Its requirements for information security seek data security compliance to a very granular level, but do not address organizational, procedural, and preservation planning components necessary for the long-term management of digital resources.
  • ISO 15489-1:2001 and ISO 15489-2:2001 (references [B11] and [B12]) define a systematic and process-driven approach that governs the practice of records managers and any person who creates or uses records during their business activities, treats information contained in records as a valuable resource and business asset, and protects/preserves records as evidence of actions. Conformance to ISO 15489 requires an organization to establish, document, maintain, and promulgate policies, procedures, and practices for records management, but, by design, addresses records management specifically rather than applying to all types of repositories and archives.
  • Finally, ISO 14721:2003, the Open Archival Information System Reference Model, provides a high-level reference model or framework identifying the participants in digital preservation, their roles and responsibilities, and the kinds of information to be exchanged during the course of deposit and ingest into and dissemination from a digital repository.

It is important to acknowledge that there is real value in knowing whether an institution is certified to related standards or meets other controls that would be relevant to an audit. Certainly, an institution that has undertaken any kind of certification process—even if none of the evaluated components overlap with a digital repository audit—will be better prepared for digital repository certification. And those that have achieved certification in related standards will be able to use those certifications as evidence during the digital repository audit.

ORGANIZATIONAL INFRASTRUCTURE

GOVERNANCE AND ORGANIZATIONAL VIABILITY

The repository shall have a mission statement that reflects a commitment to the preservation of, long term retention of, management of, and access to digital information.

Supporting Text
This is necessary in order to ensure commitment to preservation, retention, management and access at the repository’s highest administrative level.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Mission statement or charter of the repository or its parent organization that specifically addresses or implicitly calls for the preservation of information and/or other resources under its purview; a legal, statutory, or government regulatory mandate applicable to the repository that specifically addresses or implicitly requires the preservation, retention, management and access to information and/or other resources under its purview.

Discussion
The repository’s or its parent organization’s mission statement should explicitly address preservation. If preservation is not among the primary purposes of an organization that houses a digital repository then preservation may not be essential to the organization’s mission. In some instances a repository pursues its preservation mission as an outgrowth of the larger goals of an organization in which it is housed, such as a university or a government agency, and its narrower mission may be formalized through policies explicitly adopted and approved by the larger organization. Government agencies and other organizations may have legal mandates that require they preserve materials, in which case these mandates can be substituted for mission statements, as they define the purpose of the organization. Mission statements should be kept up to date and continue to reflect the common goals and practices for preservation.

The repository shall have a Preservation Strategic Plan that defines the approach the repository will take in the long-term support of its mission.

Supporting Text
This is necessary in order to help the repository make administrative decisions, shape policies, and allocate resources in order to successfully preserve its holdings.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Preservation Strategic Plan; meeting minutes; documentation of administrative decisions which have been made.

Discussion
The strategic plan should be based on the organization’s established mission, and on its defined values, vision and goals. Strategic plans typically cover a particular finite time period, normally in the 3-5 year range.

The repository shall have an appropriate succession plan, contingency plans, and/or escrow arrangements in place in case the repository ceases to operate or the governing or funding institution substantially changes its scope.

Supporting Text
This is necessary in order to preserve the information content entrusted to the repository by handing it on to another custodian in the case that the repository ceases to operate.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Written and credible succession and contingency plan(s); explicit and specific statement documenting the intent to ensure continuity of the repository, and the steps taken and to be taken to ensure continuity; escrow of critical code, software, and metadata sufficient to enable reconstitution of the repository and its content in the event of repository failure; escrow and/or reserve funds set aside for contingencies; explicit agreements with successor organizations documenting the measures to be taken to ensure the complete and formal transfer of responsibility for the repository’s digital content and related assets, and granting the requisite rights necessary to ensure continuity of the content and repository services.

Discussion
A repository’s failure threatens the long-term sustainability of a repository’s information content. It is not sufficient for the repository to have an informal plan or policy regarding where its data goes should a failure occur. A formal plan with identified procedures needs to be in place.

The repository shall monitor its organizational environment to determine when to execute its succession plan, contingency plans, and/or escrow arrangements.

Supporting Text
This is necessary in order to ensure that the repository can recognize when it is necessary to execute those plans.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Administrative policies, procedures, protocols, requirements; budgets and financial analysis documents; fiscal calendars; business plan(s); any evidence of active monitoring and preparedness.

Discussion
The management of a repository should have formal procedures in place to periodically check on the viability of the repository. This periodic check should be used to determine if, or when, to execute the repository’s formal succession plan, contingency plans, and/or escrow arrangements.

The repository shall have a Collection Policy or other document that specifies the type of information it will preserve, retain, manage, and provide access to.

Supporting Text
This is necessary in order that the repository has guidance on acquisition of digital content it will preserve, retain, manage and provide access to.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Collection policy and supporting documents; Preservation Policy, mission, goals and vision of the repository.

Discussion
The collection policy can be used to understand what the repository holds, what it does not hold, and why. The collection policy supports the broader mission of the repository. Without such a policy the repository is likely to collect in a haphazard manner, or store large amounts of low-value digital content. The collection policy helps the organization to identify what digital content it will and will not accept for ingestion. In an organization with a broader mission than preservation of digital content the collection policy helps to define the role of the repository within the larger organizational context.


ORGANIZATIONAL STRUCTURE AND STAFFING

The repository shall have identified and established the duties that it needs to perform and shall have appointed staff with adequate skills and experience to fulfill these duties.

Discussion
Staffing of the repository should be by personnel with the required training and skills to carry out the activities of the repository. The repository should be able to document through development plans, organizational charts, job descriptions, and related policies and procedures that the repository is defining and maintaining the skills and roles that are required for the sustained operation of the repository.


The repository shall have identified and established the duties that it needs to perform.

Supporting Text
This is necessary in order to ensure that the repository can complete all tasks associated with the long-term preservation and management of the data objects.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
A staffing plan; competency definitions; job descriptions; staff professional development plans; certificates of training and accreditation; plus evidence that the repository reviews and maintains these documents as requirements evolve.

Discussion
Preservation depends upon a range of activities from maintaining hardware and software to migrating content and storage media to negotiating intellectual property rights agreements. In order to ensure long-term sustainability, a repository must be aware of all required activities and demonstrate that it can successfully complete them. The repository can achieve these aims by, for example, identifying the competencies and skill sets required to carry out its activities over time—e.g., archival training, technical skills, and legal expertise.

The repository shall have the appropriate number of staff to support all functions and services.

Supporting Text
This is necessary in order to ensure repository staffing levels are adequate for preserving the digital content and providing a secure, quality repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Organizational charts; definitions of roles and responsibilities; comparison of staffing levels to industry benchmarks and standards.

Discussion
The repository should determine the appropriate number and level of staff that corresponds to requirements and commitments. The repository should also demonstrate how it evaluates staff effectiveness and suitability to support its functions and services.

The repository shall have in place an active professional development program that provides staff with skills and expertise development opportunities.

Supporting Text
This is necessary to ensure that staff skill sets evolve as the repository technology and preservation procedures change.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Professional development plans and reports; training requirements and training budgets, documentation of training expenditures (amount per staff); performance goals and documentation of staff assignments and achievements, copies of certificates awarded.

Discussion
Technology and general practices for digital preservation will continue to change, as will the requirements of its Designated Community, so the repository must ensure that its staff’s skill sets evolve. Ideally the repository will meet this requirement through a lifelong learning approach to developing and retaining staff.


PROCEDURAL ACCOUNTABILITY AND PRESERVATION POLICY FRAMEWORK

Documentation assures stakeholders (consumers, producers, and contributors of digital content) that the repository is meeting its requirements and fully performing its role as a trustworthy digital repository. A repository must create documentation that reflects its Mission Statement and Strategic Plan and captures its normal activities. This entails documenting all repository processes, decision-making, and goal setting. Documentation is provided so that the activities of the repository will be understood by stakeholders and management. It ensures that repository policies and procedures are carried out in approved, consistent ways, resulting in long-term preservation and access to digital content in its care. Certification, the clearest indicator of a repository’s sound and standards-based practice, is facilitated by procedural accountability and documentation.

The repository shall have defined its Designated Community and associated knowledge base(s) and shall have these definitions appropriately accessible.

Supporting Text
This is necessary in order that it is possible to test that the repository meets the needs of its Designated Community.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
A written definition of the Designated Community.

Discussion
The Designated Community is defined as ‘an identified group of potential Consumers who should be able to understand a particular set of information. The Designated Community may be composed of multiple user communities. A Designated Community is defined by the archive and this definition may change/evolve over time’ (OAIS Glossary, reference [1]). Examples of Designated Community definitions include:

  • General English-reading public educated to high school and above, with access to a Web Browser (HTML 4.0 capable).
  • For Geographic Information System (GIS) data: GIS researchers—undergraduates and above—having an understanding of the concepts of Geographic data and having access to current (2005, USA) GIS tools/computer software, e.g., ArcInfo (2005).
  • Astronomer (undergraduate and above) with access to Flexible Image Transport System (FITS) software such as FITSIO, familiar with astronomical spectrographic instruments.
  • Student of Middle English with an understanding of Text Encoding Initiative (TEI) encoding and access to an XML rendering environment.
    • Variant 1: Cannot understand TEI;
    • Variant 2: Cannot understand TEI and no access to XML rendering environment;
    • Variant 3: No understanding of Middle English but does understand TEI and XML.
  • The repository has defined the external parties, and its assets, owners, and uses. Two groups: the publishers of scholarly journals and their readers, each of whom have different rights to access material and different services offered to them.

Some repositories may call themselves, for example, a ‘dark archive’, an archive that has a policy not to allow consumers to get access to its contents for a certain period of time, but they would nevertheless need a Designated Community.

The repository shall have Preservation Policies in place to ensure its Preservation Strategic Plan will be met.

Supporting Text
This is necessary in order to ensure that the repository can fulfill that part of its mission related to preservation.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Preservation Policies; Repository Mission Statement.

Discussion
Repository policies show how the repository fulfills the requirements of the repository’s preservation strategic plan. For example, a preservation strategic plan may contain a requirement that the repository ‘comply with current preferred preservation standards’. The preservation policy might then require that the repository ‘monitor current preservation standards and ensure repository compliance with the preferred preservation standards’. In another example the repository may be required by the strategic plan to keep its data understandable. The preservation policy might then include information about the expected level of understandability by the repository’s Designated Community for each Archival Information Package.

The repository shall have mechanisms for review, update, and ongoing development of its Preservation Policies as the repository grows and as technology and community practice evolve.

Supporting Text
This is necessary in order that the repository has up-to-date, complete policies and procedures in place that reflect the current requirements and practices of its community(ies) for preservation.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Current and past written documentation in the form of Preservation Policies, Preservation Strategic Plans and Preservation Implementation Plans, procedures, protocols, and workflows; specifications of review cycles for documentation; documentation detailing reviews, surveys and feedback. If documentation is embedded in system logic, functionality should demonstrate the implementation of policies and procedures.

Discussion
Preservation Policies capture organizational commitments and intents for staffing, security and other preservation-related concerns. Preservation Implementation Plans address preservation activities and practices such as transfer, submission, quality control, storage management, metadata management, and access and rights management. The repository may find it beneficial to maintain all versions of the preservation policies (e.g., outdated versions are clearly identified and maintained in some organized way) in order to document the results of monitoring for new developments, showing the repository’s responsiveness to prevailing standards and practice, emerging requirements, and standards that are specific to the domain, if appropriate, and similar developments. Qualified staff and peers are an important part of the review process, as they help to update and expand these documents. The policies should be understandable by the repository staff in order for them to carry out their work. Preservation Policies and procedures must be demonstrated to be understandable and implementable.

The repository shall have a documented history of the changes to its operations,

procedures, software, and hardware.

Supporting Text
This is necessary in order to provide an ‘audit trail’ through which stakeholders can identify and trace decisions made by the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Capital equipment inventories; documentation of the acquisition, implementation, update, and retirement of critical repository software and hardware; file retention and disposal schedules and policies, copies of earlier versions of policies and procedures; minutes of meetings.

Discussion
This documentation may include decisions about the organizational and technical infrastructure. Documentation of or interviews with appropriate staff who can explain repository practices and workflow should be available.

The repository shall commit to transparency and accountability in all actions supporting the operation and management of the repository that affect the preservation of digital content over time.

Supporting Text
This is necessary because transparency, in the sense of being available to anyone who wishes to know, is the best assurance that the repository operates in accordance with accepted standards and practices.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Reports of financial and technical audits and certifications; disclosure of governance documents, independent program reviews, and contracts and agreements with providers of funding and critical services.

Discussion
If the repository uses software to capture information about its history, it should be able to demonstrate these tracking tools. Where appropriate, the history is linked to relevant preservation strategies and describes potential effects on preserving digital content. This requirement does not mean that the organization must make information which would make it vulnerable to competitors available, but rather that the organization commits to disclosing its methods for preserving digital content at least to the Designated Community or other appropriate stakeholder in order to demonstrate that it is meeting all current preservation requirements.

The repository shall define, collect, track, and appropriately provide its information integrity measurements.

Supporting Text
This is necessary in order to provide documentation that it has developed or adapted appropriate measures for ensuring the integrity of its holding.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Written definition or specification of the repository’s integrity measures (for example, computed checksum or hash value); documentation of the procedures and mechanisms for monitoring integrity measurements and for responding to results of integrity measurements that indicate digital content is at risk; an audit process for collecting, tracking, and presenting integrity measurements; Preservation Policy and workflow documentation.

Discussion
The mechanisms to measure integrity will evolve as technology evolves. The repository may provide documentation that it has developed or adapted appropriate measures for ensuring the integrity of its holdings. If protocols, rules and mechanisms are embedded in the repository software, there should be some way to demonstrate the implementation of integrity measures.

The repository shall commit to a regular schedule of self-assessment and external certification.

Supporting Text
This is necessary in order to ensure the repository continues to be trustworthy and there is no threat to its content.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Completed, dated checklists from self-assessments and/or third-party audits; certificates awarded for compliance with relevant ISO standards; timetables and evidence of adequate budget allocations for future certification.

Discussion
A one-time check on trustworthiness is not adequate because many things will change over time. A longer term commitment should be demonstrated.


FINANCIAL SUSTAINABILITY

The repository shall have short- and long-term business planning processes in place to sustain the repository over time.

Supporting Text
This is necessary in order to ensure the viability of the repository over the period of time it has promised to provide access to its contents for its Designated Community.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Up-to-date, multi-year strategic, operating and/or business plans; audited annual financial statements; financial forecasts with multiple budget scenarios; contingency plans; market analysis.

Discussion
An annual business planning process is commonly accepted as the standard for most organizations.

The repository shall have financial practices and procedures which are transparent, compliant with relevant accounting standards and practices, and audited by third parties in accordance with territorial legal requirements.

Supporting Text
This is necessary in order to guard against malfeasance or other untoward activity that might threaten the economic viability of the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Demonstrated dissemination requirements for business planning and practices; citations to and/or examples of accounting and audit requirements, standards, and practice; audited annual financial statements.

Discussion
The repository cannot simply claim transparency, but should show that it adjusts its business practices to keep them transparent, compliant, and auditable. Confidentiality requirements may prohibit making information about the repository’s finances public, but the repository should be able to demonstrate that it is satisfying the needs of its Designated Community.


The repository shall have an ongoing commitment to analyze and report on financial risk, benefit, investment, and expenditure (including assets, licenses, and liabilities).

Supporting Text
This is necessary in order to demonstrate that the repository has identified and documented these categories, and actively manages them, including identifying and responding to risks, describing and leveraging benefits, specifying and balancing investments, and anticipating and preparing for expenditures.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Risk management documents that identify perceived and potential threats and planned or implemented responses (a risk register); technology infrastructure investment planning documents; cost/benefit analyses; financial investment documents and portfolios; requirements for and examples of licenses, contracts, and asset management; evidence of revision based on risk.

Discussion
The repository should have a goal of maintaining an appropriate balance between risk and benefits, investment and return.

CONTRACTS, LICENSES, AND LIABILITIES

The repository shall have and maintain appropriate contracts or deposit agreements for digital materials that it manages, preserves, and/or to which it provides access.

Supporting Text
This is necessary in order to ensure that the repository has the rights and authorizations needed to enable it to collect and preserve digital content over time, make that information available to its Designated Community, and defend those rights when challenged.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Properly signed and executed deposit agreements and licenses in accordance with local, national, and international laws and regulations; policies on third-party deposit arrangements; definitions of service levels and permitted uses; repository policies on the treatment of ‘orphan works’ and copyright dispute resolution; reports of independent risk assessments of these policies; procedures for regularly reviewing and maintaining agreements, contracts, and licenses.

Discussion
Repositories may need to show evidence that their contracts are being followed. This is especially important for those with third-party deposit arrangements. These arrangements may require the repository to guarantee that relevant contracts, licenses, or deposit agreements express rights, responsibilities, and expectations of each party. Contracts and formal deposit agreements should be legitimate; that is, they need to be countersigned and current. When the relationship between depositor and repository is less formal (e.g., a faculty member depositing work in an academic institution’s preservation repository), documentation articulating the repository’s capabilities and commitments should be provided to each depositor. Repositories engaged in Web harvesting may find this requirement difficult because of the way in which Web-based information is harvested/captured for long- term preservation, and so contracts or deposit agreements are rarely required. Some repositories capture, manage, and preserve access to this material without written permission from the content creators. Others go through the very time-consuming and costly process of contacting content owners before capturing and ingesting information. Ideally, agreements are tracked, linked, managed, and made accessible in a contracts database.

The repository shall have contracts or deposit agreements which specify and transfer all necessary preservation rights, and those rights transferred shall be documented.

Supporting Text
This is necessary in order to have sufficient control of the information for preservation and limit the repository’s exposure to liability or legal and financial harm.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Contracts, deposit agreements; specification(s) of rights transferred for different types of digital content (if applicable); policy statements on requisite preservation rights.

Discussion
Because the right to change or alter digital information is often restricted by law to the creator, it is important that digital repository contracts and agreements address the need to be able to work with and potentially modify digital objects to keep them accessible. Repository agreements with depositors must specify and/or transfer to the repository certain rights enabling appropriate and necessary preservation actions for the digital objects within the repository. Because legal negotiations can take time, potentially preventing or slowing the ingest of digital objects at risk, it is acceptable for a digital repository to take in or accept digital objects even with only minimal preservation rights using an open-ended agreement and then deal with expanding to detailed rights later.

The repository shall have specified all appropriate aspects of acquisition, maintenance, access, and withdrawal in written agreements with depositors and other relevant parties.

Supporting Text
This is necessary in order to ensure that the respective roles of repository, producers, and contributors in the depositing of digital content and transfer of responsibility for preservation are understood and accepted by all parties.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Properly executed submission agreements, deposit agreements, and deeds of gift; written standard operating procedures.

Discussion
The deposit agreement specifies all aspects of these issues that are necessary for the repository to carry out its function. There may be a single agreement covering all deposits, or specific agreements for each deposit, or a standard agreement supplemented by special conditions for some deposits. These special conditions may add to the standard agreement or override some aspects of the standard agreement. Agreements may need to cover restrictions on access and will need to cover all property rights in the digital objects. Agreements may place responsibilities on depositors, such as ensuring that Submission Information Packages (SIPs) conform to some pre-agreed standards, and may allow repositories to refuse SIPs that do not meet these standards. Other repositories may take responsibility for fixing errors in SIPs. The division of responsibilities must always be clear. Agreements, written or otherwise, may not always be necessary. The burden of proof is on the repository to demonstrate that it does not need such agreements because, for instance, it has a legal mandate for its activities. An agreement should include, at a minimum, property rights, access rights, conditions for withdrawal, level of security, level of finding aids, SIP definitions, time, volume, and content of transfers. One example of a standard to follow for this is the CCSDS/ISO Producer-Archive Interface Methodology Abstract Standard (reference [B4]).

The repository shall have written policies that indicate when it accepts preservation responsibility for contents of each set of submitted data objects.

Supporting Text
This is necessary in order to avoid misunderstandings between the repository and producer/depositor as to when and how the transfer of responsibility for the digital content occurs.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Properly executed submission agreements, deposit agreements, and deeds of gift; confirmation receipt sent back to producer/depositor.

Discussion
If this requirement is not met, there is a risk that, for example, the original is erased before the repository has taken responsibility for the submitted data objects. Without the understanding that the repository has already taken preservation responsibility for the SIP, there is the risk that the producer/depositor may make changes to the data and these would not be properly preserved since they had already been ingested by the repository. For example, for convenience the repository could receive a copy of raw science data from the instrument at the same time the science team gets it, but the science team would have responsibility for it until they turn over responsibility to the final repository. Repositories that report back to their depositors generally will mark this acceptance with some form of notification (for example, confirmation receipts) to the depositor. (This may depend on repository responsibilities as designated in the depositor agreement.) A repository may mark the transfer by sending a formal document, often a final signed copy of the transfer agreement, back to the depositor signifying the completion of the transformation from SIP to AIP process. Other approaches are equally acceptable. Brief daily updates may be generated by a repository that only provides annual formal transfer reports.

The repository shall have policies in place to address liability and challenges to ownership/rights.

Supporting Text
This is necessary in order to minimize potential liability and challenges to the rights of the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
A definition of rights, licenses, and permissions to be obtained from producers and contributors of digital content; citations to relevant laws and regulations; policy on responding to challenges; documented track record for responding to challenges in ways that do not inhibit preservation; records of relevant legal advice sought and received.

Discussion
The repository’s Preservation Policies and Preservation Implementation Plans and mechanisms should be vetted by appropriate institutional authorities and/or legal experts to ensure that responses to challenges adhere to relevant laws and requirements, and that the potential liability for the repository is minimized.

The repository shall track and manage intellectual property rights and restrictions on use of repository content as required by deposit agreement, contract, or license.

Supporting Text
This is necessary in order to allow the repository to track, act on, and verify rights and restrictions related to the use of the digital objects within the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
A Preservation Policy statement that defines and specifies the repository’s requirements and process for managing intellectual property rights; depositor agreements; samples of agreements and other documents that specify and address intellectual property rights; documentation of monitoring by repository over time of changes in status and ownership of intellectual property in digital content held by the repository; results from monitoring, metadata that captures rights information.

Discussion
The repository should have a mechanism for tracking licenses and contracts to which it is obligated. Whatever the format of the tracking system, it must be sufficient for the institution to track, act on, and verify rights and restrictions related to the use of the digital objects within the repository.


DIGITAL OBJECT MANAGEMENT

INGEST: ACQUISITION OF CONTENT

The repository shall identify the Content Information and the Information Properties that the repository will preserve.

Supporting Text
This is necessary in order to make it clear to funders, depositors, and users what responsibilities the repository is taking on and what aspects are excluded. It is also a necessary step in defining the information which is needed from the information producers or depositors.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Mission statement; submission agreements/deposit agreements/deeds of gift; workflow and Preservation Policy documents, including written definition of properties as agreed in the deposit agreement/deed of gift; written processing procedures; documentation of properties to be preserved.

Discussion
This process begins in general with the repository’s mission statement and may be further specified in pre-accessioning agreements with producers or depositors (e.g., producer-archive agreements) and made very specific in deposit or transfer agreements for specific digital objects and their related documentation. For example, one repository may only commit to preserving the textual content of a document and not its exact appearance on a screen. Another may wish to preserve the exact appearance and layout of textual documents, while others may choose to keep the units of the measurement of data fields and to normalize the data during the ingest process. If unique identifiers are associated with digital objects before ingest, they may also be properties that need to be preserved.

The repository shall have a procedure(s) for identifying those Information Properties that it will preserve.

Supporting Text
This is necessary to establish a clear understanding with depositors, funders, and the repository’s Designated Communities how the repository determines and checks what the characteristics and properties of preserved items will be over the long term. These procedures will be necessary to confirm authenticity or to identify erroneous claims of authenticity of the preserved digital record.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Definitions of the Information Properties which should be preserved; submission agreements/deposit agreements, Preservation Policies, written processing procedures, workflow documentation.

Discussion
These procedure(s) document the methods and factors a repository uses to determine the aspects of different types of Content Information for which it accepts preservation responsibility to its designated communities. For example, a repository’s procedure may be to use file formats in order to determine the properties it will preserve unless otherwise specified in a deposit agreement. In this case, the repository would be able to demonstrate provenance for objects that may have been the same file format when received but are preserved differently over the long term.

The repository shall have a record of the Content Information and the Information Properties that it will preserve.

Supporting Text
This is necessary in order to identify in writing the Content Information of the records for which it has taken preservation responsibility and the Information Properties it has committed to preserve for those records based on their Content Information.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Preservation Policies, processing manuals, collection inventories or surveys, logs of Content Information types, acquired preservation strategies, and action plans.

Discussion
The repository must demonstrate that it establishes and maintains an understanding of its digital collections sufficient to carry out the preservation necessary to persist the properties to which it has committed. The repository can use this information to determine the effectiveness of its preservation activities over time.

The repository shall clearly specify the information that needs to be associated with specific Content Information at the time of its deposit.

Supporting Text
This is necessary in order that there is a clear understanding of what needs to be acquired from the Producer.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Transfer requirements; producer-archive agreements; workflow plans to produce the AIP.

Discussion
For most types of digital objects to be ingested, the repository should have written criteria, prepared by the repository on its own or in conjunction with other parties, that specify exactly what digital object(s) are transferred, what documentation is associated with the object(s), and any restrictions on access, whether technical, regulatory, or donor-imposed. These criteria document what information the repository and its designated communities may expect for digital object(s) upon deposit. The depositor may be a harvesting process created by the repository. The level of precision in these specifications will vary with the nature of the repository’s collection policy and its relationship with creators. For instance, repositories engaged in Web harvesting, or those that rescue digital materials long after their creators have abandoned them, cannot impose conditions on the creators of material, since they are not ‘depositors’ in the usual sense of the word. But Web harvesters can, for instance, decide which metadata elements from the HTTP transactions that captured a site are to be preserved along with the site’s files, and this still constitutes ‘information associated with the digital material’. They may also choose to record the information or decisions—whether taken by humans or by automated algorithms—that led to the site’s being captured. The repository can check what it receives from the producer based on the specifications.

The repository shall have adequate specifications enabling recognition and parsing of the SIPs.

Supporting Text
This is necessary in order to be sure that the repository is able to extract information from the SIPs.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Packaging Information for the SIPs; Representation Information for the SIP Content Data, including documented file format specifications; published data standards; documentation of valid object construction.

Discussion
The repository must be able to determine what the contents of a SIP are with regard to the technical construction of its components. For example, the repository needs to be able to recognize a TIFF file and confirm that it is not simply a file with a filename ending in ‘TIFF’. Another example, would be a website for which the repository would need to be able to recognize and test the validity of the variety of file types (e.g., HTML, images, audio, video, CSS, etc.) that are part of the website. This is necessary in order to confirm: 1) the SIP is what the repository expected; 2) the Content Information is correctly identified; and 3) the properties of the Content Information to be preserved have been appropriately selected.

The repository shall have mechanisms to appropriately verify the identity of the Producer of all materials.

Supporting Text
This is necessary in order to avoid providing erroneous provenance to the information which is preserved.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Legally binding submission agreements/deposit agreements/deeds of gift, evidence of appropriate technological measures; logs from procedures and authentications.

Discussion
The repository’s written standard operating procedures and actual practices must ensure the digital objects are obtained from the expected depositor. Examples of a Producer include persons, organizations, corporate entities, or harvesting processes. Different repositories will adopt different levels of proof needed; the Designated Community should have the opportunity to review the evidence.

The repository shall have an ingest process which verifies each SIP for completeness and correctness.

Supporting Text
This is necessary in order to detect and correct errors in the SIP when created and potential transmission errors between the depositor and the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Appropriate Preservation Policy and Preservation Implementation Plan documents and system log files from system(s) performing ingest procedure(s); logs or registers of files received during the transfer and ingest process; documentation of standard operating procedures, detailed procedures, and/or workflows; format registries; definitions of completeness and correctness.

Discussion
Information collected during the ingest process must be compared with information from some other source to verify the correctness of the data transfer and ingest process. Other sources will include technical and descriptive metadata obtained prior to ingest and may also include expectations set by the depositor, the object producer, a format registry, or the repository’s own expectations. The extent to which a repository can determine correctness will depend on what it knows about the SIP and what tools are available for verifying correctness. It can mean simply checking that file formats are what they claim to be (TIFF files are valid TIFF format, for instance), or can imply checking the content. This might involve human checking in some cases, such as confirming that the description of a picture matches the image. This allows the repository to demonstrate that its preserved objects have completely and correctly copied what it intended to copy from the SIPs. It also allows the repository to document reasons for other SIP-related actions such as rejecting the transfer, suspending processing until the missing information is received, or simply reporting the errors. Similarly, the definition of ‘completeness’ should be appropriate to a repository’s activities. If an inventory of files was provided by a producer as part of pre-ingest negotiations, one would expect checks to be carried out against that inventory. Whatever checks are carried out must be consistent with the repository’s own documented definition and understanding of completeness and correctness. One thing that a repository might want to do is check for network drop out or other corruption during the transmission process.

The repository shall obtain sufficient control over the Digital Objects to preserve them.

Supporting Text
This is necessary in order to ensure that the preservation can be accomplished, with physical control, and is authorized, with legal control.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documents showing the level of physical control the repository actually has. A separate database/metadata catalog listing all of the digital objects in the repository and metadata sufficient to validate the integrity of those objects (file size, checksum, hash, location, number of copies, etc.)

Discussion
The repository must obtain complete control of the bits of the digital objects conveyed with each SIP. Sufficient physical and legal control is necessary for the archives to make any changes required by their Preservation Implementation Plan for that data and to distribute it to their consumers. For example, in cases where SIPs only reference digital objects, the repository must also reference the digital objects or preserve them if the current repository is not committed to such preservation.

The repository shall provide the producer/depositor with appropriate responses at agreed points during the ingest processes.

Supporting Text
This is necessary in order to ensure that the producer can verify that there are no inadvertent lapses in communication which might otherwise allow loss of SIPs.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Submission agreements/deposit agreements/deeds of gift; workflow documentation; standard operating procedures; evidence of ‘reporting back’ such as reports, correspondence, memos, or emails.

Discussion
Based on the initial processing plan and agreement between the repository and the producer/depositor, the repository must provide the producer/depositor with progress reports at agreed points throughout the ingest process. Repository responses can range from nothing at all to predetermined, periodic reports of the ingest completeness and correctness, error reports and any final transfer of custody document. Producers/Depositors can request further information on an ad hoc basis when the previously agreed upon reports are insufficient.

The repository shall have contemporaneous records of actions and administration processes that are relevant to content acquisition.

Supporting Text
This is necessary to ensure that such documentation, which may be needed in an audit, is captured and is accurate and authentic.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Written documentation of decisions and/or action taken; preservation metadata logged, stored, and linked to pertinent digital objects, confirmation receipts sent back to providers.

Discussion
These records should be created on or about the time of the actions they refer to and are related to actions taken during the Ingest: Acquisition of Content process (4.1). The records may be automated or may be written by individuals, depending on the nature of the actions described. Where community or international standards are used, the repository must demonstrate that all relevant actions are carried through.


INGEST: CREATION OF THE AIP

The repository shall have for each AIP or class of AIPs preserved by the repository an associated definition that is adequate for parsing the AIP and fit for long- term preservation needs.

Supporting Text
This is necessary to ensure that the AIP and its associated definition, including appropriate Packaging Information, can always be found, processed and managed within the archive.

The repository shall be able to identify which definition applies to which AIP.

Supporting Text
This is necessary to ensure that the appropriate definition is used when parsing/interpreting an AIP.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation clearly linking each AIP, or class of AIPs, to its definition.

Discussion
The repository may use any method for associating the definitions and the AIPs that provides for the continued and continuous linkage of the two entities.

The repository shall have a definition of each AIP that is adequate for long- term preservation, enabling the identification and parsing of all the required components within that AIP.

Supporting Text
This is necessary in order to explicitly show that the AIPs are fit for their intended purpose, that each component of an AIP has been adequately conceived and executed and the plans for the maintenance of each AIP are in place. (See 4.3, Preservation Planning, below.)

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Demonstration of the use of the definitions to extract Content Information and PDI (Provenance, Access Rights, Context, Reference, and Fixity Information) from AIPs. It should be noted that the Provenance of a digital object, for example, may be extended over time to reflect additional preservation actions.

Discussion
Documentation should identify each class of AIP and describe how each is implemented within the repository. Implementations may, for example, involve some combination of files, databases, and/or documents. Documentation shall relate the AIP component’s contents to the related preservation needs of the repository, with enough detail for the repository’s providers and consumers to be confident that the significant properties of AIPs will be preserved. Documentation should clearly show that AIP components such as Representation Information and Provenance can be managed and kept up to date. The repository should clearly identify when new versions of AIPs need to be created in order to keep them fit for purpose. The external dependencies of the AIP should also be recorded.

Definitions should exist for each AIP, or class of AIP if there are many instances of the same type. Repositories that store a wide variety of object types may need a specific definition for each AIP they hold, but it is expected that most repositories will establish class descriptions that apply to many AIPs. It must be possible to determine which definition applies to which AIP. It may also be necessary for the definitions to say something about the semantics or intended use of the AIPs if this could affect long-term preservation decisions. For example, two repositories might both preserve only digital still images, both using multi-image TIFF files as their preservation format. Repository 1 consists entirely of real-world photographic images intended for viewing by people and has a single definition covering all of its AIPs. (The definition may refer to a local or external definition of the TIFF format.) Repository 2 contains some images, such as medical x-rays, that are intended for computer analysis rather than viewing by the human eye, and other images that are like those in Repository 1. Repository 2 should perhaps define two classes of AIPs, even though it only uses one storage format for both. A future preservation action may depend on the intended use of the image— an action that changes the bit-depth of the image in a way that is not perceivable to the human eye may be satisfactory for real-world photographs but not for medical images, for example. An AIP contains these key components: the primary data object to be preserved, its supporting Representation Information (format and meaning of the format elements), and the various categories of Preservation Description Information (PDI) that also need to be associated with the primary data object: Fixity, Provenance, Context, and Reference. There should be a definition of how these categories of information are linked.

The repository shall have a description of how AIPs are constructed from SIPs.

Supporting Text
This is necessary in order to ensure that the AIP(s) adequately represents the information in the SIP(s).

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Process description documents; documentation of the SIP-AIP relationship; clear documentation of how AIPs are derived from SIPs.

Discussion
In some cases, the AIP and SIP will be almost identical apart from packaging and location, and the repository need only state this. In other cases, complex transformations (e.g., data normalization) may be applied to objects during the ingest process, and a precise description of these actions may be necessary to reflect how the AIP(s) has been adequately transformed from the information in the SIP(s). The AIP construction description should include documentation that gives a detailed description of the ingest process for each SIP to AIP transformation, typically consisting of an overview of general processing being applied to all such transformations, augmented with description of different classes of such processing and, when applicable, with special transformations that were needed.

Some repositories may need to produce these complex descriptions case by case. Under such circumstances case diaries or logs of actions taken to produce each AIP should be created and maintained. In these cases, documentation should be mapped to individual AIPs, and the mapping should be available for examination. Other repositories that can run a more production-line approach may have a description for how each class of incoming objects is transformed to produce the AIP. It must be clear which definition applies to which AIP. If, to take a simple example, two separate processes each produce a TIFF file, it must be clear which process was applied to produce a particular TIFF file.

The repository shall document the final disposition of all SIPs. In particular the following aspect must be checked.

The repository shall follow documented procedures if a SIP is not incorporated into an AIP or discarded and shall indicate why the SIP was not incorporated or discarded.

Supporting Text
This is necessary in order to ensure that the SIPs received have been dealt with appropriately, and in particular have not been accidentally lost.

Examples of Ways the Repository can Demonstrate it is Meeting these Requirements
System processing files; disposal records; donor or depositor agreements/deeds of gift; provenance tracking system; system log files; process description documents; documentation of SIP relationship to AIP; clear documentation of how AIPs are derived from SIPs; documentation of standard/process against which normalization occurs; documentation of normalization outcome and how the resulting AIP is different from the SIP(s).

Discussion
The timescale of this process will vary between repositories from seconds to many months, but SIPs must not remain in an unprocessed limbo-like state forever. The accessioning procedures and the internal processing and audit logs should maintain records of all internal transformations of SIPs to demonstrate that they either become AIPs (or part of AIPs) or are disposed of. Appropriate descriptive information should also document the provenance of all digital objects.

The repository shall have and use a convention that generates persistent, unique identifiers for all AIPs.

In particular the following aspects must be checked.

The repository shall uniquely identify each AIP within the repository.
The repository shall have unique identifiers.
The repository shall assign and maintain persistent identifiers of the AIP and its components so as to be unique within the context of the repository.
Documentation shall describe any processes used for changes to such identifiers.
The repository shall be able to provide a complete list of all such identifiers and do spot checks for duplications.
The system of identifiers shall be adequate to fit the repository’s current and foreseeable future requirements such as numbers of objects.

Supporting Text
This is necessary in order to ensure that each AIP can be unambiguously found in the future. This is also necessary to ensure that each AIP can be distinguished from all other AIPs in the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation describing naming convention and physical evidence of its application (e.g., logs).

The repository shall have a system of reliable linking/resolution services in order to find the uniquely identified object, regardless of its physical location.

Supporting Text
This is necessary in order that actions relating to AIPs can be traced over time, over system changes, and over storage changes.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation describing naming convention and physical evidence of its application (e.g., logs).

Discussion
A repository needs to ensure that there is in place an accepted, standard naming convention that identifies its materials uniquely and persistently for use both in and outside the repository. The ‘visibility’ requirement here means ‘visible’ to repository managers and auditors. It does not imply that these unique identifiers need to be visible to end users or that they serve as the primary means of access to digital objects. Ideally, the unique ID lives as long as the AIP; if it does not, there must be traceability. Subsection 4.2.1 requires that the components of an AIP be suitably bound and identified for long-term management, but places no restrictions on how AIPs are identified with files. Thus, in the general case, an AIP may be distributed over many files, or a single file may contain more than one AIP. Therefore identifiers and filenames may not necessarily correspond to each other. Documentation must represent these relationships.

The repository shall have access to necessary tools and resources to provide authoritative Representation Information for all of the digital objects it contains. In particular the following aspects must be checked.

The repository shall have tools or methods to identify the file type of all submitted Data Objects.
The repository shall have tools or methods to determine what Representation Information is necessary to make each Data Object understandable to the Designated Community.
The repository shall have access to the requisite Representation Information.
The repository shall have tools or methods to ensure that the requisite Representation Information is persistently associated with the relevant Data Objects.

Supporting Text
This is necessary in order to ensure that the repository’s digital objects are understandable to the Designated Community.

Examples of Ways the Repository can Demonstrate it is Meeting these Requirements
Subscription or access to registries of Representation Information (including format registries); viewable records in local registries (with persistent links to digital objects); database records that include Representation Information and a persistent link to relevant digital objects.

Discussion
These tools and resources can be held internally or can be shared via, for example, a trusted set of registries. However, this requirement does not demand that each repository has such tools and resources, merely that it has access to them. For example a repository may access external registries. 1 Any such registry is a specialized type of repository, which itself must be certified/trustworthy. The repository may use these types of standardized, authoritative information sources to identify and/or verify the Representation Information components of Content Information and PDI. This will reduce the long-term maintenance costs to the repository and improve quality control. Sometimes there is both general Representation Information (e.g., format information) and specific Representation Information (e.g., meanings of individual fields within a dataset). Often the general information will be available in an external repository, but the local repository may need to maintain the instance-specific information. It is likely that many repositories would wish to keep local copies of relevant Representation Information; however, this may not be practical in all cases. Even where a repository strives to keep all such information locally there may be, for example, a schedule of updates which means that until an update is performed, the local Representation Information is incomplete. This may be regarded as a kind of local caching of, for example, the Representation Information held in registries. Alternatively one may say that in these cases, the use of international registries is not meant to replace local registries but instead serve as a resource to verify or obtain independent, authoritative information about any and all Representation Information. Good practice suggests that any locally held Representation Information should also be made available to other repositories via a trusted registry. In addition any item of Representation Information should itself have adequate Representation Information to ensure that the Designated Community can understand and use the data object being preserved.

The repository shall have documented processes for acquiring Preservation Description Information (PDI) for its associated Content Information and acquire PDI in accordance with the documented processes. In particular the following aspects must be checked.

The repository shall have documented processes for acquiring PDI.
The repository shall execute its documented processes for acquiring PDI.
The repository shall ensure that the PDI is persistently associated with the relevant Content Information.

Supporting Text
This is necessary in order to ensure that an auditable trail to support claims of authenticity is available, that unauthorized changes to the digital holdings can be detected, and that the digital objects can be identified and placed in their appropriate context.

The Unified Digital Formats Registry (UDFR, http://www.gdfr.info/udfr.html) and the UK Digital Curation Centre’s Registry Repository of Representation Information (RRORI, http://registry.dcc.ac.uk) are two emerging examples.

Examples of Ways the Repository can Demonstrate it is Meeting these Requirements
Standard operating procedures; manuals describing ingest procedures; viewable documentation on how the repository acquires and manages Preservation Description Information (PDI); creation of checksums or digests, consulting with Designated Community about Context.

Discussion
PDI is needed not only by the repository to help ensure the Content Information is not corrupted (Fixity) and is findable (Reference Information), but to help ensure the Content Information is adequately understandable by providing a historical perspective (Provenance Information) and by providing relationships to other information (Context Information). The extent of such information needs is best addressed by members of the Designated Community(ies). The PDI must be permanently associated with Content Information.

The repository shall ensure that the Content Information of the AIPs is understandable for their Designated Community at the time of creation of the AIP. In particular the following aspects must be checked.

Repository shall have a documented process for testing understandability for their Designated Communities of the Content Information of the AIPs at their creation.
The repository shall execute the testing process for each class of Content Information of the AIPs.
The repository shall bring the Content Information of the AIP up to the required level of understandability if it fails the understandability testing.

Supporting Text
This is necessary in order to ensure that one of the primary tests of preservation, namely that the digital holdings are understandable by their Designated Community, can be met. (See 4.3 for additional requirements for understandability beyond ingest.)

Examples of Ways the Repository can Demonstrate it is Meeting these Requirements
Test procedures to be run against the digital holdings to ensure their understandability to the defined Designated Community; records of such tests being performed and evaluated; evidence of gathering or identifying Representation Information to fill any intelligibility gaps which have been found; retention of individuals with the discipline expertise.

Discussion
This requirement is concerned with the understandability of the AIP. If the ingested material is not understandable, the repository needs to ingest or make available additional information to make sure that the AIPs are understandable to the Designated Community(ies). For example, if documents are written in a dying language and the Designated Community is no longer able to understand the language the documents are written in, the repository would need to provide additional documentation that would allow the Designated Community to understand the documents (e.g., translations of the documents in a language the Designated Community could understand or dictionaries that would allow the Designated Communities to translate the documents into a language its members understand).

The repository shall verify each AIP for completeness and correctness at the point it is created.

Supporting Text
This is necessary in order to ensure that what is maintained over the long term is as it should be and can be traced to the information provided by the Producers.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Description of the procedure that verifies completeness and correctness of the AIPs; logs of the procedure.

Discussion
The repository should be sure that the AIPs it creates are as they are expected to be by checking them against the associated definition for each AIP or class of AIP (see 4.2.1) and the description of how AIPs are constructed from SIPs (see 4.2.2). If the repository has a standard process to verify SIPs for both completeness and correctness and a demonstrably correct process for transforming SIPs into AIPs, then it simply needs to demonstrate that the initial checks were carried out successfully and that the transformation process was carried out without indicating errors. On the other hand repositories that must create unique processes for many of their AIPs will also need to generate unique methods for validating the completeness and correctness of AIPs. This may include performing tests of some sort on the content of the AIP that can be compared with tests on the SIP. Such tests might be simple (counting the number of records in a file, or performing some simple statistical measure), but they might be complex. Documentation should describe how the completeness and correctness of AIPs is ensured, starting with receipt from the producer and continuing through AIP creation and supporting long-term preservation. Example approaches include the use of checksums, testing that checksums are still correct at various points during ingest and preservation, logs that such checks have been made, and any special tests that may be required for a particular AIP instance or class.

The repository shall provide an independent mechanism for verifying the integrity of the repository collection/content.

Supporting Text
This is necessary to enable the audit of the integrity of the collection as a whole.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation provided for 4.2.1 through 4.2.4; documented agreements negotiated between the producer and the repository (see 4.1.1-4.1.8); logs of material received and associated action (receipt, action, etc.) dates; logs of periodic checks.

Discussion
It is the responsibility of the repository to choose the appropriate mechanism for checking the completeness and correctness of its collections. In general, it is likely that a repository that meets all the previous criteria will satisfy this one without needing to demonstrate anything more. As a separate requirement, it demonstrates the importance of being able to audit the integrity of the collection as a whole. For example, if a repository claims to have all e-mail sent or received by The Yoyodyne Corporation between 1985 and 2005, it has been required to show that:

  • the content it holds came from Yoyodyne’s e-mail servers;
  • it is all correctly transformed into a preservation format;
  • each monthly SIP of e-mail has been correctly preserved, including original unique identifiers such as Message-IDs.

However, it may still have no way of showing whether this really represents all of Yoyodyne’s email. For example, if there is a three-day period with no messages in the repository, is this because Yoyodyne was shut down for those three days, or because the e- mail was lost before the SIP was constructed? This case could be resolved by the repository’s amending its description of the collection, but other cases may not be so straightforward. A familiar mechanism from the world of traditional materials in libraries and archives is an accessions or acquisitions register that is independent of other catalog metadata. A repository should be able to show, for each item in its accessions register, which AIP(s) contain content from that item. Alternatively, it may need to show that there is no AIP for an item, either because ingest is still in progress, or because the item was rejected for some reason. Conversely, any AIP should be able to be related to an entry in the acquisitions register.

The repository shall have contemporaneous records of actions and administration processes that are relevant to AIP creation.

Supporting Text
This is necessary in order to ensure that there is omitted from the record nothing relevant that might be needed to provide an independent means to verify that all AIPs have been properly created in accord with the documented procedures (see 4.2.1 through 4.2.9). It is the responsibility of the repository to justify its practice in this respect.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Written documentation of decisions and/or action taken with timestamps; preservation metadata logged, stored, and linked to pertinent digital objects.

Discussion
These records must be created on or about the time of the actions they refer to and are related to actions associated with AIP creation. The records may be automated or may be written by individuals, depending on the nature of the actions described. Where community or international standards are used, the repository must demonstrate that all relevant actions are carried through.

PRESERVATION PLANNING

The repository shall have documented preservation strategies relevant to its holdings.

Supporting Text
This is necessary in order that it is clear how the repository plans to ensure the information will remain available and usable for future generations and to provide a means to check and validate the preservation work of the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation identifying each preservation risk identified and the strategy for dealing with that risk.

Discussion
These documented preservation strategies will describe how the repository will act upon identified risks, as part of the preservation strategic plan. These preservation strategies and the preservation strategic plan will typically address the degradation of storage media, the obsolescence of media drives, and the obsolescence or inadequacy of Representation Information (including formats) as the knowledge base of the Designated Community changes, and safeguards against accidental or intentional digital corruption. For example, if migration is the chosen approach to some of these issues, there also needs to be Preservation Policies on what triggers a migration and what types of migration are expected to solve the preservation risk identified. The preservation strategy will describe the range of activities that need to be done in case of a migration.

The repository shall have mechanisms in place for monitoring its preservation environment.

Supporting Text
This is necessary so that the repository can react to changes and thereby ensure that the preserved information remains understandable and usable by the Designated Community.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Surveys of the Designated Community of the repository.

Discussion
The repository should show that it has some active mechanism to ensure that the preserved information remains understandable and usable by the Designated Community and that it has mechanisms in place for monitoring and notification when Representation Information (including formats) approaches obsolescence or is no longer viable. For most repositories, the concern will be with the Representation Information used to preserve information, which may include information on how to deal with a file format or software that can be used to render or process it. Sometimes the format needs to change because the repository can no longer deal with it. Sometimes the format is retained and the information about what software is needed to process it needs to change. If the mechanism depends on an external registry, the repository must demonstrate how it uses the information from that registry.

The repository shall have mechanisms in place for monitoring and notification when Representation Information is inadequate for the Designated Community to understand the data holdings.

Supporting Text
This is necessary in order to ensure that the preserved information remains understandable and usable by the Designated Community.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Subscription to a Representation Information registry service; subscription to a technology watch service, surveys amongst its Designated Community members, relevant working processes to deal with this information.

Discussion
The repository must show that it has some active mechanism to warn of impending obsolescence. Obsolescence is determined largely in terms of the knowledge base of the Designated Community.

The repository shall have mechanisms to change its preservation plans as a result of its monitoring activities.

Supporting Text
This is necessary in order for the repository to be prepared for changes in the external environment that may make its current preservation plans a bad choice as the time to implement draws near.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Preservation Plans tied to formal or informal technology watch(es); preservation planning or processes that are timed to shorter intervals (e.g., not more than five years); proof of frequent Preservation Policies and Preservation Plans updates; sections of Preservation Policies that address how plans may be updated and that address how often the plans are required to be reviewed and reaffirmed or updated.

Discussion
The repository should demonstrate or describe how it reacts to information from monitoring, which sometimes requires a repository to change how it deals with the material it holds in ways that could not have been anticipated at an earlier stage. The repository should periodically review its preservation plans and the technology environment and, if necessary, makes changes to those plans to ensure their continued effectiveness. Another possible response to information gathered by monitoring is for the repository to update and create additional Representation Information and/or PDI.

The repository shall have mechanisms for creating, identifying or gathering any extra Representation Information required.

Supporting Text
This is necessary in order to ensure that the preserved information remains understandable and usable by the Designated Community.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Subscription to a format registry service; subscription to a technology watch service; preservation plans.

Discussion
The repository should have mechanisms in place for monitoring and notification when Representation Information (including formats) approaches obsolescence or is no longer viable, and it should be able to show that it has mechanisms to address such notifications.

The repository shall provide evidence of the effectiveness of its preservation activities.

Supporting Text
This is necessary in order to assure the Designated Community that the repository will be able to make the information available and usable over the mid-to-long-term.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Collection of appropriate preservation metadata; proof of usability of randomly selected digital objects held within the system; demonstrable track record for retaining usable digital objects over time; Designated Community polls.

Discussion
The repository should be able to demonstrate the continued preservation, including understandability, of its holdings. This could be evaluated at a number of degrees and depends on the specificity of the Designated Community. If a Designated Community is fairly broad, an auditor could represent the test subject in the evaluation. More specific Designated Communities could require significant efforts.


AIP PRESERVATION

The repository shall have specifications for how the AIPs are stored down to the bit level.

Supporting Text
This is necessary in order to ensure that the information can be extracted from the AIP over the long-term.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation of the format of AIPs; EAST and Data Entity Dictionary Specification Language (DEDSL) descriptions of the data components (see references [B6] and [B7]).

Discussion
The repository should specify the Representation information down to the bit level of each AIP component and must specify how the separate components are packaged together. The Representation Information must be available for each AIP and must be appropriately linked to the AIP. Often, repositories are tempted to describe AIP content only down to a level where a program will then be used to convert the information to a form understandable to their Designated Communities. However, if those programs ever fail to operate, then the information would be lost in all the AIPs that relied on that program.

The repository shall preserve the Content Information of AIPs.

Supporting Text
This is necessary because it is the fundamental mission of a repository to preserve the Content Information for its Designated Communities.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Preservation workflow procedure documentation; workflow procedure documentation; Preservation Policy documents specifying treatment of AIPs and under what circumstances they may ever be deleted; ability to demonstrate the sequence of conversions for an AIP for any particular digital object or group of objects ingested; documentation linking ingested objects and the current AIPs.

Discussion
The repository should be able to demonstrate that the AIPs faithfully reflect the information that was captured during ingest and that any subsequent or future planned transformations will continue to preserve all the required Information Properties of the Content Information. One approach to this requirement assumes that the repository has a policy specifying that AIPs cannot be deleted at any time. This particularly simple and robust implementation preserves links between what was originally ingested, as well as new versions that have been transformed or changed in any way. Depending upon implementation, these newer objects may be completely new AIPs or merely updated AIPs. Either way, persistent links between the ingested object and the resulting AIP should be maintained.

The repository shall actively monitor the integrity of AIPs.

Supporting Text
This is necessary in order to protect the integrity of the archival objects over time.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Fixity information (e.g., checksums) for each ingested digital object/AIP; logs of fixity checks; documentation of how AIPs and Fixity information are kept separate; documentation of how AIPs and accession registers are kept separate.

Discussion
A repository should have logs that show actions taken to check the integrity of archival objects in order to assure funders, producers, and users—and to allow them to audit/validate—that the repository is taking the necessary steps to ensure the long-term integrity of the digital objects. The repository should also document that integrity checks are carried out on a regular basis, in order to catch any changes in AIPs as soon as possible so that corrective action can be taken as soon as possible. The repository should allow interested parties to verify that this is the case.

At present, most repositories deal with this at the level of individual information objects by using a checksum of some form, such as MD5. In this case, the repository should be able, and may want to demonstrate that, the Fixity Information (checksums, and the information that ties them to AIPs) are stored separately or protected separately from the AIPs themselves, so that accidental alteration of the AIP would not also damage the Fixity Information. Also, someone who can maliciously alter an AIP would not likely be able as easily to alter the Fixity Information as well.

The repository shall have contemporaneous records of actions and administration processes that are relevant to storage and preservation of the AIPs.

Supporting Text
This is necessary in order to ensure documentation is not omitted or erroneous or of questionable authenticity.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Written documentation of decisions and/or action taken; preservation metadata logged, stored, and linked to pertinent digital objects.

Discussion
The records may be automated or may be written by individuals, depending on the nature of the actions described. Where community or international standards are used, the repository must demonstrate that all relevant actions are appropriately performed.

The repository shall have procedures for all actions taken on AIPs.

Supporting Text
This is necessary in order to ensure that any actions performed against an AIP do not alter the AIP information in a manner unacceptable to its Designated Communities.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Written documentation describing all actions that can be performed against an AIP.

Discussion
This documentation is normally created during design of the repository. It should detail the normal handling of AIPs, all actions that can be performed against the AIPs, including success and failure conditions and details of how these processes can be monitored.

The repository shall be able to demonstrate that any actions taken on AIPs were compliant with the specification of those actions.

Supporting Text
This is necessary in order to ensure that any actions performed against an AIP do not alter the AIP information in a manner unacceptable to its Designated Communities.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Preservation metadata logged, stored, and linked to pertinent digital objects and documentation of that action; procedural audits of the repository showing that all actions conform to the documented processes.

Discussion
Successful preservation of information in the archive is strongly linked to following established and documented procedures to complete any actions that affect the repository data. The more often ‘special handling’ of repository data occurs and the more often this ‘special handling’ is not overseen in a consistent manner, the more likely that the data held by the repository will be compromised. When procedures are regularly followed, any deviation from procedures that would be likely to cause an alteration in the data will more likely be noticed or, if not noticed, may more likely be able to be corrected, or the timing and likely change could be identified in the future.


INFORMATION MANAGEMENT

The repository shall specify minimum information requirements to enable the Designated Community to discover and identify material of interest.

Supporting Text
This is necessary in order to enable discovery of the repository’s holdings.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Retrieval and descriptive information, discovery metadata, such as Dublin Core, and other documentation describing the object.

Discussion
The repository should be able to deal with the types of requests that will come from a typical user from the Designated Community. A repository does not necessarily have to satisfy every possible request. Retrieval metadata is distinct from descriptive information that describes what has been found.

The repository shall capture or create minimum descriptive information and ensure that it is associated with the AIP.

Supporting Text
This is required in order to ensure that descriptive information is associated with the AIP.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Descriptive metadata; internal or external persistent, unique identifier or locator that is associated with the AIP (see also 4.2.4 about persistent, unique identifier); system documentation and technical architecture; depositor agreements; metadata policy documentation, incorporating details of metadata requirements and a statement describing where responsibility for its procurement falls; process workflow documentation.

Discussion
The repository should show that it associates with each AIP, minimum descriptive information that was either received from the producer or created by the repository. Associating the descriptive information with the object is important, although it does not require one-to-one correspondence, and may not necessarily be stored with the AIP. Hierarchical schemes of description can allow some descriptive elements to be associated with many items.

The repository shall maintain bi-directional linkage between each AIP and its descriptive information.

Supporting Text
This is necessary to ensure that all AIPs can be located and retrieved.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Descriptive metadata; unique, persistent identifier or locator associated with the AIP; documented relationship between the AIP and its metadata; system documentation and technical architecture; process workflow documentation.

Discussion
Repositories must implement procedures to establish and maintain relationships to associate descriptive information for each AIP, and should ensure that every AIP has some descriptive information associated with it and that all descriptive information must point to at least one AIP.

The repository shall maintain the associations between its AIPs and their descriptive information over time.

Supporting Text
This is necessary to ensure that all AIPs can continue to be located and retrieved.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Log detailing ongoing maintenance or checking of the integrity of the data and its relationships to the associated descriptive information, especially following repair or modification of the AIP; legacy descriptive information; persistence of identifier or locator; documented relationship between AIP and its descriptive information; system documentation and technical architecture; process workflow documentation.

Discussion
Repositories must implement procedures that let them know when the relationship between the data and the associated descriptive information is temporarily broken to ensure that it can be restored.

ACCESS MANAGEMENT

The term ‘access’ has a number of different senses, including access by users to the repository system, for example, physical security and user authentication, and the different stages of accessing records (making a request, verifying the rights of the requester, and preparing and sending a Dissemination Information Package [DIP]). This subsection is concerned with all of these. It is divided into two main requirements, one concerned with the existence and implementation of access policies, and one with the capacity of the repository to provide demonstrably authentic objects as DIPs. Thus the first requirement relates to requests initiated by a user and how the repository handles them to ensure that rights and agreements are respected, that security is monitored, that requests are fulfilled, etc. The second requirement relates to what is delivered to the Consumer and the trust that can be placed in it.

It must be understood that the capabilities and sophistication of the access system will vary depending on the repository’s Designated Community and the access mandates of the repository. Because of the variety of repositories and access mandates, these criteria may be subject to questions about applicability and interpretation at a local level.

The repository shall comply with Access Policies.

Supporting Text
This is necessary in order to ensure the repository has fully addressed all aspects of usage which might affect the trustworthiness of the repository, particularly with reference to support of the user community.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Statements of policies that are available to the user communities; information about user capabilities (authentication matrices); logs and audit trails of access requests; explicit tests of some types of access.

Discussion
Depending on the nature of the repository, the Access Policies may cover:

  • statements of what is accessible to which community, and on what conditions;
  • requirements for authentication and authorization of accessors;
  • enforcement of agreements applicable to access conditions;
  • recording of access actions.

Access may be managed partly by computers and partly by humans; checking passports, for instance, before issuing a user ID and password may be an appropriate part of access management for some institutions.

The repository shall log and review all access management failures and anomalies.

Supporting Text
This is necessary in order to identify security threats and access management system failures.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Access logs, capability of the system to use automated analysis/monitoring tools and generate problem/error messages; notes of reviews undertaken or action taken as a result of reviews.

Discussion
A repository should have some automated mechanism to note anomalous or unusual denials and use them to identify either security threats or failures in the access management system, such as valid users’ being denied access. This does not mean looking at every denied access.

The repository shall follow policies and procedures that enable the dissemination of digital objects that are traceable to the originals, with evidence supporting their authenticity.

Supporting Text
This is necessary to establish an auditable chain of authenticity from the AIP to disseminated digital objects.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
System design documents; work instructions (if DIPs involve manual processing); process walkthroughs; production of a sample copy with evidence of authenticity; documentation of community requirements for evidence of authenticity.

Discussion
Authenticity is not an ‘all or nothing’ concept, but is a matter of degree, judged on the basis of evidence. Thus the adequacy of the evidence is of key importance in assessing this requirement.

This requirement ensures that ingest, preservation, and transformation actions do not lose information that would support an auditable trail of authenticity between the original deposited object and the eventual disseminated object. A repository should record the processes to construct the DIPs from the relevant AIPs. This is a key part of establishing that DIPs reflect the content of AIPs, and hence of original material, in a trustworthy and consistent fashion. DIPs may simply be a copy of AIPs, or may result from a simple format transformation of an AIP. But in other cases, they may be derived in complex ways. A user may request a DIP consisting of the title pages from all e-books published in a given period, for instance, which will require these to be extracted from many different AIPs. Or a repository may disseminate automatically generated transcripts of voice recordings. A repository that allows requests for such complex DIPs will need to put more effort into demonstrating how it meets this requirement than a repository that only allows requests for DIPs that correspond to an entire AIP.

This requirement is concerned only with the relation between DIPs and the AIPs from which they are derived; elsewhere the link between the originals SIPs and the AIPs is considered.

The repository shall record and act upon problem reports about errors in data or responses from users.

Supporting Text
This is necessary in order for the users to consider the repository to be a trustworthy source of information.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
System design documents; work instructions (if DIPs involve manual processing); process walkthroughs; logs of orders and DIP production; documentation of error reports and the actions taken.

Discussion
The objective of access management is to ensure that a user receives a usable and correct version of the digital object(s) (i.e., DIP) that he or she requested. A repository should show that any problems that do occur and are brought to its attention are investigated and acted on. Such responsiveness is essential for the repository to be considered trustworthy.


INFRASTRUCTURE AND SECURITY RISK MANAGEMENT

TECHNICAL INFRASTRUCTURE RISK MANAGEMENT

The repository shall identify and manage the risks to its preservation operations and goals associated with system infrastructure.

Supporting Text
This is necessary to ensure a secure and trustworthy infrastructure.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Infrastructure inventory of system components; periodic technology assessments; estimates of system component lifetime; export of authentic records to an independent system; use of strongly community supported software e.g., Apache, iRODS, Fedora); re-creation of archives from backups.

Discussion
The repository should conduct or contract assessments of the risks related to hardware and software infrastructure, and operational procedures. The repository should provide mechanisms that minimize risk from dependencies on proprietary or obsolete system infrastructure and from operational error. The degree of support required relates to the criticality of the subsystem(s) involved in long-term preservation. The repository should maintain a system that is scalable (e.g., able to handle anticipated future volumes of both bytes and files) without a major disruption of the system. The repository should maintain a system that is evolvable. That is, the system should be designed in such a way that major components of the system can be replaced with newer technologies without major disruption of the system as a whole. The repository system should be extensible. That is, the system should be designed to accommodate future formats (media and files) without major disruption of the system as a whole. The repository should be able to export its holdings to a future custodian. The repository should be able to re-create the archives after an operational error that overwrites or deletes digital holdings.

The repository shall employ technology watches or other technology monitoring notification systems.

Supporting Text
This is necessary to track when hardware or software components will become obsolete and migration is needed to new infrastructure.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Management of periodic technology assessment reports. Comparison of existing technology to each new assessment.

Discussion
The objective is to understand when any subsystem poses a risk of obsolescence, and enable planning migration to new technology before interoperability mechanisms are no longer available. This can be driven by proprietary software dependencies (the vendor no longer supports the subsystem component), and by emergence of new protocols (the mechanism for accessing the system has become obsolete and is no longer supported).

The repository shall have hardware technologies appropriate to the services it provides to its designated communities.

Supporting Text
This is necessary to provide expected, contracted, secure, and persistent levels of service including: ease of ingest and dissemination through appropriate depositor and user interfaces and technologies such as upload mechanisms; on-going digital object management; preservation approaches and solutions, such as migration; and system security.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Maintenance of up-to-date Designated Community technology, expectations, and use profiles; provision of bandwidth adequate to support ingest and use demands; systematic elicitation of feedback regarding hardware and service adequacy; maintenance of a current hardware inventory.

Discussion
The repository should be aware of the types of storage, file management, preservation and access services expected by its Designated Community, including where applicable, the types of media to be delivered, and needs to make sure its hardware capabilities can support these services. The objective is to track when changes in service requirements by the designated communities require a corresponding change in the hardware technology, when changes in ingestion policies require expanded capabilities, and when changes in preservation policies require new preservation capabilities. This can be driven by changes in capacity requirements (the time needed to read all media is longer than the media lifetime), by changes in delivery mechanisms (new clients for displaying authentic records), and changes in the number and size of archived records.

The repository shall have procedures in place to monitor and receive notifications when hardware technology changes are needed.

Supporting Text
This is necessary to ensure expected, contracted, secure, and persistent levels of service.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Audits of capacity versus actual usage; audits of observed error rates; audits of performance bottlenecks that limit ability to meet user community access requirements; documentation of technology watch assessments; documentation of technology updates from vendors.

Discussion
The repository should conduct or contract frequent environmental scans regarding hardware status, sources of failure, and interoperability among hardware components. The repository should also be in contact with its hardware vendors regarding technology updates, points of likely failure, and how new components may affect system integration and performance. The objective is to track when changes in service requirements by the designated communities require a corresponding change in the hardware technology, when changes in ingestion policies require expanded capabilities, and when changes in preservation policies require new preservation capabilities. This can be driven by changes in capacity requirements (the time needed to read all media is longer than the media lifetime), by changes in delivery mechanisms (new clients for displaying authentic records), and changes in the number and size of archived records.

The repository shall have procedures in place to evaluate when changes are needed to current hardware.

Supporting Text
This is necessary to ensure that the repository has the capacity to make informed and timely decisions when information indicates the need for new hardware.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Evaluation procedures in place; documented staff expertise in each technology subsystem.

Discussion
Given information from technology watches or other technology monitoring notification systems, the repository should have procedures and expertise to evaluate this data and make sound decisions regarding the need for new hardware. The objective is to track when technology providers have developed subsystems that minimize risk, or that minimize cost, or that improve performance. This is necessary to track emerging technologies and plan for upgrades before capacity limits occur. The evaluation should identify when the risk of using new technology outweighs the expected benefit, and when the new technology is sufficiently mature to minimize risk.

The repository shall have procedures, commitment and funding to replace hardware when evaluation indicates the need to do so.

Supporting Text
This is necessary to ensure hardware replacement in a timely fashion so as to avert system failure or performance inadequacy. Without such a commitment, and more importantly, without escrowed financial resources or a secure funding stream, technology watches and notifications are of little value. The repository must have mechanisms for evaluating the efficacy of the new systems before implementation in the production system.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Statement of commitment to provide expected and contracted levels of service; evidence of ongoing financial assets set aside for hardware procurement; demonstration of cost savings through amortized cost of new system.

Discussion
The objective is to demonstrate that the repository has the ability to incorporate new technology, both financially through funding commitments or cost reduction, and operationally through verification of the capabilities of the new systems.

The repository shall have software technologies appropriate to the services it provides to its designated communities.

Supporting Text
This is necessary to provide expected, contracted, secure, and persistent levels of service including: ease of ingest and dissemination through appropriate depositor and user interfaces and technologies such as upload mechanisms; on-going digital object management; preservation approaches and solutions, such as migration; and system security.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Maintenance of up-to-date Designated Community technology, expectations, and use profiles; provision of software systems adequate to support ingest and use demands; systematic elicitation of feedback regarding software and service adequacy; maintenance of a current software inventory.

Discussion
The objective is to track when changes in service requirements by the designated communities require a corresponding change in the software components, when changes in ingestion policies require support for new data formats and when changes in software technology require new format migration capabilities. This can be driven by changes in access requirements (new clients that require new data formats become preferred), by changes in delivery mechanisms (new data transfer mechanisms), and changes in the number and size of archived records that require more scalable software.

The repository shall have procedures in place to monitor and receive notifications when software changes are needed.

Supporting Text
This is necessary to ensure expected, contracted, secure, and persistent levels of service.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Audits of capacity versus actual usage; audits of observed error rates; audits of performance bottlenecks that limit ability to meet user community access requirements; documentation of technology watch assessments; documentation of software updates from vendors.

Discussion
The objective is to track when changes in service requirements by the designated communities require a corresponding change in the software technology, when changes in ingestion policies require expanded capabilities, and when changes in preservation policies require new preservation capabilities. This can be driven by security updates (vendor supplied corrections to newly identified vulnerabilities), by changes in delivery mechanisms (new software clients for displaying authentic records), and changes in the number and size of archived records (expanded database requirements). The repository should conduct or contract frequent environmental scans regarding software evolution, likely points of failure, and interoperability among the software and hardware components. The repository should also be in contact with its software vendors regarding technology updates, points of likely failure, and how new programs may affect system integration and performance.

The repository shall have procedures in place to evaluate when changes are needed to current software.

Supporting Text
This is necessary to ensure that the repository has the capacity to make informed and timely decisions when information indicates the need for new software.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Evaluation procedures in place; documented staff expertise in each software technology subsystem.

Discussion
Given information from technology watches or other technology monitoring notification systems, the repository should have procedures and expertise to evaluate this data and make sound decisions regarding the need for new software. The objective is to track when technology providers have developed software infrastructure that minimizes risk, or that minimizes cost, or that improves performance. This is necessary to track emerging technologies, and plan for upgrades before capacity limits occur. The evaluation should identify when the risk of using new technology outweighs the expected benefit, and when the new technology is sufficiently mature to minimize risk.

The repository shall have procedures, commitment, and funding to replace software when evaluation indicates the need to do so.

Supporting Text
This is necessary to ensure software replacement in a timely fashion so as to avert system failure or performance inadequacy. Without such a commitment, and more importantly, without escrowed financial resources or a secure funding stream, technology watches and notifications are of little value. The repository must have mechanisms for evaluating the efficacy of the new systems before implementation in the production system.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Statement of commitment to provide expected and contracted levels of service; evidence of ongoing financial assets set aside for software procurement; demonstration of cost savings through amortized cost of new system.

Discussion
The objective is to demonstrate that the repository has the ability to incorporate new technology, both financially through funding commitments or cost reduction, and operationally through verification of the capabilities of the new systems.

The repository shall have adequate hardware and software support for backup functionality sufficient for preserving the repository content and tracking repository functions.

Supporting Text
This is necessary in order to ensure continued access to and tracking of preservation functions applied to the digital objects in their custody.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation of what is being backed up and how often; audit log/inventory of backups; validation of completed backups; disaster recovery plan, policy and documentation; fire drills; testing of backups; support contracts for hardware and software for backup mechanisms; demonstrated preservation of system metadata such as access controls, location of replicas, audit trails, checksum values.

Discussion
The repository should be able to demonstrate the adequacy of the processes, hardware, and software for its backup systems and the full range of ingest, preservation, and dissemination functions required of a repository entrusted with long-term preservation. Simple backup mechanisms must preserve not only the repository main content, but also the system metadata generated by the preservation functions. Repositories need to develop backup plans that ensure their continuity of operations across all failure modes.

The repository shall have effective mechanisms to detect bit corruption or loss.

Supporting Text
This is necessary in order to ensure that AIPs and metadata are uncorrupted or any data losses are detected and fall within the tolerances established by repository policy (see 3.3.5).

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documents that specify bit error detection and correction mechanisms used; risk analysis; error reports; threat analysis; periodic analysis of the integrity of repository holdings.

Discussion
The objective is a comprehensive treatment of the sources of data loss and their real-world complexity. Any data or metadata that is (temporarily) lost should be recoverable from backups. Routine systematic failures must not be allowed to accumulate and cause data loss beyond the tolerances established by the repository policies. Mechanisms such as checksums (MD5 signatures) or digital signatures should be recognized for their effectiveness in detecting bit loss and incorporated into the overall approach of the repository for validating integrity.

The repository shall record and report to its administration all incidents of data corruption or loss, and steps shall be taken to repair/replace corrupt or lost data.

Supporting Text
This is necessary in order to ensure the repository administration is being kept informed of incidents and recovery actions, and to enable identification of sources of data corruption or loss.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Procedures related to reporting incidents to administrators; preservation metadata (e.g., PDI) records; comparison of error logs to reports to administration; escalation procedures related to data loss; tracking of sources of incidents; remediation actions taken to remove sources of incidents.

Discussion
Having effective mechanisms to detect bit corruption and loss within a repository system is critical but it is only the initial step of a larger process. In addition to recording, reporting, and repairing as soon as possible all violations of data integrity, these incidents and the recovery actions and their results must be reported to administrators and made available to all relevant staff. Given identification of the sources of data loss, an assessment of revisions to software and hardware systems, or operational procedures, or management policies is needed to minimize future risk of data loss.

The repository shall have a process to record and react to the availability of new security updates based on a risk-benefit assessment.

Supporting Text
This is necessary in order to protect the integrity of the archival objects from unauthorized changes or deletions.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Risk register (list of all patches available and risk documentation analysis); evidence of update processes (e.g., server update manager daemon); documentation related to the update installations.

Discussion
Decisions to apply security updates are likely to be the outcome of a risk-benefit assessment; security patches are frequently responsible for upsetting alternative aspects of system functionality or performance. It may not be necessary for a repository to implement all software patches, and the application of any must be carefully considered. Each security update implemented by the repository must be documented with details about how it is completed; both automated and manual updates are acceptable. Significant security updates might pertain to software other than core operating systems, such as database applications and Web servers, and these should also be documented. Security updates are not limited to software security updates. Updates to actual hardware or to the hardware system’s firmware are included. Over time it is likely that security updates will also be needed for the repository processes and for its physical security. Although security updates can be considered as a part of the change control, they are identified separately here because there are often outside services that compile and circulate information on security issues and updates. At a minimum, repositories should be monitoring these services to ensure that repository-held data is not subject to compromise by identified threats.

The repository shall have defined processes for storage media and/or hardware change (e.g., refreshing, migration).

Supporting Text
This is necessary in order to ensure that data is not lost when either the media fail or the supporting hardware can no longer be used to access the data.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation of migration processes; policies related to hardware support, maintenance, and replacement; documentation of hardware manufacturer’s expected support life cycles; policies related to migration of records to alternate hardware systems.

Discussion
The repository should have estimates of the access speed and the quantity of information for each type of storage media. Then with estimates of the reliable lifetime of the storage media and information of system loading, etc., the repository can estimate the time required for storage media migration, or refreshing or copying between media without reformatting the bit stream. The repository can then set triggers for initiating the action at an appropriate time so the actions will be completed before data is lost. Copying large quantities of data can take a long time and can affect other system performance metrics. Repositories should also consider the obsolescence of any and all hardware components within the repository system as potential trigger events for migration. Increasingly, long-term, appropriate support for system hardware components is difficult to obtain, exposing repositories to risks and liabilities should they choose to continue to operate the hardware beyond the manufacturer or third-party support warranties. Repositories will likely need to perform media migration off of some types of media onto better supported media based on the estimated lifetime of hardware support rather than on the longer life expected from the media. It is important that the process include a check that the copying has happened correctly.

The repository shall have identified and documented critical processes that affect its ability to comply with its mandatory responsibilities.

Supporting Text
This is necessary in order to ensure that the critical processes can be monitored to ensure that they continue to meet the mandatory responsibilities and to ensure that any changes to those processes are examined and tested.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Traceability matrix between processes and mandatory requirements.

Discussion
Examples of critical processes include data management, access, archival storage, ingest, and security processes. Traceability makes it possible to understand which repository processes are required to meet each of the mandatory responsibilities.

The repository shall have a documented change management process that identifies changes to critical processes that potentially affect the repository’s ability to comply with its mandatory responsibilities.

Supporting Text
This is necessary in order to ensure that the repository can specify not only the current processes, but the prior processes that were applied to the repository holdings.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documentation of change management process; assessment of risk associated with a process change; analysis of the expected impact of a process change; comparison of logs of actual changes to processes versus associated analyses of their impact and criticality.

Discussion
Examples of this would include changes to processes for data management, access, archival storage, ingest, and security. The really important thing is to be able to know what changes were made and when they were made. Traceability makes it possible to understand what was affected by particular changes to the systems. If unintended consequences are later discovered, then having this record may make it possible to reverse the changes or at least to document the changes that were introduced. Change management is a component of the broader topic of configuration management described by ISO 10007:2003 which includes configuration management planning, configuration identification, change control, configuration status accounting and configuration audit. Configuration Management efforts should result in a complete audit trail of decisions and design modifications.

The repository shall have a process for testing and evaluating the effect of changes to the repository’s critical processes.

Supporting Text
This is necessary in order to protect the integrity of the repository’s critical processes such that they continue in their ability to meet the repository’s mandatory requirements.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Documented testing procedures; documentation of results from prior tests and proof of changes made as a result of tests; analysis of the impact of a process change.

Discussion
Changes to critical systems should be, where possible, pre-tested separately, the expected behaviors documented, and roll-back procedures prepared. After changes, the systems should be monitored for unexpected and unacceptable behavior. If such behavior is discovered the changes and their consequences should be reversed. Whole-system testing or unit testing can address this requirement; complex safety-type tests are not required. Testing can be very expensive, but there should be some recognition of the fact that a completely open regime where no changes are ever evaluated or tested will have problems.

The repository shall manage the number and location of copies of all digital objects.

Supporting Text
This is necessary in order to assert that the repository is providing an authentic copy of a particular digital object.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Random retrieval tests; validation of object existence for each registered location; validation of a registered location for each object on storage systems; provenance and fixity checking information; location register/log of digital objects compared to the expected number and location of copies of particular objects.

Discussion
A repository can have different preservation policies for different classes of objects, depending on factors such as the producer, the information type, or its value. Repositories may require a different number of copies for each class, or manage versions needed to meet access requirements. There may be additional identification requirements if the data integrity mechanisms use alternative copies to replace failed copies. The location of each digital object must be described such that the object can be located precisely, without ambiguity. The location can be an absolute physical location or a logical location within a storage media or a storage subsystem. Provenance information about copying and moving the data must be maintained/updated, including the identification of those responsible. This is necessary in order to track chain of custody and assert that the repository is providing an authentic copy of a particular digital object. The repository must be able to distinguish between versions of objects or copies and identical copies. This is necessary in order that a repository can assert that it is providing an authentic copy of the correct version of an object.

The repository shall have mechanisms in place to ensure any/multiple copies of digital objects are synchronized.

Supporting Text
This is necessary in order to ensure that multiple copies of a digital object remain identical, within a time established as acceptable by the repository, and that a copy can be used to replace a corrupted copy of the object.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Synchronization workflows; system analysis of how long it takes for copies to synchronize; procedures/documentation of synchronization processes.

Discussion
The disaster recovery plan should address what to do should a disaster and an update coincide. For example, if one copy of an object is altered and a disaster occurs while the second is being updated, there needs to be a mechanism to assure that the copy will be updated at the first available opportunity. The mechanisms to synchronize copies of digital objects should be able to detect bit corruption and validate fixity checks before synchronization is attempted.


SECURITY RISK MANAGEMENT

The repository shall maintain a systematic analysis of security risk factors associated with data, systems, personnel, and physical plant.

Supporting Text
This is necessary to ensure ongoing and uninterrupted service to the Designated Community.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Repository employs the codes of practice found in the ISO 27000 series of standards system control list; risk, threat, or control analysis.

Discussion
The repository should conduct regular risk assessments and maintain adequate security protection in order to provide expected and contracted levels of service, following codes of practice such as ISO 27000.

‘System’ here refers to more than IT systems, such as hardware, software, communications equipment and facilities, and firewalls. Fire protection and flood detection systems are also significant, as are means to assess personnel, management, and administration procedures, resources, as well as operations and service delivery. Loss of income, budget and reputation are significant threats to overall operations as is loss of mandate. On-going internal and external evaluation should be conducted to assess quality of service and relevance to user community served and periodic financial audits should be secured to ascertain ethical and legal practice and maintenance of required operating funds. Intellectual property rights practices should also be reviewed regularly as well as the repository’s liability for regulatory non-compliance as applicable. The repository should assess its staff’s skills against those required in the evolving digital repository environment and ensure acquisition of new staff or retraining of existing staff as necessary. Regular risk assessment should also address external threats and denial of service attacks and loss of or unacceptable quality of third party services. The repository may conduct overall risk assessments with tools such as DRAMBORA. 2

The repository shall have implemented controls to adequately address each of the defined security risks.

Supporting Text
This is necessary in order to ensure that controls are in place to meet the security needs of the repository.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Repository employs the codes of practice found in the ISO 27000 series of standards; system control list; risk, threat, or control analyses; and addition of controls based on ongoing risk detection and assessment. Repository maintains ISO 17799 certification.

Discussion
The repository should show how it has dealt with its security requirements. If some types of material are more likely to be attacked, the repository will need to provide more protection, for instance. Repositories that have experienced incidents could record such instances, including the times when systems or content were affected and describe procedures that have been put in place to prevent similar occurrences in the future. Repositories may also conduct a variety of disaster drills that may involve their parent organization or the community at large. Contingency plans are especially important and need to be tested, updated, and revised on a regular basis. See http://www.repositoryaudit.eu/.

The repository staff shall have delineated roles, responsibilities, and authorizations related to implementing changes within the system.

Supporting Text
This is necessary in order to ensure that individuals have the authority to implement changes, that adequate resources have been assigned for the effort, and that the responsible individuals will be accountable for implementing such changes.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Repository employs the codes of practice found in the ISO 27000 series of standards; organizational chart; system authorization documentation. Repository maintains ISO 17799 certification.

Discussion
Authorizations are about who can do what: who can add users, who has access to change metadata, who can access audit logs. It is important that authorizations are justified, that staff understand what they are authorized to do, that staff have required skills associated with various roles and authorizations, and that there is a consistent view of this across the organization.

The repository shall have suitable written disaster preparedness and recovery plan(s), including at least one off-site backup of all preserved information together with an offsite copy of the recovery plan(s).

Supporting Text
This is necessary in order to ensure that sufficient backup and recovery capabilities are in place to facilitate continuing preservation of and access to systems and their content with limited disruption of services.

Examples of Ways the Repository Can Demonstrate It Is Meeting This Requirement
Repository employs the codes of practice found in the ISO 27000 series of standards; disaster and recovery plans; information about and proof of at least one off-site copy of preserved information; service continuity plan; documentation linking roles with activities; local geological, geographical, or meteorological data or threat assessments. Repository maintains ISO 17799 certification.

Discussion
The level of detail in a disaster plan, and the specific risks addressed need to be appropriate to the repository’s location and service expectations. Fire is an almost universal concern, but earthquakes may not require specific planning at all locations. The disaster plan must, however, deal with unspecified situations that would have specific consequences, such as lack of access to a building or widespread illness among critical staff. In the event of a disaster at the repository, the repository may want to contact local and/or national disaster recovery bodies for assistance. Repositories may also conduct a variety of disaster drills that may involve their parent organization or the community at large.


ANNEX A SECURITY CONSIDERATIONS (INFORMATIVE)

A1 INTRODUCTION

The use of the Audit and Certification Recommended Practice has several potential areas of security concern.

One security concern is the possibility that the repository is fooled into undergoing an audit by someone unqualified or even malicious. Another concern involves the possible release of confidential information which is collected as evidence by the auditor.

A2 SECURITY CONCERNS WITH RESPECT TO THE CCSDS DOCUMENT

The repository may ask someone to perform an audit using this Recommended Practice. There is a possibility that the person contacted is not in fact the person that the repository believes him or her to be. Alternatively the correct person may be contacted but in fact another, possibly malicious, person may turn up to perform the audit. In the process of collecting evidence for the various metrics the auditor may collect information which is confidential or sensitive, for example details of security weaknesses. There is a danger that such information may fall into the wrong hands and expose the repository to increased risk. Alternatively in the process of collecting evidence the repository system may be damaged.

While these are all valid security concerns, they fall outside the purview of this Recommended Practice, which applies only to the metrics which an auditor should use for auditing a repository.

A3 POTENTIAL THREATS AND ATTACK SCENARIOS

Impersonation of an auditor and/or release of confidential information could both result in exposing the repository and its holdings to increased risk and loss of reputation of the repository.

A4 CONSEQUENCES OF NOT APPLYING SECURITY TO THE TECHNOLOGY

While these security issues are of concern, they are out of scope with respect to this document. This document aims to provide the basis for an audit and certification process for assessing the trustworthiness of digital repositories. Providing protection against false auditors must rely on the repository’s identification and authorization systems. Protection against loss of confidential information in the possession of the auditor must be provided by the security system of that auditor and the method of transmission of information which is agreed between the repository and auditor. Protection against damage to the repository or its holdings during an audit must rely on the security and safety systems of the repository.


ANNEX B REFERENCES (INFORMATIVE)

[B1] D. Waters and J. Garrett. Preserving Digital Information. Report of the Task Force on Archiving of Digital Information. Washington, DC: CLIR, May 1996.

[B2] Trusted Digital Repositories: Attributes and Responsibilities. An RLG-OCLC Report. Mountain View, CA: RLG, May 2002.

[B3] Trustworthy Repositories Audit & Certification: Criteria and Checklist. Version 1.0. Chicago: CRL, February 2007.

[B4] Producer-Archive Interface Methodology Abstract Standard. Recommendation for Space Data System Standards, CCSDS 651.0-M-1. Magenta Book. Issue 1. Washington, D.C.: CCSDS, May 2004.

[B5] XML Formatted Data Unit (XFDU) Structure and Construction Rules. Recommendation for Space Data System Standards, CCSDS 661.0-B-1. Blue Book. Issue 1. Washington, D.C.: CCSDS, September 2008.

[B6] The Data Description Language EAST Specification (CCSD0010). Recommendation for Space Data System Standards, CCSDS 644.0-B-3. Blue Book. Issue 3. Washington, D.C.: CCSDS, July 2009.

[B7] Data Entity Dictionary Specification Language (DEDSL)—Abstract Syntax (CCSD0011). Recommendation for Space Data System Standards, CCSDS 647.1-B- 1. Blue Book. Issue 1. Washington, D.C.: CCSDS, June 2001.

[B8] “Digital Preservation Management: Implementing Short-Term Strategies for Long- Term Problems.” Digital Preservation Management Resources. Cornell University.

[B9] Quality Management Systems—Fundamentals and Vocabulary. International Standard, ISO 9000:2005. 3rd edition. Geneva: ISO, 2005.

[B10] Information Technology—Security Techniques—Code of Practice for Information Security Management. International Standard, ISO/IEC 17799:2005. 2nd edition. Geneva: ISO, 2005.

[B11] Information and Documentation—Records Management—Part 1: General. International Standard, ISO 15489-1:2001. Geneva: ISO, 2001.

[B12] Information and Documentation—Records Management—Part 2: Guidelines. International Standard, ISO/TR 15489-2:2001. Geneva: ISO, 2001.

[B13] Trustworthy Information Systems Handbook. Version 4. Saint Paul, Minnesota: Minnesota Historical Society, July 2002.

[B14] Ron Ross, et al. Guide for Assessing the Security Controls in Federal Information Systems. National Institute of Standards and Technology Special Publication 800- 53A. Gaithersburg, Maryland: NIST, July 2008.

[B15] Susanne Dobratz, Astrid Schoger, and Stefan Strathmann. “The nestor Catalogue of Criteria for Trusted Digital Repository Evaluation and Certification.” Journal of Digital Information 8, no. 2 (2007).