AIP structure¶
This page describes the standard structure of an AIP produced by Archivematica.
Name¶
The AIP name is composed of the following:
- The name assigned to the transfer. This may come from human input or it may be the name of the transfer directory, depending on the transfer type and the transfer method.
- A UUID assigned to the AIP during ingest.
For example, in the AIP name my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d, my-aip is the name assigned by the user and d31cc44f-ce01-4e67-affe-513868d9cf3d is the UUID generated during ingest.
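The naming convention above can be illustrated with a short Python sketch. This is not part of Archivematica; the function name split_aip_name is hypothetical, and the sketch assumes that the UUID always forms the last 36 characters of the AIP name, separated from the transfer name by a hyphen.

```python
import re
import uuid

# Assumption: an AIP name is "<transfer name>-<UUID>", with the UUID at the end.
_AIP_NAME = re.compile(
    r"^(?P<name>.+)-"
    r"(?P<uuid>[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})$"
)


def split_aip_name(aip_name):
    """Split an AIP name into (transfer name, UUID). Hypothetical helper."""
    match = _AIP_NAME.match(aip_name)
    if match is None:
        raise ValueError(f"not a '<name>-<uuid>' AIP name: {aip_name!r}")
    return match.group("name"), uuid.UUID(match.group("uuid"))


name, aip_uuid = split_aip_name("my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d")
print(name)      # my-aip
print(aip_uuid)  # d31cc44f-ce01-4e67-affe-513868d9cf3d
```

Because the transfer name may itself contain hyphens, the sketch anchors the UUID pattern at the end of the string rather than splitting on the first hyphen.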
AIP contents¶
Archivematica AIPs are structurally consistent regardless of variables in original content, processing, and storage. They are packaged into a bag in accordance with the IETF Trust BagIt File Packaging Format. This tree structure depicts a typical Archivematica AIP:
[1]  my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d
[2]  ├── bag-info.txt
[3]  ├── bagit.txt
[4]  ├── manifest-sha512.txt
[5]  ├── tagmanifest-md5.txt
[6]  └── data
[7]      ├── logs
[8]      ├── objects
[9]      ├── thumbnails
[10]     ├── METS.d31cc44f-ce01-4e67-affe-513868d9cf3d.xml
[11]     └── README.html
- [1] AIP root directory
- [2 - 5] AIP BagIt files: standard packaging files produced in accordance with the IETF Trust BagIt File Packaging Format.
- [6] Data directory: a standard directory required by the BagIt specification. The data directory contains the AIP Content Information and Preservation Description Information (PDI).
- [7] Logs directory: contains log outputs of some of the tools that Archivematica uses to generate the AIP.
- [8] Objects directory: contains the original digital objects as well as any normalized versions.
- [9] Thumbnails directory: contains any thumbnails generated from the original object.
- [10] AIP METS file
- [11] README file
For more information about each of these components, see the appropriate section below.
BagIt files¶
Archivematica uses a very simple implementation of BagIt. All AIPs contain the following BagIt files:
- bag-info.txt: a tag file that contains metadata about the bag, including:
  - Payload-Oxum: the octetstream sum of the bag payload, written as the total number of octets and the number of payload files, separated by a period.
  - Bagging-Date: the date on which the bag was created, in yyyy-mm-dd format (e.g. 2018-11-01).
  - Bag-Size: a human-readable file size (e.g. 42kB).
  - External-Identifier: the UUID of the AIP.
- bagit.txt: the bag declaration, stating the version and encoding.
- manifest-sha256.txt: a list of each payload file name with corresponding SHA256 checksums.
- tagmanifest-md5.txt: a tag file that lists other tag files with corresponding MD5 checksums.
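To make the bag-info.txt fields concrete, the following Python sketch reads the tags and recomputes Payload-Oxum from a data directory. It is an illustration only, not Archivematica code: the function names parse_bag_info and payload_oxum are hypothetical, and the parser assumes the "label: value" layout of BagIt tag files, with indented lines continuing the previous value.

```python
from pathlib import Path


def parse_bag_info(text):
    """Parse 'Label: value' lines into a dict (hypothetical helper).

    Simplification: if a label is repeated, the last value wins.
    """
    tags = {}
    last_label = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and last_label is not None:
            # An indented line continues the previous value.
            tags[last_label] += " " + line.strip()
            continue
        label, _, value = line.partition(":")
        last_label = label.strip()
        tags[last_label] = value.strip()
    return tags


def payload_oxum(data_dir):
    """Return '<total octets>.<file count>' for every file under data_dir."""
    files = [p for p in Path(data_dir).rglob("*") if p.is_file()]
    total_octets = sum(p.stat().st_size for p in files)
    return f"{total_octets}.{len(files)}"
```

Comparing payload_oxum("my-aip-.../data") with the Payload-Oxum tag is a quick completeness check before running a full checksum validation.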
This example shows the contents of the top-level directory of the AIP.
my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d
├── bag-info.txt
├── bagit.txt
├── data
├── manifest-sha256.txt
└── tagmanifest-sha256.txt
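The payload manifest can be used to check that the files in the data directory are intact. The Python sketch below shows one way to do this under stated assumptions: the function name verify_manifest is hypothetical, the manifest is assumed to be named manifest-<algorithm>.txt, and each line is assumed to hold a checksum followed by whitespace and a path relative to the AIP root. It does not handle percent-encoded characters in paths and reads each file fully into memory, so it is a sketch rather than a replacement for a BagIt validation tool.

```python
import hashlib
from pathlib import Path


def verify_manifest(bag_dir, manifest_name="manifest-sha256.txt"):
    """Return the payload paths whose checksum is wrong or whose file is missing."""
    bag_dir = Path(bag_dir)
    # Derive the algorithm from the file name, e.g. "sha256" or "sha512".
    algorithm = manifest_name[len("manifest-"):-len(".txt")]
    failures = []
    manifest = (bag_dir / manifest_name).read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        target = bag_dir / rel_path
        if not target.is_file():
            failures.append(rel_path)
            continue
        actual = hashlib.new(algorithm, target.read_bytes()).hexdigest()
        if actual != expected.lower():
            failures.append(rel_path)
    return failures
```

An empty list means every file listed in the manifest was found with a matching checksum.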
Data directory¶
The data directory consists of the METS file for the AIP, a README file, and three folders: logs, objects, and thumbnails.
This example shows the contents of the AIP’s data directory.
my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d
└── data
    ├── logs
    ├── METS.d31cc44f-ce01-4e67-affe-513868d9cf3d.xml
    ├── objects
    ├── README.html
    └── thumbnails
AIP METS file¶
The AIP METS file, /data/METS.uuid.xml, lists all of the digital objects
in the AIP (original files, preservation masters, license files, OCR text files,
submission documentation, etc.), describes their relationships to each other,
and links digital objects to their descriptive, technical, provenance, and
rights metadata.
The AIP METS file name is composed of the prefix METS., the UUID of the
AIP, and the extension .xml. Note that the presence of the UUID
differentiates the AIP METS file from the transfer METS file, described in the
Objects section below.
For more information about Archivematica’s METS implementation, see METS in Archivematica.
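As an illustration of the naming rule and of how the METS file lists digital objects, the Python sketch below builds the expected METS path from an AIP directory name and collects the file locations recorded in a METS document. The function names aip_mets_path and list_mets_file_paths are hypothetical. The sketch assumes the standard METS and XLink namespaces and that each file location is recorded in the xlink:href attribute of an FLocat element, as is conventional in METS; consult the METS file in your own AIP before relying on this.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"


def aip_mets_path(aip_dir):
    """Return <aip_dir>/data/METS.<uuid>.xml (hypothetical helper).

    Assumption: the last 36 characters of the directory name are the AIP UUID.
    """
    aip_dir = Path(aip_dir)
    aip_uuid = aip_dir.name[-36:]
    return aip_dir / "data" / f"METS.{aip_uuid}.xml"


def list_mets_file_paths(mets_xml):
    """Return the xlink:href value of every FLocat element in a METS string."""
    root = ET.fromstring(mets_xml)
    return [
        flocat.get(f"{{{XLINK_NS}}}href")
        for flocat in root.iter(f"{{{METS_NS}}}FLocat")
    ]
```

For example, list_mets_file_paths(aip_mets_path(aip_dir).read_text(encoding="utf-8")) would return the recorded locations for an unpacked AIP at aip_dir.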
README file¶
The AIP README file, /data/README.html, is a human-readable file that
describes the basic structure of an Archivematica AIP. It introduces
Archivematica, OAIS, METS and PREMIS, and other concepts that future users may
find helpful when they encounter an AIP.
Logs¶
The logs directory, /data/logs, contains log outputs for some of the tools
and tasks that run inside of Archivematica.
This is an example of the contents of an AIP’s logs directory:
my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d
└── data
    └── logs
        ├── arrange.log
        ├── fileFormatIdentification.log
        ├── filenameChanges.log
        └── transfers
            ├── first-transfer-abbff451-f077-4f66-a6e0-d83f6ebbeebf
            │   └── logs
            │       ├── fileFormatIdentification.log
            │       └── filenameChanges.log
            └── second-transfer-52fd11fa-fca8-4bc7-9214-e6510863759a
                └── logs
                    ├── fileFormatIdentification.log
                    └── filenameChanges.log
The top-level logs (arrange.log, fileFormatIdentification.log, etc.) are
outputs for tasks that took place on either the Appraisal tab or the Ingest
tab. For example, data/logs/fileFormatIdentification.log is the log that was
created during the Identify file format job that takes place during the
Normalize microservice on the Ingest tab.
The logs directory has a transfers subdirectory, /data/logs/transfers, which
contains logs for tools that ran on the Transfer tab. Since it is possible to
combine multiple transfers into one SIP (which becomes one AIP), the transfers
subdirectory may contain multiple directories. Continuing with the example
above, two transfers (first-transfer and second-transfer) were combined
to create one AIP (my-aip). Therefore, there are two more
fileFormatIdentification.log files:
- data/logs/transfers/first-transfer-abbff451-f077-4f66-a6e0-d83f6ebbeebf/logs/fileFormatIdentification.log is the log that was created during Microservice: Identify file format on the Transfer tab when first-transfer was processed.
- data/logs/transfers/second-transfer-52fd11fa-fca8-4bc7-9214-e6510863759a/logs/fileFormatIdentification.log is the log that was created during Microservice: Identify file format on the Transfer tab when second-transfer was processed.
The example above does not show all possible logs. Depending on how you have set your Processing configuration, you may see a greater or lesser number of logs or logs of different types in your AIP.
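The layout described in this section can be walked programmatically. The following Python sketch groups log files by where they were produced: the top-level logs under one key, and each transfer's logs under the name of its transfer directory. The function name group_logs and the "top-level" key are hypothetical, and the sketch assumes log files end in .log as in the example above.

```python
from pathlib import Path


def group_logs(aip_dir):
    """Map 'top-level' and each transfer directory name to its .log file names."""
    logs_dir = Path(aip_dir) / "data" / "logs"
    grouped = {"top-level": sorted(p.name for p in logs_dir.glob("*.log"))}
    transfers_dir = logs_dir / "transfers"
    if transfers_dir.is_dir():
        for transfer in sorted(transfers_dir.iterdir()):
            if transfer.is_dir():
                # Each transfer keeps its own logs in a nested "logs" folder.
                grouped[transfer.name] = sorted(
                    p.name for p in (transfer / "logs").glob("*.log")
                )
    return grouped
```

For the example AIP, the result would contain one "top-level" entry plus one entry each for the first-transfer and second-transfer directories.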
Objects¶
The objects directory, /data/objects, contains original objects,
preservation masters, and two folders: /metadata and
/submissionDocumentation. If the SIP contained any lower-level directories,
either from the original transfer or because it was arranged on the Appraisal
tab, the lower-level directories will be present as well.
my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d
└── data
    └── objects
        ├── 799px-Euroleague-LE_Roma_vs_Toulouse_IC-27-3e2bcabd-f33f-485b-a566-ff71c141b930.tif
        ├── 799px-Euroleague-LE_Roma_vs_Toulouse_IC-27.bmp
        ├── BBhelmet.ai
        ├── G31DS-60559b5e-38a5-44f5-8c63-bb41bda5d2e8.tif
        ├── G31DS.TIF
        ├── metadata
        │   └── transfers
        │       ├── first-transfer-abbff451-f077-4f66-a6e0-d83f6ebbeebf
        │       │   ├── directory_tree.txt
        │       │   └── metadata.csv
        │       └── second-transfer-52fd11fa-fca8-4bc7-9214-e6510863759a
        │           └── directory_tree.txt
        └── submissionDocumentation
            ├── transfer-first-transfer-abbff451-f077-4f66-a6e0-d83f6ebbeebf
            │   └── METS.xml
            └── second-transfer-52fd11fa-fca8-4bc7-9214-e6510863759a
                └── METS.xml
The filenames of original objects will be unchanged. Preservation master copies
have a UUID appended to the filename. In the example above,
799px-Euroleague-LE_Roma_vs_Toulouse_IC-27.bmp is the original object and
799px-Euroleague-LE_Roma_vs_Toulouse_IC-27-3e2bcabd-f33f-485b-a566-ff71c141b930.tif
is the preservation master. The creation of preservation master copies is guided
by the rules on the Preservation planning tab.
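The filename convention for preservation masters can be approximated with a pattern match. The Python sketch below is a heuristic only: the function names is_preservation_master and original_stem are hypothetical, and an original file whose own name happens to end in a UUID would be misclassified. The AIP METS file remains the authoritative record of which file is which.

```python
import re
from pathlib import PurePath

# A hyphen followed by a UUID at the very end of the file stem.
_UUID_SUFFIX = re.compile(
    r"-[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def is_preservation_master(filename):
    """Guess whether a filename follows the '<stem>-<uuid>.<ext>' pattern."""
    return _UUID_SUFFIX.search(PurePath(filename).stem) is not None


def original_stem(filename):
    """Return the file stem with any trailing '-<uuid>' removed."""
    return _UUID_SUFFIX.sub("", PurePath(filename).stem)
```

Matching original_stem values across the objects directory pairs each preservation master with its likely original, for example the .tif and .bmp files in the example above.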
The /metadata directory contains metadata associated with the AIP. The
metadata directory has a transfers subdirectory,
/data/objects/metadata/transfers, which separates the metadata files into
folders specific to the transfer where they originated. Since it is possible to
combine multiple transfers into one SIP (which becomes one AIP), the transfers
subdirectory may contain multiple folders.
The /submissionDocumentation directory contains submission
documentation for the AIP. Similar to the metadata directory, the submission
documentation files are separated into folders specific to the transfer where
they originated; in the example above, these folders sit directly under
/data/objects/submissionDocumentation. Since it is possible to combine multiple
transfers into one SIP (which becomes one AIP), there may be multiple transfer
folders. Note that each transfer folder contains a METS.xml file. This is the
transfer METS, which was generated for each transfer on the Transfer tab. When
the transfer (or multiple transfers combined) becomes a SIP, the transfer METS
files are combined into a new METS file, which becomes the AIP METS file.
Thumbnails¶
The thumbnails directory, /data/thumbnails, will contain thumbnail images if
you chose to generate them during the Normalize for thumbnails job.
my-aip-d31cc44f-ce01-4e67-affe-513868d9cf3d
└── data
    └── thumbnails
        ├── 0e9fd6db-ac57-453c-ba0f-c9cff9d0ac56.jpg
        ├── 7d1c5e44-f1e1-4cf7-8b79-ca2284a6ce79.jpg
        └── dd1a6fb8-7e49-47ca-921b-87b234c939b9.jpg
The creation of thumbnails is optional and configurable in the processing configuration.