METS in Archivematica
General description
The Metadata Encoding & Transmission Standard, or METS, is a schema for encoding various metadata, expressed in XML. It essentially acts as a wrapper around other metadata standards, such as PREMIS and Dublin Core, which are used in Archivematica.
The METS file in Archivematica has a generic structure that consists of the following sections:
- metsHdr (METS header) (one only)
- dmdSec (descriptive metadata section) (one or more)
- amdSec (administrative metadata section) (one or more)
- fileSec (file section) (one only)
- structMap (structural map) (one or more)
<mets:mets xmlns:mets="http://www.loc.gov/METS/" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/METS/ http://www.loc.gov/standards/mets/version1121/mets.xsd">
<mets:metsHdr CREATEDATE="2019-11-14T11:15:34"/>
<mets:dmdSec ID="dmdSec_1">
<mets:mdWrap MDTYPE="PREMIS:OBJECT">...</mets:mdWrap>
</mets:dmdSec>
<mets:dmdSec ID="dmdSec_2">...</mets:dmdSec>
<mets:amdSec ID="amdSec_1">
<mets:techMD ID="techMD_1">...</mets:techMD>
<mets:rightsMD ID="rightsMD_1">...</mets:rightsMD>
<mets:digiprovMD ID="digiprovMD_1">...</mets:digiprovMD>
<mets:digiprovMD ID="digiprovMD_2">...</mets:digiprovMD>
<mets:digiprovMD ID="digiprovMD_3">...</mets:digiprovMD>
<mets:digiprovMD ID="digiprovMD_4">...</mets:digiprovMD>
</mets:amdSec>
<mets:amdSec ID="amdSec_2">...</mets:amdSec>
<mets:amdSec ID="amdSec_3">...</mets:amdSec>
<mets:amdSec ID="amdSec_4">...</mets:amdSec>
<mets:amdSec ID="amdSec_5">...</mets:amdSec>
<mets:fileSec>
<mets:fileGrp USE="original">...</mets:fileGrp>
<mets:fileGrp USE="submissionDocumentation">...</mets:fileGrp>
<mets:fileGrp USE="preservation">...</mets:fileGrp>
<mets:fileGrp USE="metadata">...</mets:fileGrp>
</mets:fileSec>
<mets:structMap ID="structMap_1" LABEL="Archivematica default" TYPE="physical">...</mets:structMap>
</mets:mets>
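As an illustration only, the following Python sketch counts the top-level sections of a METS document so that the cardinalities listed above can be checked. The sample document is a hypothetical, heavily trimmed stand-in, and the function name count_sections is an assumption made for this example; it is not part of Archivematica.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed METS document used only for illustration.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:metsHdr CREATEDATE="2019-11-14T11:15:34"/>
  <mets:dmdSec ID="dmdSec_1"/>
  <mets:dmdSec ID="dmdSec_2"/>
  <mets:amdSec ID="amdSec_1"/>
  <mets:fileSec/>
  <mets:structMap ID="structMap_1" TYPE="physical"/>
</mets:mets>"""


def count_sections(mets_xml: str) -> Counter:
    """Count the direct children of the METS root by local element name."""
    root = ET.fromstring(mets_xml)
    # ElementTree spells qualified names as "{namespace}local";
    # keep only the local part.
    return Counter(child.tag.split("}", 1)[-1] for child in root)


counts = count_sections(SAMPLE)

# Sections that the outline above describes as "one only".
singletons_ok = counts["metsHdr"] == 1 and counts["fileSec"] == 1
print(dict(counts), singletons_ok)
```

The check is purely structural; it does not validate the document against the METS schema.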
Below is a more detailed outline of the above sections present in AIPs created in Archivematica.
<metsHdr>
There will be one METS header section for each METS file. It will contain a CREATEDATE attribute.
<metsHdr CREATEDATE="2019-06-18T23:52:11"/>
If the AIP has been reingested, the metsHdr section will also contain a LASTMODDATE attribute.
<metsHdr CREATEDATE="2019-06-18T23:52:11" LASTMODDATE="2019-06-27T00:28:56"/>
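The presence of LASTMODDATE can be used as a simple indicator of reingest. The Python sketch below reads both attributes from a hypothetical sample; the function name header_dates is an assumption for this example, and the sample uses the mets: prefix as in the full METS file.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical sample: a header from an AIP that has been reingested.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:metsHdr CREATEDATE="2019-06-18T23:52:11" LASTMODDATE="2019-06-27T00:28:56"/>
</mets:mets>"""


def header_dates(mets_xml: str):
    """Return (created, last_modified); last_modified is None if absent."""
    root = ET.fromstring(mets_xml)
    hdr = root.find("mets:metsHdr", NS)
    created = datetime.fromisoformat(hdr.get("CREATEDATE"))
    last_mod = hdr.get("LASTMODDATE")
    modified = datetime.fromisoformat(last_mod) if last_mod else None
    return created, modified


created, modified = header_dates(SAMPLE)
# Assumption for this sketch: LASTMODDATE is only written at reingest.
was_reingested = modified is not None
print(created, modified, was_reingested)
```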
<dmdSec>
There will be one dmdSec for the whole AIP, which contains PREMIS Intellectual Entity information.
<mets:dmdSec ID="dmdSec_1">
<mets:mdWrap MDTYPE="PREMIS:OBJECT">
<mets:xmlData>
<premis:object xmlns:premis="http://www.loc.gov/premis/v3" xsi:type="premis:intellectualEntity" xsi:schemaLocation="http://www.loc.gov/premis/v3 http://www.loc.gov/standards/premis/v3/premis.xsd" version="3.0">
<premis:objectIdentifier>
<premis:objectIdentifierType>UUID</premis:objectIdentifierType>
<premis:objectIdentifierValue>6a82ffa2-91e2-48e1-b0d0-55b0de21568e</premis:objectIdentifierValue>
</premis:objectIdentifier>
<premis:objectIdentifier>
<premis:objectIdentifierType>ULID</premis:objectIdentifierType>
<premis:objectIdentifierValue>01D9T8MJM9NEQ390H239VT50CQ</premis:objectIdentifierValue>
</premis:objectIdentifier>
<premis:objectIdentifier>
<premis:objectIdentifierType>EXID</premis:objectIdentifierType>
<premis:objectIdentifierValue>https://example.com/AIP/id/1348554167</premis:objectIdentifierValue>
</premis:objectIdentifier>
<premis:originalName>demo-1-6a82ffa2-91e2-48e1-b0d0-55b0de21568e</premis:originalName>
</premis:object>
</mets:xmlData>
</mets:mdWrap>
</mets:dmdSec>
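The identifiers of the AIP can be read from this dmdSec with a namespace-aware XPath query. The following Python sketch is one possible approach, shown against a trimmed copy of the example above; the function name aip_identifiers is an assumption for this example.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "http://www.loc.gov/premis/v3",
}

# Trimmed copy of the dmdSec example above (xsi attributes omitted).
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="http://www.loc.gov/premis/v3">
  <mets:dmdSec ID="dmdSec_1">
    <mets:mdWrap MDTYPE="PREMIS:OBJECT">
      <mets:xmlData>
        <premis:object>
          <premis:objectIdentifier>
            <premis:objectIdentifierType>UUID</premis:objectIdentifierType>
            <premis:objectIdentifierValue>6a82ffa2-91e2-48e1-b0d0-55b0de21568e</premis:objectIdentifierValue>
          </premis:objectIdentifier>
          <premis:objectIdentifier>
            <premis:objectIdentifierType>ULID</premis:objectIdentifierType>
            <premis:objectIdentifierValue>01D9T8MJM9NEQ390H239VT50CQ</premis:objectIdentifierValue>
          </premis:objectIdentifier>
        </premis:object>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""

# Path from the METS root to each identifier of a PREMIS object in a dmdSec.
IDENTIFIER_PATH = (
    "mets:dmdSec/mets:mdWrap[@MDTYPE='PREMIS:OBJECT']"
    "/mets:xmlData/premis:object/premis:objectIdentifier"
)


def aip_identifiers(mets_xml: str) -> dict:
    """Map identifier type (for example UUID) to identifier value."""
    root = ET.fromstring(mets_xml)
    found = {}
    for ident in root.findall(IDENTIFIER_PATH, NS):
        id_type = ident.findtext("premis:objectIdentifierType", namespaces=NS)
        id_value = ident.findtext("premis:objectIdentifierValue", namespaces=NS)
        found[id_type] = id_value
    return found


identifiers = aip_identifiers(SAMPLE)
print(identifiers)
```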
There may also be one or more dmdSecs for descriptive metadata. The dmdSecs are numbered dmdSec_1, dmdSec_2, etc. and contain Dublin Core metadata by default.
<mets:dmdSec ID="dmdSec_3">
<mets:mdWrap MDTYPE="DC">
<mets:xmlData>
<dcterms:dublincore xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xsi:schemaLocation="http://purl.org/dc/terms/ https://dublincore.org/schemas/xmls/qdc/2008/02/11/dcterms.xsd">
<dc:title>Directory containing artwork</dc:title>
<dc:creator>Milton Bradley</dc:creator>
<dc:subject>Art</dc:subject>
<dc:subject>Photography</dc:subject>
<dc:subject></dc:subject>
<dc:subject></dc:subject>
<dc:subject></dc:subject>
<dc:description>Color photographs taken in the last century. The photographs are colorful and vibrant. </dc:description>
<dc:publisher></dc:publisher>
<dc:contributor></dc:contributor>
<dc:date></dc:date>
<dc:type></dc:type>
<dc:format></dc:format>
<dc:identifier></dc:identifier>
<dc:source></dc:source>
<dc:language></dc:language>
<dc:language></dc:language>
<dc:relation></dc:relation>
<dc:coverage></dc:coverage>
<dc:rights></dc:rights>
</dcterms:dublincore>
</mets:xmlData>
</mets:mdWrap>
</mets:dmdSec>
MDTYPE will be DC for Dublin Core metadata; if another metadata standard is included then the MDTYPE will be OTHER.
If the user does not enter any DC metadata during transfer/ingest and no DC metadata was included with the transfer (e.g. as a metadata.csv file, see Import metadata), there will be no additional dmdSecs.
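Because unused Dublin Core elements are written as empty tags, a consumer will usually want to skip them. The Python sketch below collects the non-empty values from each dmdSec whose MDTYPE is DC. It runs against a shortened, hypothetical copy of the example above, and the function name dublin_core_by_section is an assumption for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {
    "mets": "http://www.loc.gov/METS/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Shortened copy of the Dublin Core example above.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/">
  <mets:dmdSec ID="dmdSec_3">
    <mets:mdWrap MDTYPE="DC">
      <mets:xmlData>
        <dcterms:dublincore>
          <dc:title>Directory containing artwork</dc:title>
          <dc:creator>Milton Bradley</dc:creator>
          <dc:subject>Art</dc:subject>
          <dc:subject>Photography</dc:subject>
          <dc:subject></dc:subject>
          <dc:publisher></dc:publisher>
        </dcterms:dublincore>
      </mets:xmlData>
    </mets:mdWrap>
  </mets:dmdSec>
</mets:mets>"""


def dublin_core_by_section(mets_xml: str) -> dict:
    """Return {dmdSec ID: {element name: [non-empty values]}} for DC sections."""
    root = ET.fromstring(mets_xml)
    result = {}
    for dmd in root.findall("mets:dmdSec", NS):
        wrap = dmd.find("mets:mdWrap[@MDTYPE='DC']", NS)
        if wrap is None:
            continue  # PREMIS or OTHER metadata; not handled in this sketch
        fields = defaultdict(list)
        for el in wrap.findall("mets:xmlData/dcterms:dublincore/*", NS):
            text = (el.text or "").strip()
            if text:  # skip elements that were left empty
                fields[el.tag.split("}", 1)[-1]].append(text)
        result[dmd.get("ID")] = dict(fields)
    return result


dc = dublin_core_by_section(SAMPLE)
print(dc)
```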
<amdSec>
There is one amdSec for each original object and preservation object. The amdSec wraps PREMIS entities as follows:
- PREMIS Object: techMD (technical metadata)
- PREMIS Event: digiprovMD (digital provenance metadata)
- PREMIS Agent: digiprovMD (digital provenance metadata)
- PREMIS Rights: rightsMD (rights metadata)
Each amdSec will include one techMD. The techMD is used to wrap PREMIS Object semantic units, such as objectIdentifier, objectCharacteristics and originalName. See the example techMD below:
<mets:amdSec ID="amdSec_1">
<mets:techMD ID="techMD_1">
<mets:mdWrap MDTYPE="PREMIS:OBJECT">
<mets:xmlData>
<premis:object xmlns:premis="http://www.loc.gov/premis/v3" xsi:type="premis:file" xsi:schemaLocation="http://www.loc.gov/premis/v3 http://www.loc.gov/standards/premis/v3/premis.xsd" version="3.0">
<premis:objectIdentifier>
<premis:objectIdentifierType>UUID</premis:objectIdentifierType>
<premis:objectIdentifierValue>d8717b3a-d12c-408a-9c37-732425331f44</premis:objectIdentifierValue>
</premis:objectIdentifier>
<premis:objectCharacteristics>
<premis:compositionLevel>0</premis:compositionLevel>
<premis:fixity>
<premis:messageDigestAlgorithm>sha256</premis:messageDigestAlgorithm>
<premis:messageDigest>1cad1038c85dde1d018116c15fd32d2dac34d645b548611eb042f37169fcdee0</premis:messageDigest>
</premis:fixity>
<premis:size>47941948</premis:size>
<premis:format>
<premis:formatDesignation>
<premis:formatName>Tagged Image File Format</premis:formatName>
<premis:formatVersion></premis:formatVersion>
</premis:formatDesignation>
<premis:formatRegistry>
<premis:formatRegistryName></premis:formatRegistryName>
<premis:formatRegistryKey></premis:formatRegistryKey>
</premis:formatRegistry>
</premis:format>
<premis:creatingApplication>
<premis:dateCreatedByApplication>2019-11-14</premis:dateCreatedByApplication>
</premis:creatingApplication>
</premis:objectCharacteristics>
<premis:originalName>%SIPDirectory%objects/View_from_lookout_over_Queenstown_towards_the_Remarkables_in_spring-d8717b3a-d12c-408a-9c37-732425331f44.tif</premis:originalName>
<premis:relationship>
<premis:relationshipType>derivation</premis:relationshipType>
<premis:relationshipSubType>has source</premis:relationshipSubType>
<premis:relatedObjectIdentifier>
<premis:relatedObjectIdentifierType>UUID</premis:relatedObjectIdentifierType>
<premis:relatedObjectIdentifierValue>d0c46bbb-63b1-4530-ad1f-f65d9a32e434</premis:relatedObjectIdentifierValue>
</premis:relatedObjectIdentifier>
<premis:relatedEventIdentifier>
<premis:relatedEventIdentifierType>UUID</premis:relatedEventIdentifierType>
<premis:relatedEventIdentifierValue>d7746761-d98c-4a78-80e0-2e91e4c187d4</premis:relatedEventIdentifierValue>
</premis:relatedEventIdentifier>
</premis:relationship>
</premis:object>
</mets:xmlData>
</mets:mdWrap>
</mets:techMD>
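The fixity values recorded in a techMD can be compared with a freshly computed checksum. The Python sketch below shows the idea under stated assumptions: the sample techMD is generated in the script from a hypothetical payload so that the example is self-contained, and the function names recorded_fixity and matches_fixity are invented for this example.

```python
import hashlib
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "http://www.loc.gov/premis/v3",
}

# Hypothetical file contents; the digest and size in SAMPLE are derived from it.
PAYLOAD = b"hypothetical file contents"

SAMPLE = f"""<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="http://www.loc.gov/premis/v3">
  <mets:amdSec ID="amdSec_1">
    <mets:techMD ID="techMD_1">
      <mets:mdWrap MDTYPE="PREMIS:OBJECT">
        <mets:xmlData>
          <premis:object>
            <premis:objectCharacteristics>
              <premis:fixity>
                <premis:messageDigestAlgorithm>sha256</premis:messageDigestAlgorithm>
                <premis:messageDigest>{hashlib.sha256(PAYLOAD).hexdigest()}</premis:messageDigest>
              </premis:fixity>
              <premis:size>{len(PAYLOAD)}</premis:size>
            </premis:objectCharacteristics>
          </premis:object>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:techMD>
  </mets:amdSec>
</mets:mets>"""


def recorded_fixity(mets_xml: str, amd_id: str):
    """Return (algorithm, digest, size) recorded in the given amdSec."""
    root = ET.fromstring(mets_xml)
    chars = root.find(
        f"mets:amdSec[@ID='{amd_id}']/mets:techMD/mets:mdWrap/mets:xmlData"
        "/premis:object/premis:objectCharacteristics",
        NS,
    )
    algorithm = chars.findtext(
        "premis:fixity/premis:messageDigestAlgorithm", namespaces=NS
    )
    digest = chars.findtext("premis:fixity/premis:messageDigest", namespaces=NS)
    size = int(chars.findtext("premis:size", namespaces=NS))
    return algorithm, digest, size


def matches_fixity(data: bytes, algorithm: str, digest: str, size: int) -> bool:
    """Compare size first (cheap), then the checksum."""
    return len(data) == size and hashlib.new(algorithm, data).hexdigest() == digest


algorithm, digest, size = recorded_fixity(SAMPLE, "amdSec_1")
print(matches_fixity(PAYLOAD, algorithm, digest, size))
```

For large files, the checksum would normally be computed in chunks rather than from a single bytes object.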
Each amdSec will include multiple digiprovMDs for PREMIS events related to the PREMIS Object. Below is an example of a PREMIS event for a fixity check:
<mets:digiprovMD ID="digiprovMD_10">
<mets:mdWrap MDTYPE="PREMIS:EVENT">
<mets:xmlData>
<premis:event xmlns:premis="http://www.loc.gov/premis/v3" xsi:schemaLocation="http://www.loc.gov/premis/v3 http://www.loc.gov/standards/premis/v3/premis.xsd" version="3.0">
<premis:eventIdentifier>
<premis:eventIdentifierType>UUID</premis:eventIdentifierType>
<premis:eventIdentifierValue>8c6714b3-ab18-44e2-8507-80691ad1dafa</premis:eventIdentifierValue>
</premis:eventIdentifier>
<premis:eventType>fixity check</premis:eventType>
<premis:eventDateTime>2019-11-14T11:00:50.632264+00:00</premis:eventDateTime>
<premis:eventDetailInformation>
<premis:eventDetail>program="sha512sum -c --strict /var/archivematica/sharedDirectory/currentlyProcessing/demo-1-581137cd-1d75-44cd-99bb-3edb2ecb8098/metadata/checksum.sha512"; version="sha512sum (GNU coreutils) 8.28"</premis:eventDetail>
</premis:eventDetailInformation>
<premis:eventOutcomeInformation>
<premis:eventOutcome>pass</premis:eventOutcome>
<premis:eventOutcomeDetail>
<premis:eventOutcomeDetailNote></premis:eventOutcomeDetailNote>
</premis:eventOutcomeDetail>
</premis:eventOutcomeInformation>
<premis:linkingAgentIdentifier>
<premis:linkingAgentIdentifierType>preservation system</premis:linkingAgentIdentifierType>
<premis:linkingAgentIdentifierValue>Archivematica-1.10</premis:linkingAgentIdentifierValue>
</premis:linkingAgentIdentifier>
<premis:linkingAgentIdentifier>
<premis:linkingAgentIdentifierType>repository code</premis:linkingAgentIdentifierType>
<premis:linkingAgentIdentifierValue>12345</premis:linkingAgentIdentifierValue>
</premis:linkingAgentIdentifier>
<premis:linkingAgentIdentifier>
<premis:linkingAgentIdentifierType>Archivematica user pk</premis:linkingAgentIdentifierType>
<premis:linkingAgentIdentifierValue>1</premis:linkingAgentIdentifierValue>
</premis:linkingAgentIdentifier>
</premis:event>
</mets:xmlData>
</mets:mdWrap>
</mets:digiprovMD>
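PREMIS events can be listed per amdSec to build a simple audit trail. The Python sketch below is one way to do this. The sample is hypothetical: the first event is shortened from the fixity check above, and the second event, its type and its empty outcome are invented for illustration. The function name list_events is likewise an assumption.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "http://www.loc.gov/premis/v3",
}

# Hypothetical sample; the second event is invented for illustration.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="http://www.loc.gov/premis/v3">
  <mets:amdSec ID="amdSec_1">
    <mets:digiprovMD ID="digiprovMD_10">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
        <mets:xmlData>
          <premis:event>
            <premis:eventType>fixity check</premis:eventType>
            <premis:eventDateTime>2019-11-14T11:00:50.632264+00:00</premis:eventDateTime>
            <premis:eventOutcomeInformation>
              <premis:eventOutcome>pass</premis:eventOutcome>
            </premis:eventOutcomeInformation>
          </premis:event>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:digiprovMD>
    <mets:digiprovMD ID="digiprovMD_11">
      <mets:mdWrap MDTYPE="PREMIS:EVENT">
        <mets:xmlData>
          <premis:event>
            <premis:eventType>message digest calculation</premis:eventType>
            <premis:eventDateTime>2019-11-14T11:00:51+00:00</premis:eventDateTime>
            <premis:eventOutcomeInformation>
              <premis:eventOutcome></premis:eventOutcome>
            </premis:eventOutcomeInformation>
          </premis:event>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:digiprovMD>
  </mets:amdSec>
</mets:mets>"""

EVENT_PATH = (
    "mets:digiprovMD/mets:mdWrap[@MDTYPE='PREMIS:EVENT']"
    "/mets:xmlData/premis:event"
)


def list_events(mets_xml: str) -> list:
    """Return (amdSec ID, event type, date string, outcome) for each event."""
    root = ET.fromstring(mets_xml)
    rows = []
    for amd in root.findall("mets:amdSec", NS):
        for ev in amd.findall(EVENT_PATH, NS):
            outcome = ev.findtext(
                "premis:eventOutcomeInformation/premis:eventOutcome",
                default="",
                namespaces=NS,
            )
            rows.append((
                amd.get("ID"),
                ev.findtext("premis:eventType", namespaces=NS),
                ev.findtext("premis:eventDateTime", namespaces=NS),
                outcome or "",  # empty elements have text None
            ))
    return rows


events = list_events(SAMPLE)
for row in events:
    print(row)
```

PREMIS Agents also live in digiprovMD sections, which is why the path filters on MDTYPE="PREMIS:EVENT".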
An amdSec for an original object may also contain one or more rightsMDs. The rightsMD may contain a reference to metadata in another file, such as rights.csv uploaded with a transfer. Below is an example of a PREMIS Rights statement in the rightsMD section of the AIP METS file:
<mets:rightsMD ID="rightsMD_1">
<mets:mdWrap MDTYPE="PREMIS:RIGHTS">
<mets:xmlData>
<premis:rightsStatement xmlns:premis="http://www.loc.gov/premis/v3" xsi:schemaLocation="http://www.loc.gov/premis/v3 http://www.loc.gov/standards/premis/v3/premis.xsd">
<premis:rightsStatementIdentifier>
<premis:rightsStatementIdentifierType>UUID</premis:rightsStatementIdentifierType>
<premis:rightsStatementIdentifierValue>3ee20e47-1cbd-4a2b-881f-39a6ce66ecc2</premis:rightsStatementIdentifierValue>
</premis:rightsStatementIdentifier>
<premis:rightsBasis>Copyright</premis:rightsBasis>
<premis:copyrightInformation>
<premis:copyrightStatus>copyright status</premis:copyrightStatus>
<premis:copyrightJurisdiction>CA</premis:copyrightJurisdiction>
<premis:copyrightStatusDeterminationDate>2001-01-01</premis:copyrightStatusDeterminationDate>
<premis:copyrightNote>Note 1</premis:copyrightNote>
<premis:copyrightApplicableDates>
<premis:startDate>2002-02-02</premis:startDate>
<premis:endDate>2003-03-03</premis:endDate>
</premis:copyrightApplicableDates>
</premis:copyrightInformation>
<premis:rightsGranted>
<premis:act>Act 1</premis:act>
<premis:restriction>Allow</premis:restriction>
<premis:termOfGrant>
<premis:startDate>2004-04-04</premis:startDate>
<premis:endDate>2005-05-05</premis:endDate>
</premis:termOfGrant>
<premis:rightsGrantedNote>Grant note 1</premis:rightsGrantedNote>
</premis:rightsGranted>
<premis:linkingObjectIdentifier>
<premis:linkingObjectIdentifierType>UUID</premis:linkingObjectIdentifierType>
<premis:linkingObjectIdentifierValue>4d1b15a5-f5ea-43b6-a02a-a9be2eaab383</premis:linkingObjectIdentifierValue>
</premis:linkingObjectIdentifier>
</premis:rightsStatement>
</mets:xmlData>
</mets:mdWrap>
</mets:rightsMD>
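A rights statement can be reduced to a list of grants with their terms. The following Python sketch extracts the act, restriction and term of each rightsGranted element from a trimmed copy of the example above. It assumes that both term dates are present and are plain ISO dates, which may not hold for every rights statement; the function names list_grants and in_term and the amdSec ID in the sample are assumptions for this example.

```python
import xml.etree.ElementTree as ET
from datetime import date

NS = {
    "mets": "http://www.loc.gov/METS/",
    "premis": "http://www.loc.gov/premis/v3",
}

# Trimmed copy of the rightsMD example above; the amdSec ID is hypothetical.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:premis="http://www.loc.gov/premis/v3">
  <mets:amdSec ID="amdSec_6">
    <mets:rightsMD ID="rightsMD_1">
      <mets:mdWrap MDTYPE="PREMIS:RIGHTS">
        <mets:xmlData>
          <premis:rightsStatement>
            <premis:rightsBasis>Copyright</premis:rightsBasis>
            <premis:rightsGranted>
              <premis:act>Act 1</premis:act>
              <premis:restriction>Allow</premis:restriction>
              <premis:termOfGrant>
                <premis:startDate>2004-04-04</premis:startDate>
                <premis:endDate>2005-05-05</premis:endDate>
              </premis:termOfGrant>
            </premis:rightsGranted>
          </premis:rightsStatement>
        </mets:xmlData>
      </mets:mdWrap>
    </mets:rightsMD>
  </mets:amdSec>
</mets:mets>"""


def list_grants(mets_xml: str) -> list:
    """Return one dict per premis:rightsGranted element."""
    root = ET.fromstring(mets_xml)
    grants = []
    for stmt in root.findall(".//premis:rightsStatement", NS):
        basis = stmt.findtext("premis:rightsBasis", namespaces=NS)
        for granted in stmt.findall("premis:rightsGranted", NS):
            term = granted.find("premis:termOfGrant", NS)
            grants.append({
                "basis": basis,
                "act": granted.findtext("premis:act", namespaces=NS),
                "restriction": granted.findtext(
                    "premis:restriction", namespaces=NS
                ),
                # Assumption: both dates exist and are ISO formatted.
                "start": date.fromisoformat(
                    term.findtext("premis:startDate", namespaces=NS)
                ),
                "end": date.fromisoformat(
                    term.findtext("premis:endDate", namespaces=NS)
                ),
            })
    return grants


def in_term(grant: dict, day: date) -> bool:
    """True if the day falls within the grant's term (inclusive)."""
    return grant["start"] <= day <= grant["end"]


grants = list_grants(SAMPLE)
print(grants[0]["act"], in_term(grants[0], date(2004, 12, 31)))
```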
An amdSec will also have zero or more source metadata, or sourceMD, sections. This is the only non-PREMIS section in an amdSec and no XML schema is used for the contents. SourceMD is created from the contents of the bag-info.txt whenever a zipped or unzipped bag transfer type is used. It is also created for a disk image transfer if the user entered metadata about the disk image in the Transfer tab.
<fileSec>
There is one fileSec listing all files. The fileSec is organized into two or more of the following fileGrps:
- original
- preservation
- service
- submissionDocumentation
- metadata
- license
- text/ocr
- deleted
These are known as USE attributes and must accompany each fileGrp. The fileGrp original is required for all METS files. If the AIP includes files normalized for preservation, then the fileGrp preservation is used. The service fileGrp may be used if the AIP contains a service sub-folder (e.g. as the output of digitization workflows).
Each file in a fileGrp will have a GROUPID used to relate different versions of files and an ID used later as the FILEID to identify the file in the Structural Map.
The GROUPID includes the UUID (from the objectIdentifierValue) of the original file. The ID is the UUID (from the objectIdentifierValue) of the file being described. This means original files will have the same GROUPID and ID. However, related files (such as those normalized for preservation) will have the same GROUPID as the original file but a different ID.
Below is an example of the original fileGrp section in the AIP METS file:
<mets:fileSec>
<mets:fileGrp USE="original">
<mets:file GROUPID="Group-d0c46bbb-63b1-4530-ad1f-f65d9a32e434" ID="file-d0c46bbb-63b1-4530-ad1f-f65d9a32e434" ADMID="amdSec_2">
<mets:FLocat xlink:href="objects/View_from_lookout_over_Queenstown_towards_the_Remarkables_in_spring.jpg" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
</mets:file>
<mets:file GROUPID="Group-b255e64f-4478-4010-961f-34cc2ba3ebf9" ID="file-b255e64f-4478-4010-961f-34cc2ba3ebf9" ADMID="amdSec_4">
<mets:FLocat xlink:href="objects/artwork/MARBLES.TGA" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
</mets:file>
<mets:file GROUPID="Group-4d1b15a5-f5ea-43b6-a02a-a9be2eaab383" ID="file-4d1b15a5-f5ea-43b6-a02a-a9be2eaab383" ADMID="amdSec_6">
<mets:FLocat xlink:href="objects/beihai.tif" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
</mets:file>
<mets:file GROUPID="Group-38c544d8-ed55-45e3-9d0f-3de6352bb13d" ID="file-38c544d8-ed55-45e3-9d0f-3de6352bb13d" ADMID="amdSec_8">
<mets:FLocat xlink:href="objects/bird.mp3" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
</mets:file>
<mets:file GROUPID="Group-294a27e3-95de-4036-a092-e65401b8564a" ID="file-294a27e3-95de-4036-a092-e65401b8564a" ADMID="amdSec_18">
<mets:FLocat xlink:href="objects/ocr-image.png" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
</mets:file>
<mets:file GROUPID="Group-4c8ccb7f-ab05-48a4-b8e3-d67056ab27bb" ID="file-4c8ccb7f-ab05-48a4-b8e3-d67056ab27bb" ADMID="amdSec_19">
<mets:FLocat xlink:href="objects/piiTestDataCreditCardNumbers.txt" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
</mets:file>
</mets:fileGrp>
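The GROUPID makes it possible to pair each original file with its derivatives. The Python sketch below groups file locations by GROUPID and USE. The sample is hypothetical: it combines the first original file from the example above with a matching preservation file whose path is inferred from the techMD example, and the function name files_by_group is an assumption.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {"mets": "http://www.loc.gov/METS/"}
# ElementTree exposes namespaced attributes as "{namespace}name".
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"

# Hypothetical sample with one original and its preservation derivative.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/"
    xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="original">
      <mets:file GROUPID="Group-d0c46bbb-63b1-4530-ad1f-f65d9a32e434"
                 ID="file-d0c46bbb-63b1-4530-ad1f-f65d9a32e434" ADMID="amdSec_2">
        <mets:FLocat xlink:href="objects/View_from_lookout_over_Queenstown_towards_the_Remarkables_in_spring.jpg"
                     LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="preservation">
      <mets:file GROUPID="Group-d0c46bbb-63b1-4530-ad1f-f65d9a32e434"
                 ID="file-d8717b3a-d12c-408a-9c37-732425331f44" ADMID="amdSec_1">
        <mets:FLocat xlink:href="objects/View_from_lookout_over_Queenstown_towards_the_Remarkables_in_spring-d8717b3a-d12c-408a-9c37-732425331f44.tif"
                     LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>"""


def files_by_group(mets_xml: str) -> dict:
    """Return {GROUPID: {USE: [file paths]}} for every file in the fileSec."""
    root = ET.fromstring(mets_xml)
    groups = defaultdict(dict)
    for grp in root.findall("mets:fileSec/mets:fileGrp", NS):
        use = grp.get("USE")
        for file_el in grp.findall("mets:file", NS):
            href = file_el.find("mets:FLocat", NS).get(XLINK_HREF)
            groups[file_el.get("GROUPID")].setdefault(use, []).append(href)
    return dict(groups)


groups = files_by_group(SAMPLE)
for group_id, by_use in groups.items():
    print(group_id, sorted(by_use))
```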
<structMap>
There will be at least one physical Structural Map, or structMap, in the METS. It is labeled “Archivematica default” and lists the directories and files in the objects directory as they are laid out on disk.
<mets:structMap ID="structMap_1" LABEL="Archivematica default" TYPE="physical">
<mets:div LABEL="demo-1-6a82ffa2-91e2-48e1-b0d0-55b0de21568e" TYPE="Directory" DMDID="dmdSec_1">
<mets:div LABEL="objects" TYPE="Directory" DMDID="dmdSec_8">
<mets:div LABEL="View_from_lookout_over_Queenstown_towards_the_Remarkables_in_spring-d8717b3a-d12c-408a-9c37-732425331f44.tif" TYPE="Item">
<mets:fptr FILEID="file-d8717b3a-d12c-408a-9c37-732425331f44"/>
</mets:div>
<mets:div LABEL="View_from_lookout_over_Queenstown_towards_the_Remarkables_in_spring.jpg" TYPE="Item" DMDID="dmdSec_2">
<mets:fptr FILEID="file-d0c46bbb-63b1-4530-ad1f-f65d9a32e434"/>
</mets:div>
<mets:div LABEL="artwork" TYPE="Directory" DMDID="dmdSec_3">
<mets:div LABEL="MARBLES-7733ab3a-656e-4a8d-986c-1fc5ab890d7e.tif" TYPE="Item">
<mets:fptr FILEID="file-7733ab3a-656e-4a8d-986c-1fc5ab890d7e"/>
</mets:div>
<mets:div LABEL="MARBLES.TGA" TYPE="Item" DMDID="dmdSec_4">
<mets:fptr FILEID="file-b255e64f-4478-4010-961f-34cc2ba3ebf9"/>
</mets:div>
</mets:div>
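The nested div elements can be walked recursively to rebuild the on-disk paths and the FILEID each one points to. The Python sketch below does this for a small, hypothetical structMap modelled on the excerpt above; the directory label demo-1 and the function names walk and list_items are assumptions for this example.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}

# Hypothetical, shortened physical structMap.
SAMPLE = """<mets:mets xmlns:mets="http://www.loc.gov/METS/">
  <mets:structMap ID="structMap_1" LABEL="Archivematica default" TYPE="physical">
    <mets:div LABEL="demo-1" TYPE="Directory">
      <mets:div LABEL="objects" TYPE="Directory">
        <mets:div LABEL="bird.mp3" TYPE="Item">
          <mets:fptr FILEID="file-38c544d8-ed55-45e3-9d0f-3de6352bb13d"/>
        </mets:div>
        <mets:div LABEL="artwork" TYPE="Directory">
          <mets:div LABEL="MARBLES.TGA" TYPE="Item">
            <mets:fptr FILEID="file-b255e64f-4478-4010-961f-34cc2ba3ebf9"/>
          </mets:div>
        </mets:div>
      </mets:div>
    </mets:div>
  </mets:structMap>
</mets:mets>"""


def walk(div, prefix=()):
    """Yield (path, FILEID) for every Item div below the given div."""
    path = prefix + (div.get("LABEL"),)
    if div.get("TYPE") == "Item":
        fptr = div.find("mets:fptr", NS)
        yield "/".join(path), fptr.get("FILEID")
    for child in div.findall("mets:div", NS):
        yield from walk(child, path)


def list_items(mets_xml: str) -> list:
    """List the files recorded in the physical structMap, in document order."""
    root = ET.fromstring(mets_xml)
    struct_map = root.find("mets:structMap[@TYPE='physical']", NS)
    items = []
    for top in struct_map.findall("mets:div", NS):
        items.extend(walk(top))
    return items


items = list_items(SAMPLE)
for path, file_id in items:
    print(path, file_id)
```

Each FILEID returned here matches the ID of a file element in the fileSec, so the two sections can be joined on that value.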
If you have chosen to run the document empty directories microservice, then a second structMap is created. It is labeled “Normative Directory Structure”. At AIP re-ingest this ‘logical’ structMap will be parsed to re-create the empty directories.
A logical structMap describes a relationship between objects in an Archivematica AIP that might not be true to their ‘physical’ layout on disk. That is the case here: the empty directories have been removed from disk but are still recorded so that they can be recreated in the future.
If a user includes a custom structMap in the transfer, then this will also be included in the METS file.
Archivematica 1.14.1
License
Archivematica documentation by Artefactual Systems Inc. is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.