DSpace exports
Archivematica can be used to build a “dark archive” for a DSpace repository; that is, it can provide back-end preservation functionality while DSpace remains the user deposit and access system. Material exported from DSpace must be placed in a transfer source location that the Archivematica pipeline can access.
DSpace version compatibility
Archivematica has been tested using exports from DSpace 1.7.x. Ingest has not been tested on exports from DSpace 1.8.x; however, the DSpace AIP export structure did not change between 1.7.x and 1.8.x, so ingest is expected to behave identically.
DSpace export structure
Archivematica expects standard DSpace exports, where each DSpace item is packaged as a ZIP file. Typically, the DSpace item ZIP file will contain the uploaded object plus a license file, a METS file and possibly an OCR text file.
In the example pictured above, the DSpace item includes four files:
bitstream_39691.txt: the OCR text file.
bitstream_8272.pdf: the original object deposited in DSpace.
bitstream_8273: the license file.
mets.xml: the METS file for the item.
Archivematica will reuse portions of the DSpace METS file to populate the METS file that is generated for the AIP.
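Because Archivematica relies on the DSpace METS file, it can be useful to confirm that an item ZIP contains mets.xml before starting a transfer. The following is a minimal sketch using only the Python standard library; the function name summarize_item_zip is hypothetical and is not part of Archivematica or DSpace.

```python
import zipfile


def summarize_item_zip(zip_path):
    """List the members of a DSpace item ZIP and report whether mets.xml is present.

    Hypothetical helper for illustration; it only inspects member names and
    does not validate the METS file itself.
    """
    with zipfile.ZipFile(zip_path) as archive:
        names = sorted(archive.namelist())
    return {"members": names, "has_mets": "mets.xml" in names}
```

For example, summarize_item_zip("ITEM@2429-1521.zip") would return the sorted member names and a has_mets flag for that item, assuming the file exists in the current directory.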
You can either transfer items one-by-one or place them within a directory to transfer multiple items at a time. Depending on the setup of your DSpace instance and your export parameters, the export may contain multiple individual items as well as a collection-level description, as in the example below.
Processing DSpace exports
Place your exported material in a transfer source location that your Archivematica pipeline can access.
On the Transfer tab in Archivematica, use the Transfer type dropdown menu to select the DSpace transfer type.
Use the Browse button to find your DSpace export and select either an individual item or a collection of items. In the screenshot below, a collection of items has been selected. Click Add.
If you selected a collection of items, enter a name for the transfer in the Transfer name box.
If you selected an individual DSpace item, you do not need to give your transfer a name. The name of the item (e.g. ITEM@2429-1521.zip) will be used as the transfer name.
Click Start transfer and process as required.
DSpace AIP METS files
Each object in the AIP has two DSpace-specific descriptive metadata sections (dmdSec). The first contains XPointers to the descriptive and rights metadata in the original mets.xml files exported from DSpace.
<mets:dmdSec ID="dmdSec_2">
  <mets:mdRef LABEL="mets.xml-Group-2f7c645f-a0b0-4d36-9933-0c8d9e47784b" xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/mets.xml" MDTYPE="OTHER" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM" XPTR="xpointer(id('dmdSec_366 dmdSec_367'))"/>
</mets:dmdSec>
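To follow these pointers programmatically, the XPTR attribute can be split into the dmdSec IDs it names. The sketch below assumes the XPTR value always has the xpointer(id('...')) form shown above; the function name is hypothetical, and the xmlns declarations in the sample are added only so the fragment parses on its own (in an AIP METS file they are declared on the root element).

```python
import re
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Sample copied from the dmdSec above, with namespace declarations added.
SAMPLE_DMDSEC = """
<mets:dmdSec xmlns:mets="http://www.loc.gov/METS/"
             xmlns:xlink="http://www.w3.org/1999/xlink" ID="dmdSec_2">
  <mets:mdRef LABEL="mets.xml-Group-2f7c645f-a0b0-4d36-9933-0c8d9e47784b"
              xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/mets.xml"
              MDTYPE="OTHER" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"
              XPTR="xpointer(id('dmdSec_366 dmdSec_367'))"/>
</mets:dmdSec>
""".strip()


def referenced_dmdsec_ids(dmdsec_xml):
    """Return (href, [ids]) for the mdRef in a dmdSec, or None if there is no mdRef."""
    root = ET.fromstring(dmdsec_xml)
    md_ref = root.find(f"{{{METS_NS}}}mdRef")
    if md_ref is None:
        return None
    # Pull the space-separated ID list out of xpointer(id('...')).
    match = re.search(r"id\('([^']*)'\)", md_ref.get("XPTR", ""))
    ids = match.group(1).split() if match else []
    return md_ref.get(f"{{{XLINK_NS}}}href"), ids


if __name__ == "__main__":
    print(referenced_dmdsec_ids(SAMPLE_DMDSEC))
```

The returned IDs refer to sections inside the DSpace mets.xml at the returned path, not to sections of the AIP METS file.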
The second dmdSec reflects the parent-child relationship between a DSpace object and its collection, using the handles as identifiers:
<mets:dmdSec ID="dmdSec_3">
  <mets:mdWrap MDTYPE="DC">
    <mets:xmlData>
      <dcterms:dublincore xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xsi:schemaLocation="http://purl.org/dc/terms/ https://dublincore.org/schemas/xmls/qdc/2008/02/11/dcterms.xsd">
        <dcterms:isPartOf>hdl:2429/1314</dcterms:isPartOf>
        <dc:identifier>hdl:2429/2700</dc:identifier>
      </dcterms:dublincore>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
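The item and collection handles can be read from this section with a namespace-aware lookup. This is a minimal sketch; the function name is hypothetical, the xmlns:mets declaration is added so the fragment parses on its own, and the xsi:schemaLocation attribute is omitted from the sample for brevity.

```python
import xml.etree.ElementTree as ET

NS = {
    "mets": "http://www.loc.gov/METS/",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

# Sample based on the dmdSec above, with the METS namespace declared locally.
SAMPLE_DC_DMDSEC = """
<mets:dmdSec xmlns:mets="http://www.loc.gov/METS/" ID="dmdSec_3">
  <mets:mdWrap MDTYPE="DC">
    <mets:xmlData>
      <dcterms:dublincore xmlns:dc="http://purl.org/dc/elements/1.1/"
                          xmlns:dcterms="http://purl.org/dc/terms/">
        <dcterms:isPartOf>hdl:2429/1314</dcterms:isPartOf>
        <dc:identifier>hdl:2429/2700</dc:identifier>
      </dcterms:dublincore>
    </mets:xmlData>
  </mets:mdWrap>
</mets:dmdSec>
""".strip()


def item_and_collection_handles(dmdsec_xml):
    """Return the item handle and its parent collection handle from a DC dmdSec."""
    root = ET.fromstring(dmdsec_xml)
    return {
        "item": root.findtext(".//dc:identifier", namespaces=NS),
        "collection": root.findtext(".//dcterms:isPartOf", namespaces=NS),
    }


if __name__ == "__main__":
    print(item_and_collection_handles(SAMPLE_DC_DMDSEC))
```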
In the file section (fileSec) of the AIP METS file, you can find information about the different types of DSpace files contained in the AIP. They are sorted by file group (fileGrp): original, submissionDocumentation (the original DSpace mets.xml files), preservation (if the material was normalized for preservation), license, and text/ocr.
<mets:fileSec>
  <mets:fileGrp USE="original">
    <mets:file GROUPID="Group-2f7c645f-a0b0-4d36-9933-0c8d9e47784b" ID="file-2f7c645f-a0b0-4d36-9933-0c8d9e47784b" ADMID="amdSec_2" DMDID="dmdSec_2 dmdSec_3">
      <mets:FLocat xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/bitstream_8266.pdf" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </mets:file>
  </mets:fileGrp>
  <mets:fileGrp USE="submissionDocumentation">
    <mets:file GROUPID="Group-2f7c645f-a0b0-4d36-9933-0c8d9e47784b" ID="file-6aeb8547-e866-4d47-ad11-35a1b85fe8e8" ADMID="amdSec_4">
      <mets:FLocat xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/mets.xml" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </mets:file>
    <mets:file GROUPID="Group-bd549305-08b7-4173-b594-d1acf2f23bdd" ID="file-bd549305-08b7-4173-b594-d1acf2f23bdd" ADMID="amdSec_5">
      <mets:FLocat xlink:href="objects/submissionDocumentation/transfer-ITEM_2429-2700-6c06573d-069d-4092-a98a-fbfbc29ee5fa/METS.xml" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </mets:file>
  </mets:fileGrp>
  <mets:fileGrp USE="license">
    <mets:file GROUPID="Group-2f7c645f-a0b0-4d36-9933-0c8d9e47784b" ID="file-06ce97e6-edfe-4a51-bec3-5cc7bed9e0b1" ADMID="amdSec_3">
      <mets:FLocat xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/bitstream_8267" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </mets:file>
  </mets:fileGrp>
  <mets:fileGrp USE="text/ocr">
    <mets:file GROUPID="Group-2f7c645f-a0b0-4d36-9933-0c8d9e47784b" ID="file-e20b2cb6-71b4-4323-b6c7-84c431c5e88f" ADMID="amdSec_1">
      <mets:FLocat xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/bitstream_40314.txt" LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </mets:file>
  </mets:fileGrp>
</mets:fileSec>
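To see which files fall into each file group, the fileSec can be walked and its FLocat paths collected by USE value. The following is a minimal sketch with a hypothetical function name; the sample is a shortened version of the fileSec above, with namespace declarations added so it parses on its own.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
XLINK_NS = "http://www.w3.org/1999/xlink"

# Shortened sample: two of the file groups shown above, most attributes omitted.
SAMPLE_FILESEC = """
<mets:fileSec xmlns:mets="http://www.loc.gov/METS/"
              xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileGrp USE="original">
    <mets:file ID="file-2f7c645f-a0b0-4d36-9933-0c8d9e47784b">
      <mets:FLocat xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/bitstream_8266.pdf"
                   LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </mets:file>
  </mets:fileGrp>
  <mets:fileGrp USE="license">
    <mets:file ID="file-06ce97e6-edfe-4a51-bec3-5cc7bed9e0b1">
      <mets:FLocat xlink:href="objects/ITEM_2429-2700.zip-2020-02-07T00_26_40.346055_00_00/bitstream_8267"
                   LOCTYPE="OTHER" OTHERLOCTYPE="SYSTEM"/>
    </mets:file>
  </mets:fileGrp>
</mets:fileSec>
""".strip()


def files_by_use(mets_xml):
    """Map each fileGrp USE value to the list of FLocat paths it contains."""
    root = ET.fromstring(mets_xml)
    groups = {}
    for file_grp in root.iter(f"{{{METS_NS}}}fileGrp"):
        paths = [
            flocat.get(f"{{{XLINK_NS}}}href")
            for flocat in file_grp.iter(f"{{{METS_NS}}}FLocat")
        ]
        # setdefault + extend keeps paths together if a USE value repeats.
        groups.setdefault(file_grp.get("USE"), []).extend(paths)
    return groups


if __name__ == "__main__":
    for use, paths in files_by_use(SAMPLE_FILESEC).items():
        print(use, paths)
```

The same function can be given the text of a complete AIP METS file, since it searches for fileGrp elements at any depth.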