Upgrade from Archivematica 1.15.x to 1.16.0¶
On this page:
- Clean up completed transfers watched directory
- Create a backup
- Upgrade on Ubuntu packages
- Upgrade on Rocky Linux/Red Hat packages
- Upgrade on Vagrant / Ansible
- Upgrade in indexless mode
- Upgrade with output capturing disabled
- Update search indices
- Review the processing configuration
- Migrate from MySQL 5.x to 8.x
Note
While it is possible to upgrade a GitHub-based source install using ansible, these instructions do not cover that scenario.
Clean up completed transfers watched directory¶
Note
Ignore this section if you are upgrading from Archivematica 1.11 or newer.
Upgrading from Archivematica 1.10.x or older to Archivematica 1.16.0 can result in a number of completed transfers appearing as failed in the Archivematica dashboard, as well as corresponding failure notification emails being sent. These are not actual failures, but are unintentional side effects of changes made in Archivematica 1.11 to the workflow and to how metadata files are stored and copied into the SIP.
To prevent these failures from occurring during an upgrade from Archivematica 1.10 or earlier:
1. Confirm that all transfers and ingests are complete. Check that there are no transfers or SIPs that are still being processed or awaiting decisions in the Transfer and Ingest tabs. If there are, finish processing the transfers/ingests before proceeding.
2. Delete all contents of the completedTransfers watched directory (a quick check that the deletion succeeded is sketched after this list):
sudo rm -rf /var/archivematica/sharedDirectory/watchedDirectories/SIPCreation/completedTransfers/*
3. Perform the upgrade as described below.
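For example, you can verify that the watched directory is empty before upgrading (a minimal check, using the same path as above):
ls -A /var/archivematica/sharedDirectory/watchedDirectories/SIPCreation/completedTransfers/
Empty output means there is nothing left behind that could reappear as a failed transfer after the upgrade.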
Create a backup¶
Before starting any upgrade procedure on a production system, we strongly recommend backing up your system. If you are using a virtual machine, take a snapshot of it before making any changes. Alternatively, back up the file systems being used by your system. Exact procedures for updating will depend on your local installation. At a minimum you should make backups of:
- The Storage Service SQLite (or MySQL) database
- The dashboard MySQL database
This is a simple example of backing up these two databases:
sudo cp /var/archivematica/storage-service/storage.db ~/storage_db_backup.db
mysqldump -u root -p MCP > ~/am_backup.sql
If you do not have a password set for the root user in MySQL, you can omit the -p flag. If there is a problem during the upgrade process, you can restore your MySQL database from this backup and try the upgrade again.
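For example, restoring from these backups could look like this (a sketch, assuming the backup file names created above and a dashboard database named MCP):
sudo cp ~/storage_db_backup.db /var/archivematica/storage-service/storage.db
mysql -u root -p MCP < ~/am_backup.sql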
If you are upgrading from Archivematica 1.8 or earlier to 1.9 or later, note that the supported Elasticsearch version changed from 1.x to 6.x. In that case we also recommend creating a backup of your Elasticsearch data, especially if you do not have access to the AIP storage locations on the local filesystem.
You can follow these steps to create a backup of Elasticsearch:
# Remove and recreate the folder that stores the backup
sudo rm -rf /var/lib/elasticsearch/backup-repo/
sudo mkdir -p /var/lib/elasticsearch/backup-repo/
sudo chown elasticsearch:elasticsearch /var/lib/elasticsearch/backup-repo/
# Allow Elasticsearch to write files to the backup
echo 'path.repo: ["/var/lib/elasticsearch/backup-repo"]' | sudo tee -a /etc/elasticsearch/elasticsearch.yml
# Restart Elasticsearch and wait for it to start
sudo service elasticsearch restart
sleep 60s
# Configure the ES backup repository
curl -XPUT "localhost:9200/_snapshot/backup-repo" -H 'Content-Type: application/json' -d \
'{
  "type": "fs",
  "settings": {
    "location": "./",
    "compress": true
  }
}'
# Take the actual backup, and copy it to a safe place
curl -X PUT "localhost:9200/_snapshot/backup-repo/am_indexes_backup?wait_for_completion=true"
cp -rf /var/lib/elasticsearch/backup-repo elasticsearch-backup
For more information, refer to the Elasticsearch 6.8 docs.
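To verify the snapshot, or to restore it later, you can use the same snapshot API (a sketch, assuming the repository and snapshot names used above):
# Check that the snapshot completed successfully
curl -X GET "localhost:9200/_snapshot/backup-repo/am_indexes_backup"
# Restore the snapshot (the affected indices must be closed or deleted first)
curl -X POST "localhost:9200/_snapshot/backup-repo/am_indexes_backup/_restore"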
Upgrade on Ubuntu packages¶
Update the operating system.
sudo apt-get update && sudo apt-get upgrade
Update package sources.
sudo sh -c 'echo "deb [arch=amd64] http://packages.archivematica.org/1.16.x/ubuntu jammy main" >> /etc/apt/sources.list'
sudo sh -c 'echo "deb [arch=amd64] http://packages.archivematica.org/1.16.x/ubuntu-externals jammy main" >> /etc/apt/sources.list'
Optionally you can remove the lines referencing packages.archivematica.org/1.15.x from /etc/apt/sources.list.
Update the Storage Service.
sudo apt-get update
sudo apt-get install archivematica-storage-service
Update Archivematica. During the update process you may be asked about updating configuration files. Choose to accept the maintainer's versions. You will also be asked about updating the database; say 'ok' to each of those steps. If you have set a password for the root MySQL database user, enter it when prompted.
sudo apt-get install archivematica-common
sudo apt-get install archivematica-dashboard
sudo apt-get install archivematica-mcp-server
sudo apt-get install archivematica-mcp-client
sudo apt-get install archivematica
Restart services.
sudo service archivematica-storage-service restart
sudo service gearman-job-server restart
sudo service archivematica-mcp-server restart
sudo service archivematica-mcp-client restart
sudo service archivematica-dashboard restart
sudo service nginx restart
Depending on your browser settings, you may need to clear your browser cache to make the dashboard pages load properly. For example, in Firefox or Chrome you should be able to bypass the cache with control-shift-R (or command-shift-R on macOS).
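Optionally, confirm which package versions are now installed (a quick check using the package names from the steps above):
apt-cache policy archivematica-dashboard archivematica-storage-service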
Upgrade on Rocky Linux/Red Hat packages¶
Upgrade the repositories for 1.16:
sudo sed -i 's/1.15.x/1.16.x/g' /etc/yum.repos.d/archivematica*
Remove the current installed version of ghostscript:
sudo rpm -e --nodeps ghostscript ghostscript-x11 \
    ghostscript-core ghostscript-fonts
Upgrade Archivematica packages:
sudo yum update
Apply the Archivematica database migrations:
sudo -u archivematica bash -c " \
set -a -e -x
source /etc/default/archivematica-dashboard || \
source /etc/sysconfig/archivematica-dashboard \
|| (echo 'Environment file not found'; exit 1)
cd /usr/share/archivematica/dashboard
/usr/share/archivematica/virtualenvs/archivematica/bin/python manage.py migrate --noinput
";
Apply the Storage Service database migrations:
Warning
In Archivematica 1.13 or newer, the default database backend is MySQL. Please follow our migration guide to move your data to a MySQL database before these migrations are applied.
If you want to continue using SQLite, please edit the environment configuration found in /etc/sysconfig/archivematica-storage-service. Comment out SS_DB_URL and indicate the path of the SQLite database with SS_DB_NAME, e.g.: SS_DB_NAME=/var/archivematica/storage-service/storage.db.
sudo -u archivematica bash -c " \
set -a -e -x
source /etc/default/archivematica-storage-service || \
source /etc/sysconfig/archivematica-storage-service \
|| (echo 'Environment file not found'; exit 1)
cd /usr/lib/archivematica/storage-service
/usr/share/archivematica/virtualenvs/archivematica-storage-service/bin/python manage.py migrate
";
Restart the Archivematica-related services, and continue using the system:
sudo systemctl restart archivematica-storage-service
sudo systemctl restart archivematica-dashboard
sudo systemctl restart archivematica-mcp-client
sudo systemctl restart archivematica-mcp-server
Depending on your browser settings, you may need to clear your browser cache to make the dashboard pages load properly. For example, in Firefox or Chrome you should be able to bypass the cache with control-shift-R (or command-shift-R on macOS).
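Optionally, confirm the installed versions and that the services came back up (a quick check; the package names are assumed to match the service names used above):
rpm -q archivematica-dashboard archivematica-storage-service
sudo systemctl --no-pager status archivematica-mcp-server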
Upgrade on Vagrant / Ansible¶
This upgrade method works with Vagrant machines, as well as with cloud-based virtual machines or physical servers.
Connect to your Vagrant machine or server.
vagrant ssh # Or ssh <your user>@<host>
Install Ansible.
sudo pip install ansible==2.9.10 jmespath jinja2==3.0.3
Check out the deployment repo:
git clone https://github.com/artefactual/deploy-pub.git
Go into the appropriate playbook folder and install the required roles.
Ubuntu 22.04 (Jammy):
cd deploy-pub/playbooks/archivematica-jammy
ansible-galaxy install -f -p roles/ -r requirements.yml
Rocky Linux 9:
cd deploy-pub/playbooks/archivematica-rocky9
ansible-galaxy install -f -p roles/ -r requirements.yml
All the following steps should be run from the respective playbook folder for your operating system.
Verify that vars-singlenode.yml has the appropriate contents for Elasticsearch and Archivematica, or update it with your own values.
Create a hosts file.
echo 'am-local ansible_connection=local' > hosts
Upgrade Archivematica by running:
ansible-playbook -i hosts singlenode.yml --tags=elasticsearch,archivematica-src
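If you want to preview what the playbook would change before applying it, you can try Ansible's check mode (a sketch; not every role necessarily supports check mode, so treat the output as indicative):
ansible-playbook -i hosts singlenode.yml --tags=elasticsearch,archivematica-src --check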
Upgrade in indexless mode¶
As of Archivematica 1.7, Archivematica can be run in indexless mode; that is, without Elasticsearch. Installing Archivematica without Elasticsearch, or with limited Elasticsearch functionality, means reduced consumption of compute resources and lower operational complexity. By setting the archivematica_src_search_enabled configuration attribute, administrators can define how much, if anything, Elasticsearch indexes. This can impact searching across several different dashboard pages.
Upgrade your existing Archivematica pipeline following the instructions above.
Modify the relevant systemd EnvironmentFile files by adding lines that set the relevant environment variables to false.
If you are using Ubuntu, run the following commands.
sudo sh -c 'echo "ARCHIVEMATICA_DASHBOARD_DASHBOARD_SEARCH_ENABLED=false" >> /etc/default/archivematica-dashboard' sudo sh -c 'echo "ARCHIVEMATICA_MCPSERVER_MCPSERVER_SEARCH_ENABLED=false" >> /etc/default/archivematica-mcp-server' sudo sh -c 'echo "ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_SEARCH_ENABLED=false" >> /etc/default/archivematica-mcp-client'
If you are using Rocky Linux, run the following commands.
sudo sh -c 'echo "ARCHIVEMATICA_DASHBOARD_DASHBOARD_SEARCH_ENABLED=false" >> /etc/sysconfig/archivematica-dashboard' sudo sh -c 'echo "ARCHIVEMATICA_MCPSERVER_MCPSERVER_SEARCH_ENABLED=false" >> /etc/sysconfig/archivematica-mcp-server' sudo sh -c 'echo "ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_SEARCH_ENABLED=false" >> /etc/sysconfig/archivematica-mcp-client'
Restart services.
If you are using Ubuntu, run the following commands.
sudo service archivematica-dashboard restart
sudo service archivematica-mcp-client restart
sudo service archivematica-mcp-server restart
If you are using Rocky Linux, run the following commands.
sudo -u root systemctl restart archivematica-dashboard
sudo -u root systemctl restart archivematica-mcp-client
sudo -u root systemctl restart archivematica-mcp-server
If you had previously installed and started the Elasticsearch service, you can turn it off now.
sudo -u root systemctl stop elasticsearch
sudo -u root systemctl disable elasticsearch
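To confirm the change took effect, you can inspect the environment files (a quick check, assuming the Ubuntu paths; use /etc/sysconfig/archivematica-* on Rocky Linux):
grep SEARCH_ENABLED /etc/default/archivematica-*
After the restart, search-backed dashboard features such as the Backlog and Archival Storage searches should no longer be available.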
Upgrade with output capturing disabled¶
As of Archivematica 1.7.1, output capturing can be disabled at upgrade or at any other time. This means the stdout and stderr from preservation tasks are not captured, which can result in a performance improvement. See the Task output capturing configuration page for more details. To disable output capturing, set the ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CAPTURE_CLIENT_SCRIPT_OUTPUT environment variable to false and restart the MCP Client process(es), as sketched below. Consult the installation instructions for your deployment method for more details on how to set environment variables and restart Archivematica processes.
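On an Ubuntu package install, for example, this could look as follows (a sketch mirroring the pattern used for the search variables above; on Rocky Linux use /etc/sysconfig/archivematica-mcp-client and systemctl instead):
sudo sh -c 'echo "ARCHIVEMATICA_MCPCLIENT_MCPCLIENT_CAPTURE_CLIENT_SCRIPT_OUTPUT=false" >> /etc/default/archivematica-mcp-client'
sudo service archivematica-mcp-client restart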
Update search indices¶
Note
Ignore this section if you are planning to run Archivematica without search indices.
Archivematica releases may introduce changes that require updating the search indices to function properly, e.g. Archivematica v1.12.0 introduced new fields to the search indices and made some changes to text field types. Please keep an eye on our release notes before you start the upgrade.
The update can be accomplished in one of two ways. Preferably, you can reindex the documents, which is usually faster because it reuses the documents you already have indexed. We would love to know if this does not work for you, but when that is the case, it is possible to recreate the indices, which takes much longer to complete because it accesses the original data, e.g. your AIPs.
Reindex the documents¶
In Elasticsearch, it is possible to add new fields to search indices but it is not possible to update existing ones. The recommended strategy is to create new indices with our desired mapping and reindex our documents. This is based on the Reindex API.
It is a multi-step process that we have automated with a script: es-reindex.sh. Please follow the link and read the instructions carefully.
Warning
Before you continue, we recommend backing up your Elasticsearch data. Please read the official docs for instructions.
Note
We may implement this script as a Django command in the future for better usability. For the time being, please download the script and tweak as needed.
Recreate the indices¶
This method will allow you to delete and rebuild the existing Elasticsearch indices so that all the Backlog and Archival Storage column fields are fully populated, including for transfers and AIPs ingested prior to the upgrade to Archivematica 1.16.0. Run the commands described in Rebuild the indexes to fully delete and rebuild the indices.
Execution example:
sudo -u archivematica bash -c " \
set -a -e -x
source /etc/default/archivematica-dashboard || \
source /etc/sysconfig/archivematica-dashboard \
|| (echo 'Environment file not found'; exit 1)
cd /usr/share/archivematica/dashboard
/usr/share/archivematica/virtualenvs/archivematica/bin/python \
manage.py rebuild_transfer_backlog --from-storage-service --no-prompt
";
sudo -u archivematica bash -c " \
set -a -e -x
source /etc/default/archivematica-dashboard || \
source /etc/sysconfig/archivematica-dashboard \
|| (echo 'Environment file not found'; exit 1)
cd /usr/share/archivematica/dashboard
/usr/share/archivematica/virtualenvs/archivematica/bin/python \
manage.py rebuild_aip_index_from_storage_service --delete-all
";
Note
Please note that the use of encrypted or remote Transfer Backlog and AIP Store locations may require rebuilding the indices from the Storage Service API rather than from the filesystem. At this time, it is not possible to rebuild the indices for all types of remote locations.
Note
Please note that executing these commands may take a long time for large AIP and Transfer Backlog storage locations, especially if the packages are stored compressed or encrypted, or if you are using a third-party service. If that is the case, you may want to reindex the Elasticsearch documents instead.
Review the processing configuration¶
After any Archivematica upgrade, it is recommended to perform a sanity check on your processing configurations. Look for new decision points where you want to establish a default, like the new “Scan for viruses” introduced in Archivematica 1.13.
The default and automated bundled configurations can be reset to the Archivematica defaults.
Migrate from MySQL 5.x to 8.x¶
It is recommended that the MySQL databases for Archivematica and the Storage Service use the MySQL 8 utf8mb4 character set and its default collation utf8mb4_0900_ai_ci (or utf8mb4_general_ci in MariaDB).
If you migrated your databases from MySQL 5.x, you can check the character set and collation of their tables with:
SELECT
t.table_schema, t.table_name, c.character_set_name, t.table_collation
FROM
information_schema.tables t,
information_schema.collation_character_set_applicability c
WHERE
c.collation_name = t.table_collation
AND t.table_type = 'BASE TABLE'
AND (t.table_schema = 'MCP' OR t.table_schema = 'SS');
If they use the utf8mb3 character set and collation, you should update them to avoid potential migration conflicts like this:
Running migrations:
Applying admin.0003_logentry_add_action_flag_choices... OK
Applying auth.0009_alter_user_last_name_max_length... OK
Applying auth.0010_alter_group_name_max_length... OK
Applying auth.0011_update_proxy_permissions... OK
Applying auth.0012_alter_user_first_name_max_length... OK
Applying locations.0031_rclone_space...Traceback (most recent call last):
File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/django/db/backends/mysql/base.py", line 73, in execute
return self.cursor.execute(query, args)
File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/MySQLdb/cursors.py", line 179, in execute
res = self._query(mogrified_query)
File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/MySQLdb/cursors.py", line 330, in _query
db.query(q)
File "/pyenv/data/versions/3.9.18/lib/python3.9/site-packages/MySQLdb/connections.py", line 255, in query
_mysql.connection.query(self, query)
MySQLdb.OperationalError: (3780, "Referencing column 'space_id' and referenced column 'uuid' in foreign key constraint 'locations_rclone_space_id_adb7fd1d_fk_locations_space_uuid' are incompatible.")
django.db.utils.OperationalError: (3780, "Referencing column 'space_id' and referenced column 'uuid' in foreign key constraint 'locations_rclone_space_id_adb7fd1d_fk_locations_space_uuid' are incompatible.")
The following script can be used as a reference to update the character set of the databases and their tables.
#!/usr/bin/env bash
set -o errexit # abort on nonzero exitstatus
set -o nounset # abort on unbound variable
set -o pipefail # do not hide errors within pipes
# Array of database names
DATABASES=(
    MCP
    SS
)
# Collation and CHARSET
CHARSET="utf8mb4"
COLLATION="utf8mb4_0900_ai_ci"
# MySQL authentication (optional, default no auth)
MYSQL_USE_AUTH=False
MYSQL_USER=root
MYSQL_PASSWORD="THE_PASSWORD"
# Function to execute a query
execute_query() {
    local query="$1"
    local db_name="$2"
    local user_arg=""
    if [ "$MYSQL_USE_AUTH" = "True" ]; then
        user_arg="-u$MYSQL_USER"
        export MYSQL_PWD="$MYSQL_PASSWORD"
    fi
    mysql -N -B $user_arg -e "$query" "$db_name"
}
# Function to fix database charset and collation
fix_database_charset() {
    local query="ALTER DATABASE ${DB_NAME} CHARACTER SET $CHARSET COLLATE $COLLATION;"
    echo "Fixing database charset and collation"
    execute_query "$query" "$DB_NAME"
    echo "Fixed database charset and collation"
}
# Function to fix tables charset and collation
fix_tables_charset() {
    local query="SELECT CONCAT('ALTER TABLE \`', table_name, '\` CHARACTER SET $CHARSET COLLATE $COLLATION;') \
        FROM information_schema.TABLES AS T, information_schema.\`COLLATION_CHARACTER_SET_APPLICABILITY\` AS C \
        WHERE C.collation_name = T.table_collation \
        AND T.table_schema = '$DB_NAME' \
        AND (C.CHARACTER_SET_NAME != '$CHARSET' OR C.COLLATION_NAME != '$COLLATION');"
    local alter_table_queries=$(execute_query "$query" "$DB_NAME")
    alter_table_queries_no_foreign_key_checks=$(echo -e "SET FOREIGN_KEY_CHECKS=0;\n$alter_table_queries\nSET FOREIGN_KEY_CHECKS=1;")
    # echo "$alter_table_queries_no_foreign_key_checks"
    echo "Fixing tables charset and collation"
    execute_query "$alter_table_queries_no_foreign_key_checks" "$DB_NAME"
    echo "Fixed tables charset and collation"
}
# Function to fix column collation for varchar columns
fix_varchar_columns_collation() {
    local query="SELECT CONCAT('ALTER TABLE \`', table_name, '\` MODIFY \`', column_name, '\` ', DATA_TYPE, \
        '(', CHARACTER_MAXIMUM_LENGTH, ') CHARACTER SET $CHARSET COLLATE $COLLATION', \
        (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';') \
        FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = '$DB_NAME' AND DATA_TYPE = 'varchar' AND \
        ( CHARACTER_SET_NAME != '$CHARSET' OR COLLATION_NAME != '$COLLATION');"
    local alter_table_queries=$(execute_query "$query" "$DB_NAME")
    alter_table_queries_no_foreign_key_checks=$(echo -e "SET FOREIGN_KEY_CHECKS=0;\n$alter_table_queries\nSET FOREIGN_KEY_CHECKS=1;")
    # echo "$alter_table_queries_no_foreign_key_checks"
    echo "Fixing column collation for varchar columns"
    execute_query "$alter_table_queries_no_foreign_key_checks" "$DB_NAME"
    echo "Fixed column collation for varchar columns"
}
# Function to fix column collation for non-varchar columns
fix_non_varchar_columns_collation() {
    local query="SELECT CONCAT('ALTER TABLE \`', table_name, '\` MODIFY \`', column_name, '\` ', DATA_TYPE, ' \
        CHARACTER SET $CHARSET COLLATE $COLLATION', (CASE WHEN IS_NULLABLE = 'NO' THEN ' NOT NULL' ELSE '' END), ';') \
        FROM information_schema.COLUMNS \
        WHERE TABLE_SCHEMA = '$DB_NAME' \
        AND DATA_TYPE != 'varchar' \
        AND (CHARACTER_SET_NAME != '$CHARSET' OR COLLATION_NAME != '$COLLATION');"
    local alter_table_queries=$(execute_query "$query" "$DB_NAME")
    alter_table_queries_no_foreign_key_checks=$(echo -e "SET FOREIGN_KEY_CHECKS=0;\n$alter_table_queries\nSET FOREIGN_KEY_CHECKS=1;")
    # echo "$alter_table_queries_no_foreign_key_checks"
    echo "Fixing column collation for non-varchar columns"
    execute_query "$alter_table_queries_no_foreign_key_checks" "$DB_NAME"
    echo "Fixed column collation for non-varchar columns"
}
# Loop through each database in the array
for DB_NAME in "${DATABASES[@]}"; do
    echo "Processing database: $DB_NAME"
    fix_database_charset
    fix_tables_charset
    fix_varchar_columns_collation
    fix_non_varchar_columns_collation
    echo "Migration completed for $DB_NAME"
done
# Unset the MYSQL_PWD environment variable after executing the queries
unset MYSQL_PWD
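A possible way to run the script (a sketch, assuming you saved it as fix-charsets.sh and adjusted the CHARSET, COLLATION and MySQL authentication variables at the top first):
chmod +x fix-charsets.sh
./fix-charsets.sh
After it completes, re-run the SELECT query shown earlier to confirm that all tables in MCP and SS now report the utf8mb4 character set.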