How does backup, restore and data copy work in Pure?
Backup of your Pure data
This document explains how we perform backups of Elsevier-hosted Pure instances, including:
- how long we keep the backups
- what options we have for restore and data copy (e.g. from production to staging).
We follow best practices in data management and have many processes in place to ensure that you can recover data and that the process of transferring data between Pure instances works smoothly.
Database
Every night, we take a snapshot of the data store. The snapshot contains:
- Pure database
- Audit files
- Binary files
- Application configuration
We store each snapshot for 180 days, and snapshots can be retrieved near-instantly.
Binary files
The Preserved Content Update job syncs all files in /data/pure_data/perm/local to an S3 bucket specific to your Pure instance.
The S3 bucket is versioned, so we can roll it back by up to 180 days. Note: The database must be in the same state as this bucket, because the database holds all metadata for the binary files. This means the database will be rolled back as well.
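To illustrate what rolling back a versioned file store means in principle, the minimal sketch below picks, for each file, the newest version that already existed at a chosen restore point. The `FileVersion` record and the `versions_as_of` function are hypothetical names used only for illustration; this is not the actual mechanism used by Pure or by S3, and it ignores deleted files.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Dict, Iterable


@dataclass(frozen=True)
class FileVersion:
    """Hypothetical record describing one stored version of a file."""
    key: str            # path of the file in the bucket
    version_id: str     # identifier of this stored version
    modified: datetime  # when this version was written


def versions_as_of(versions: Iterable[FileVersion],
                   cutoff: datetime) -> Dict[str, FileVersion]:
    """For each key, return the newest version written at or before cutoff."""
    chosen: Dict[str, FileVersion] = {}
    for v in versions:
        if v.modified > cutoff:
            continue  # written after the restore point, so it is ignored
        current = chosen.get(v.key)
        if current is None or v.modified > current.modified:
            chosen[v.key] = v
    return chosen
```

The restore point passed as `cutoff` would be the same moment the database snapshot was taken, so that the file versions and the metadata in the database describe the same state.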
Data copy
We have the capability to:
- copy data from one Pure instance or environment to another
- restore your Pure to an earlier state for disaster-recovery purposes
- roll back data on a Pure instance (e.g. if you have done something irreversible with your data) to a snapshot less than 180 days old.
Process
The first step is to agree a course of action with the customer and schedule a timeslot. We will not perform any data tasks unless these have been specifically requested and agreed to by the customer.
We require the following information:
- Source server
- Destination server
- Number of days you want to go back (1-180).
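As a rough sketch of how the three pieces of information fit together, the example below validates a request and works out which nightly snapshot it refers to. The `DataCopyRequest` class and its field names are hypothetical and exist only for illustration; requests are agreed with us directly, not submitted through code.

```python
from dataclasses import dataclass
from datetime import date, timedelta

MAX_DAYS_BACK = 180  # snapshots are kept for 180 days


@dataclass(frozen=True)
class DataCopyRequest:
    """Hypothetical container for the information listed above."""
    source_server: str
    destination_server: str
    days_back: int

    def validate(self) -> None:
        """Raise ValueError if the request is incomplete or out of range."""
        if not self.source_server or not self.destination_server:
            raise ValueError("source and destination servers are required")
        if not 1 <= self.days_back <= MAX_DAYS_BACK:
            raise ValueError(
                f"days_back must be between 1 and {MAX_DAYS_BACK}")

    def snapshot_date(self, today: date) -> date:
        """Return the date of the nightly snapshot this request refers to."""
        self.validate()
        return today - timedelta(days=self.days_back)
```

For a rollback on a single environment, the source and destination server would simply be the same.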
When the scheduled task runs, it does the following:
- Stop Pure
- Attach the source snapshot to the destination instance
- Start Pure
- Wait until Pure is ready
- Update mount points
- Update Pure Portal API Key
- Start re-index
- Update settings for S3 filestore repository.
If a newer version of Pure is required, we then install that version manually.
Note: The Pure version used must be equal to or higher than the Pure version used when the backup was created. For example, if the snapshot was made on Pure 5.14.3, we cannot install Pure 5.13.3 on it.
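The version rule in the note can be expressed as a simple comparison. The sketch below assumes version numbers are purely numeric and dot-separated, as in the example above; the function names are hypothetical and chosen for illustration.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a version string such as '5.14.3' into (5, 14, 3)."""
    return tuple(int(part) for part in version.split("."))


def can_install(target_version: str, snapshot_version: str) -> bool:
    """Return True if target_version is equal to or higher than
    the version the snapshot was created on."""
    return parse_version(target_version) >= parse_version(snapshot_version)
```

Comparing the parsed numbers instead of the raw strings matters: as plain text, "5.9.0" would sort after "5.14.3", even though 5.14.3 is the newer version.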
Before requesting a data copy, we strongly suggest that you create a local admin account on the system that we are copying from, so that you can always log in and make changes, for instance to your single sign-on solution.
Cron jobs
Pure will pause cron jobs automatically if it detects that there was an environment change (e.g. from production to staging).
If you need cron jobs in the new environment, you need to go in and actively acknowledge the environment change through the prompt shown in the Administrator tab, after which cron jobs will start to run again.
If you copy data from production to staging, Pure will detect that it is running in a test/staging environment and will force ORCID to run in sandbox mode.
If it's a rollback on an environment (e.g. staging to staging), the snapshot will be from the same environment, so Pure will not detect an environment change (and no actions are required).
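The behaviour described in this section can be summarised as a small decision rule. The sketch below is an illustration only: the function names and the environment labels ("production", "staging", "test") are assumptions, and Pure's internal detection logic may differ.

```python
# Assumed labels for non-production environments (illustrative only).
NON_PRODUCTION_ENVIRONMENTS = {"test", "staging"}


def cron_jobs_paused(snapshot_env: str, current_env: str,
                     acknowledged: bool = False) -> bool:
    """Cron jobs stay paused after an environment change until an
    administrator acknowledges the change."""
    environment_changed = snapshot_env != current_env
    return environment_changed and not acknowledged


def orcid_sandbox_forced(current_env: str) -> bool:
    """ORCID runs in sandbox mode on test/staging environments."""
    return current_env in NON_PRODUCTION_ENVIRONMENTS
```

For example, a production-to-staging copy gives `cron_jobs_paused("production", "staging")` as true until the change is acknowledged, while a staging-to-staging rollback never pauses cron jobs.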
Pure Portal
After the data copy/restore has finished, we need to initiate a full re-harvest of the Pure Portal for that environment so that it can be populated with the current data. This can take up to several days, and the Portal will be down with a maintenance error (503) during that period.
Updated at July 27, 2024