File storage
Configuration of Pure file stores. Pure can be configured to store files on the local file system and in external repositories.
Creating a store
Storage configurations for different types of repositories can be added, and the type of content to synchronise can be defined. Additional filters can be specified to further filter out content.
Conditions for storage
The conditions for storage section in the store configuration is used to specify what content the store should hold.
The following filters are available:
- Visibility (require that the files or content are publicly visible)
- Workflow status (for example, require that the publication is validated)
- File embargo (only include files whose embargo date has passed)
- Publication status classification (for example, In press or Published)
- Type classification (for example, only synchronise articles and patents)
Only content that matches the type and filters will be synchronised to the storage.
The files and metadata for content will be stored in all stores where the conditions match.
If the conditions between stores overlap, a copy of the files and metadata will be transferred to each store.
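The matching rule above can be illustrated with a minimal sketch. The class names, field names, and status values below are hypothetical and are not part of Pure's configuration or API. The sketch also assumes that an empty filter means "no restriction" and that an embargo ending on or before today counts as passed.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class ContentItem:
    """Hypothetical, simplified view of a piece of content with files."""
    content_type: str                    # e.g. "article", "patent"
    publicly_visible: bool
    workflow_status: str                 # e.g. "validated"
    publication_status: str              # e.g. "published", "in_press"
    embargo_end: Optional[date] = None   # None means no embargo


@dataclass
class StoreConditions:
    """Hypothetical conditions for storage; an empty set means 'no filter'."""
    name: str
    content_types: Set[str] = field(default_factory=set)
    require_public: bool = False
    workflow_statuses: Set[str] = field(default_factory=set)
    publication_statuses: Set[str] = field(default_factory=set)
    require_embargo_passed: bool = False

    def matches(self, item: ContentItem, today: date) -> bool:
        # Every configured filter must pass for the content to match.
        if self.content_types and item.content_type not in self.content_types:
            return False
        if self.require_public and not item.publicly_visible:
            return False
        if self.workflow_statuses and item.workflow_status not in self.workflow_statuses:
            return False
        if self.publication_statuses and item.publication_status not in self.publication_statuses:
            return False
        if (self.require_embargo_passed and item.embargo_end is not None
                and item.embargo_end > today):
            return False
        return True


def matching_stores(item: ContentItem, stores: List[StoreConditions], today: date) -> List[str]:
    """Return the names of all stores whose conditions match the content."""
    return [s.name for s in stores if s.matches(item, today)]
```

With two stores whose conditions overlap, `matching_stores` returns both names, which corresponds to a copy of the files and metadata being held in each store.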
Re-configuring a store
If an existing store is re-configured, for example by changing the visibility filter, files will be added to or removed from the store accordingly. Existing files are moved according to the new rules on the next full run of the preserved content update job (each night), and modified content is moved when the job performs one of its incremental runs.
Default storage
All files that do not match the conditions for storage on any store are stored in the default storage. By default the "Local file storage" is configured as the default store, but another store can be configured as the default store.
Because the default store needs to be able to store files for all content types, only "Local file system" and "Amazon S3" stores can be configured as the default store.
Changing the default store can result in a large number of files being moved.
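The restriction on which store types may act as the default store could be checked as in the sketch below. The two type names come from the text above; the function and the `stores` mapping are hypothetical and only illustrate the rule.

```python
from typing import Dict

# Store types that can hold files for all content types, per the text above.
DEFAULT_CAPABLE_TYPES = frozenset({"Local file system", "Amazon S3"})


def choose_default_store(stores: Dict[str, str], name: str) -> str:
    """Return `name` if that store may be the default, else raise ValueError.

    `stores` is a hypothetical mapping of store name to store type.
    """
    if name not in stores:
        raise ValueError(f"unknown store: {name}")
    if stores[name] not in DEFAULT_CAPABLE_TYPES:
        raise ValueError(f"{stores[name]} stores cannot be the default store")
    return name
```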
Temporary storage
When new files are uploaded to Pure, they are first stored in temporary storage. Later the preserved content update job moves the files to other stores according to the conditions for storage.
Testing configuration
The test button validates the configuration and attempts to contact the server to verify the URLs, credentials, and other configuration options, to the extent that the repository API allows.
Analyse impact
The Analyse impact function displays an estimate of the changes the current configuration will have on content stored in the storage. Modifying the conditions for storage can cause content to be added to or deleted from the store, and the Analyse impact dialogue will display the estimated number of additions, updates, and deletions.
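One way to picture such an estimate is as a comparison between the content currently in the store and the content that would match the new conditions. The sketch below is an illustration only, not how Pure computes its figures. In particular, it assumes that an "update" is content that stays in the store but has pending modifications, which the text above does not specify.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class ImpactEstimate:
    additions: int
    updates: int
    deletions: int


def analyse_impact(currently_stored: Set[str],
                   would_match: Set[str],
                   modified: Set[str]) -> ImpactEstimate:
    """Estimate changes from sets of (hypothetical) content identifiers."""
    additions = would_match - currently_stored      # newly matching content
    deletions = currently_stored - would_match      # content that no longer matches
    # Assumption: content that remains in the store and was modified is an update.
    updates = (currently_stored & would_match) & modified
    return ImpactEstimate(len(additions), len(updates), len(deletions))
```

For example, if a store holds items `a`, `b`, and `c`, the new conditions match `b`, `c`, and `d`, and only `c` has been modified, the estimate is one addition, one update, and one deletion.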
Preserved content update job
The preserved content update job is responsible for moving files and metadata between stores.
The job performs the following operations:
- Scan all content that can contain files:
  - Evaluate the conditions for storage for each store to determine if the metadata and files for that piece of content should be transferred to the store.
    - Transfer files and metadata to the store if conditions match.
    - Delete files and metadata if the conditions no longer match.
    - Delete any files that are no longer attached to the content.
  - If no conditions for any store match, transfer the files on the content to the default store.
- Finally delete all files and metadata for content that has been deleted in Pure.
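The operations listed above can be sketched as a planning function for a single piece of content. This is a simplified illustration, not Pure's implementation: the function name, its parameters, and the action tuples are hypothetical. It tracks only files (not metadata), treats the default store as separate from the stores with conditions, and assumes that a file already present in a store does not need to be transferred again.

```python
from typing import Dict, List, Set, Tuple

Action = Tuple[str, str, str]  # (operation, store name, file name)


def plan_content_sync(attached_files: Set[str],
                      store_matches: Dict[str, bool],
                      stored_files: Dict[str, Set[str]],
                      default_store: str,
                      content_deleted: bool = False) -> List[Action]:
    """Plan transfers and deletions for one piece of content."""
    if content_deleted:
        # Content deleted in Pure: nothing should remain in any store.
        targets: Set[str] = set()
    else:
        targets = {name for name, ok in store_matches.items() if ok}
        if not targets:
            # No store's conditions match: fall back to the default store.
            targets = {default_store}

    actions: List[Action] = []
    all_stores = set(store_matches) | set(stored_files) | {default_store}
    for store in sorted(all_stores):
        present = stored_files.get(store, set())
        wanted = attached_files if store in targets else set()
        for name in sorted(wanted - present):
            actions.append(("transfer", store, name))
        # Covers both "conditions no longer match" and
        # "file no longer attached to the content".
        for name in sorted(present - wanted):
            actions.append(("delete", store, name))
    return actions
```

Returning a plan instead of performing the operations keeps the sketch easy to test and mirrors the idea that failed operations can be retried on a later run.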
The job has two modes of operation: incremental and full scan. Under normal scheduling the job runs a full scan every 24 hours, and an incremental scan whenever it runs less than 24 hours after the last full scan. Because incremental scans are relatively lightweight, the job can be scheduled to run often to keep the stores in sync.
Full scan
Every 24 hours the job performs a full scan of all content in Pure that can contain files. For each piece of content the job evaluates the conditions for storage for each store to make sure the files and content are stored in the correct stores. It also retries any file operations that may have failed previously.
Incremental
An incremental run of the job only processes content that has been modified since the last run of the job.
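The choice between the two modes, and the content each mode covers, can be sketched as follows. The function names and the `modified_at` mapping are hypothetical. The sketch assumes the job records when its last full scan and its last run took place, and that a first run with no recorded history behaves like a full scan.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional

FULL_SCAN_INTERVAL = timedelta(hours=24)


def choose_mode(now: datetime, last_full_scan: Optional[datetime]) -> str:
    """Return "full" if 24 hours have passed since the last full scan."""
    if last_full_scan is None or now - last_full_scan >= FULL_SCAN_INTERVAL:
        return "full"
    return "incremental"


def content_to_process(mode: str,
                       modified_at: Dict[str, datetime],
                       last_run: Optional[datetime]) -> List[str]:
    """Select content identifiers for this run.

    A full scan covers everything; an incremental run covers only content
    modified since the last run of the job.
    """
    if mode == "full" or last_run is None:
        return sorted(modified_at)
    return sorted(cid for cid, ts in modified_at.items() if ts > last_run)
```

Because an incremental run only filters on modification time, it stays cheap even when the total amount of content is large, which is why frequent scheduling is practical.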
Disabling file synchronisation
For testing purposes a storage can be set to read-only mode by switching off “Enable content writes to this storage”. The “PreservedContentUpdate” job will skip synchronisation of all content that would otherwise be updated, deleted or created in the storage.
All file synchronisation can also be turned off by starting Pure with the -DdisableFileSynchronization=true command-line option. This prevents the content update job from running, and allows administrators a chance to validate the storage configuration before any synchronisation takes place.
Deleting a store
Only empty storages can be deleted. To empty a storage, remove all conditions for storage and let the content update job move any stored content away from the store.
Storage specific options
Local file system
For more information about storing files in the local file system with Pure, see the Local file system specific documentation.
DSpace
For more information about using DSpace with Pure, see the DSpace specific documentation.
EPrints
For more information about using EPrints with Pure, see the EPrints specific documentation.
Amazon S3
For more information about using Amazon S3 with Pure, see the Amazon S3 specific documentation.
Updated at July 27, 2024