How to manage Master data
Introduction
This page presents an overview of what Master Data is, along with some suggestions on how best to maintain and quality assure your Master Data.
Although most examples focus on External Organisations, External Persons, Journals and Events, the methods can be used on other Master Data types too.
The target audience here is primarily:
- Administrator of External organisations
- Administrator of External persons
- Administrator of Journals
- Administrator of Events
The page can also be a useful reference for System/Technical Administrators.
Master Data: Definition
Master Data is a group of content types which represent real-world entities such as Users, Persons, Organisational units and Journals. Master Data records are reused often and linked to many other records. For example, a Journal record is linked with all records for articles published in it. Master Data is managed in a separate space in Pure (the Master Data tab), and can be modified only by certain users (restricted access), which helps ensure its consistency. Below you can see some Master Data types in Pure:
Quality Assuring Master Data: Why
Good quality Master Data is the foundation of:
The picture shows how a single Journal can be linked to several Research Outputs, facilitating content reuse.
Master Data: Origin
Master Data can be generated:
- Manually
By researchers during submission of content. When a specific item cannot be found, you can create a new item (the 'Create new' option in the lookup box).
Example: When adding a Journal to a Research Output, a lookup box appears to enable easy search.
- By users with selected roles that have appropriate editing rights (Roles)
- Via Import Process using Bulk import:
  - XML import/sync
  - Master List
- Synchronized jobs, e.g. Journals from Scopus, External Organizations from SciVal
- Legacy data gathered from legacy systems
- Import sources during content import
Example: Creating new content when no matching content is found
Managing: General Principles
- Delete unused records
When you delete unused records, you get a better overview of your data. Unused records typically appear when relations to them are moved. For example, if a Research Output is initially linked to Journal A, and this is changed to Journal B, Journal A may become unused (if it was linked only to that one Research Output). Such 'relationless' records can be found using a filter:
The next step is to delete the unused content. This can be done using the bulk edit feature. See this video for an example:
When External Organizations are imported into Pure via a job (Scival External Organisation Synchronisation), they are automatically placed in the workflow step APPROVED, which prevents them from being deleted. To identify the External Organizations that can be deleted, filter by 'Unused Content' AND 'Workflow > For Approval'. The returned results can be deleted using the bulk edit feature. Watch the video for more details.
- Merge duplicates (NEW IN 5.15)
Having duplicates makes it difficult to ensure all relations to a specific content type are linked with the same representation of it. If there are two journal records representing "The Lancet", it is almost certain some Research Outputs will be linked to one record, and some to the other. This will make it difficult to generate valid reports, etc. Similarly, if multiple variations of an External Organisation exist, some Persons may have links to all of those existing variants. Merging duplicates will strengthen overall data quality.
- Check / enrich metadata
It is important to regularly check and enrich the metadata of your Master Data. See below for examples.
- Optional: Allow only approved data to be selected from lookup boxes
This is very useful if you want Researchers to select only Master Data that has been approved, and not add potentially duplicate/incomplete data to the set.
This also improves the quality of data import. When Research Outputs are imported, matching on titles/names of Journals/External Organisations/Persons is only attempted on approved content. This will prevent the import of poor quality content.
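The first two principles above (finding unused records and spotting duplicate candidates) are handled in Pure through filters and the merge feature. If you also want to review an exported list of records outside Pure, the idea can be sketched as follows. This is a minimal illustration, not Pure functionality: the record layout (`id`, `title`, `relation_count`) and all function names are assumptions made for the example.

```python
import re
import unicodedata
from collections import defaultdict


def normalise_title(title: str) -> str:
    """Build a comparison key: strip accents, case, punctuation and a leading 'the'."""
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return re.sub(r"^the ", "", text)


def unused_records(records):
    """Return records that no other content is linked to (hypothetical 'relation_count' field)."""
    return [r for r in records if r["relation_count"] == 0]


def duplicate_candidates(records):
    """Group record ids whose normalised titles collide; each group still needs human review."""
    groups = defaultdict(list)
    for r in records:
        groups[normalise_title(r["title"])].append(r["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical exported journal records
journals = [
    {"id": "j1", "title": "The Lancet", "relation_count": 12},
    {"id": "j2", "title": "Lancet", "relation_count": 3},
    {"id": "j3", "title": "Nature", "relation_count": 0},
]

print([r["id"] for r in unused_records(journals)])  # ['j3']
print(duplicate_candidates(journals))               # [['j1', 'j2']]
```

Matching on a normalised title only produces candidates. Two distinct journals can share a title, so confirm each group (for example by comparing ISSNs) before merging.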
Examples of metadata enrichment
This page will focus on examples for External Organisations, External Persons, Journals and Events.
It does not include Internal Persons and Organisational Units, which are most often maintained via synchronisations or Master Lists.
External Organizations
Enrich metadata: Data Quality - Country information
External Organizations > Data Improvement > Data Quality
From here, you can click the Google Maps or Bing button to look up the address information and update the organization's record in Pure.
This information is used in the collaboration map/matrix.
External Persons
Enrich metadata: External Organization Associated
To further increase data quality, it is recommended to set the relation External Organization ↔ External Persons, which helps track the collaboration between your institution and external organizations. To identify the External Persons that have been added to your Pure and have no associated External Organization, go to Master Data > External Persons > Add Filter 'External Organization Associated' and toggle to view those without.
Journals
Enrich metadata - Check ISSNs:
Use the ISSN filter to find the Journals that do not have ISSN information, and add the missing ISSNs to increase metadata quality.
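When adding missing ISSNs, it can help to check that each value is well formed before saving it. An ISSN has eight characters (usually written NNNN-NNNC), and the last one is a modulus-11 check character. The sketch below validates that check character and produces a simple report. It assumes journal records exported as dictionaries with hypothetical `id` and `issn` keys; it is not part of Pure.

```python
import re

ISSN_PATTERN = re.compile(r"^(\d{4})-?(\d{3})([\dXx])$")


def is_valid_issn(issn: str) -> bool:
    """Check the NNNN-NNNC format and the modulus-11 check character."""
    match = ISSN_PATTERN.match(issn.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    # Weights run from 8 down to 2 over the first seven digits.
    total = sum(int(d) * w for d, w in zip(digits, range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return match.group(3).upper() == expected


def issn_report(journals):
    """Return (ids with no ISSN, ids whose ISSN fails validation)."""
    missing = [j["id"] for j in journals if not j.get("issn")]
    invalid = [j["id"] for j in journals
               if j.get("issn") and not is_valid_issn(j["issn"])]
    return missing, invalid


# Hypothetical exported journal records
sample = [
    {"id": "j1", "issn": "0140-6736"},  # valid
    {"id": "j2", "issn": "0140-6737"},  # wrong check digit
    {"id": "j3", "issn": ""},           # missing
]

print(issn_report(sample))  # (['j3'], ['j2'])
```

A passing check only shows that the number is internally consistent. It does not confirm that the ISSN belongs to the journal in question.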
Enrich metadata - DOAJ Indexed:
Use the 'DOAJ status' filter to identify the Journals that have been indexed in the Directory of Open Access Journals. Note: The "DOAJ Indexing" job must be enabled before this can be checked.
Enrich metadata - Publisher:
Use the 'Publisher associated' filter to find the Journals that do not have any Publisher associated with them. Add the relation between the journal and the publisher.
We recommend going through the full list of filters available for a content type to see what is most appropriate for your needs.
Updated at July 27, 2024