The Digital Preservation Service (DPS) for Research Data ensures accessibility and preservation of digital research content. You can read more about digital preservation on the digitalpreservation.fi site. The service is provided free of charge for higher education and other research organizations in Finland. Before ingesting data to the DPS, the organization must have a service contract for using the DPS and a preservation agreement for the data in question. You can read more about the service contract and preservation agreements on the page Becoming a partner organisation with the DPS.
What routes are available for ingesting the data?
This page presumes that data is stored in the IDA service and is described using the Qvain tool.
Other routes to ingest data to the DPS are described in Finnish on the page Vaihtoehtoreitti tutkimusaineiston pitkäaikaissäilytykseen.
Who is this guide for?
This guide is aimed at organisations that have signed a service contract for using the DPS and have an approved preservation agreement for their data (see above). The organisation decides internally which person performs each step in this guide. Typically, a researcher or an RDI agent has created the data, stored it in a service and documented it. Representatives of the organisation, such as the data support, can perform the steps in this guide after the data has been created and documented. The organisation can also decide that the researchers responsible for creating the data perform steps 1-2 and propose the data for digital preservation in the Management interface (step 3.1).
0. Validate your files
- It is recommended to check that your files are well-formed and valid before freezing the data, describing it, and ingesting it to the DPS. This can be performed using the online file validation service: https://validation.digitalpreservation.fi/#/en
- If the files are not well-formed and valid, you can ask the DPS support at pas-support@csc.fi for help in interpreting the error messages.
1. Upload the data to the IDA service and freeze it
If the data has not yet been stored in the IDA service, we recommend that a person from the organisation’s data support applies for IDA storage space. This can, for example, be the person who applied for the preservation agreement for the data. The person responsible for the IDA project (that is, the CSC project) manages access to the data and can add members to the project in the MyCSC portal. All members of the same CSC project can create descriptive metadata for the data using the Qvain tool.
- Apply for IDA storage space, or join an existing CSC project that has IDA access.
- Upload the files to the IDA service and freeze the data using the steps described in the IDA Quick Start Guide.
2. Create a new dataset using the Qvain tool
You can use the Qvain Quick Start Guide and the Qvain User Guide for guidance on how to create a dataset and its descriptive metadata. The steps required to ingest data to the DPS using the IDA service and the Qvain tool are described in detail below. Descriptive metadata can also be created without a graphical user interface by using the Metax API.
- Log in to the Qvain tool (qvain.fairdata.fi).
- Click ‘Create dataset’ to start creating a new dataset.
- Choose IDA as the data origin and select your project from the drop-down list. Add the files that are to be included in the dataset.
- Check and add metadata for files and folders by clicking on the pencil icon and selecting Add metadata.
- Mandatory metadata for files and folders are title (the file/folder name is used as default) and Use category. Metadata for a folder will recursively be used for all files within the folder.
- Check and modify the technical metadata for all CSV and other text files by clicking the pencil icon for the respective files and selecting Add Digital Preservation metadata.
- Mandatory metadata for CSV files are File format “text/csv” (otherwise the file might be identified as a plain text file), Encoding, and the CSV-specific separator and delimiter characters.
- Mandatory metadata for other text files is Encoding. If an encoding is not given, the validation and identification tools will guess the encoding used in the file. This might not always be accurate.
- Describe the dataset. Mandatory fields are Title, Description, License, Access Type, Issued Date, Keywords, and the Name and Organization of the actor responsible for the published dataset. You can read more about the mandatory metadata in the Qvain User Guide.
- Save and publish the dataset.
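The encoding and CSV values entered in Qvain can be checked locally before the files are uploaded and frozen. The following is a minimal Python sketch and not part of the Fairdata services: the function name `describe_csv`, the keys of the returned dictionary and the list of candidate delimiters are illustrative assumptions. It decodes the content with the encoding you intend to declare and lets the standard library's `csv.Sniffer` guess the field delimiter and quote character. Which Qvain field each value belongs in is left for you to confirm in the Qvain User Guide.

```python
import csv


def describe_csv(raw: bytes, encoding: str = "utf-8") -> dict:
    """Guess CSV parameters for content that should be in `encoding`.

    Raises UnicodeDecodeError if the bytes are not valid in `encoding`,
    and csv.Error if no delimiter can be determined.
    """
    # Decoding strictly makes a wrongly declared encoding fail loudly.
    text = raw.decode(encoding)

    # Candidate delimiters are an assumption covering common cases only.
    dialect = csv.Sniffer().sniff(text, delimiters=",;\t|")

    return {
        "encoding": encoding,
        "delimiter": dialect.delimiter,
        "quote_char": dialect.quotechar,
    }


if __name__ == "__main__":
    # Hypothetical sample content; replace with the bytes of your own file.
    sample = "name;value\r\nalpha;1\r\nbeta;2\r\n".encode("utf-8")
    print(describe_csv(sample, "utf-8"))
```

The sniffer only guesses from a sample, so treat its result as a hint and confirm it against the file itself.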
3. Ingest the data in the Management interface
To ingest data to digital preservation using the Management interface, the organisation needs a person with rights to propose the dataset for digital preservation and a person with rights to approve it. The person with proposal rights proposes the published dataset, and the person with approval rights approves the proposal. The same person can hold both rights, which simplifies the process because the step of waiting for approval is skipped. The roles and rights in the Management interface are described in Finnish on the page Hallintaliittymän rooleista ja niiden asettamisesta.
3.1 Proposing a dataset
- Log in to the Management interface (https://manage.fairdata.fi).
- Select the dataset from the list or find it using its title.
- Open the dataset by clicking on its title.
- Select the correct preservation agreement for the dataset: click Select in the Preservation Agreement section, choose the agreement and confirm the choice by clicking Change.
- Identify file formats and create technical metadata by clicking on the Identify Files button at the bottom of the page. This step starts a process running in the background that can take a while. You can safely leave the page and return to it at any time. When the file formats have been identified and the technical metadata generated, the state of the dataset is “Technical metadata generated”.
- When the dataset is in the state “Technical metadata generated”, you can view the generated metadata. If you have left the page, you must first select the dataset from the initial list by clicking on its title.
- The generated technical metadata can be viewed from the directory tree that is visible.
- Check that the generated technical metadata is correct and confirm this by clicking the I have checked the metadata checkbox.
- Add a rationale for why this dataset is proposed for digital preservation (this is not mandatory but can be helpful to the person who approves the dataset). The rationale is visible only to the person who approves the dataset, and it is not stored anywhere else.
- Propose the dataset for digital preservation by clicking on the Propose for preservation button.
- When you have proposed the dataset for digital preservation, all the files in the dataset will be checked and their well-formedness and validity will be confirmed. This process can also take some time. It is run in the background, so you can safely leave the page and return to the dataset later.
- When this is completed, the dataset is in the state “Metadata confirmed”.
- Notify the approver in your organisation that the dataset has been proposed for digital preservation (unless you have approval rights yourself).
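The dataset states named in this guide can be summarised as a small lookup that maps each state to the next action. This is only an illustrative Python sketch: the dictionary `NEXT_ACTION` and the function `next_action` are hypothetical names, the Management interface may show other states that are not listed here, and the state “In digital preservation” is only reached after approval in step 3.2.

```python
# States quoted in this guide; the Management interface may display others.
NEXT_ACTION = {
    "Technical metadata generated":
        "Check the generated metadata, then propose the dataset.",
    "Metadata confirmed":
        "Ask a person with approval rights to approve the dataset.",
    "In digital preservation":
        "Nothing more to do; the dataset has been ingested.",
}


def next_action(state: str) -> str:
    """Return the next step for a dataset state, or a fallback hint."""
    return NEXT_ACTION.get(
        state,
        "Unknown state: check the Management interface.",
    )


if __name__ == "__main__":
    print(next_action("Metadata confirmed"))
```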
3.2 Approving a dataset for digital preservation
- Log in to the Management interface (https://manage.fairdata.fi).
- Select the dataset from the list or find it using its title. A dataset that is to be approved should be in the state “Metadata confirmed”.
- Open the dataset by clicking on its title.
- The page displays the quota of the preservation agreement (in bytes) and the size of the proposed dataset. Confirm that the dataset fits in the available quota. Approve the dataset for digital preservation by clicking Accept. This will start the ingest process.
- The ingest process combines metadata and files, and creates a submission information package of the dataset that is sent to digital preservation. The process can take a long time, but it is run in the background. You can safely leave the page and return to it later.
- Check that the dataset is in the state “In digital preservation”.
- The dataset can be viewed in Etsin by clicking the “View dataset in Etsin” link in the Dataset tab of the Management interface. The public metadata and the DOI of the dataset are available in Etsin.
- The IDA version of the dataset can be accessed in Etsin by clicking on the “Click here to open the use copy” link.
- Congratulations! Your dataset is now in digital preservation!
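The quota check made before approval in step 3.2 is a plain comparison of byte counts. A minimal Python sketch follows, assuming you copy the figures from the Management interface by hand: the function names and the `used_bytes` parameter are hypothetical, and whether the interface reports the quota already in use separately is an assumption.

```python
def fits_in_quota(dataset_bytes: int, quota_bytes: int, used_bytes: int = 0) -> bool:
    """Return True if the dataset fits in the quota that is still free.

    `used_bytes` is a hypothetical parameter for quota already consumed
    by earlier datasets under the same preservation agreement.
    """
    if min(dataset_bytes, quota_bytes, used_bytes) < 0:
        raise ValueError("byte counts must not be negative")
    return dataset_bytes <= quota_bytes - used_bytes


def as_gib(num_bytes: int) -> str:
    """Format a byte count in gibibytes (1 GiB = 1024**3 bytes)."""
    return f"{num_bytes / 1024**3:.2f} GiB"


if __name__ == "__main__":
    # Hypothetical figures for illustration only.
    quota = 10 * 1024**3
    used = 6 * 1024**3
    dataset = 5 * 1024**3
    print(as_gib(dataset), "fits:", fits_in_quota(dataset, quota, used))
```

Converting the byte values to a larger unit, as `as_gib` does, makes it easier to spot a dataset that is an order of magnitude larger than expected.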
4. Fetch the dataset from digital preservation
Fetching the dataset from digital preservation can take a while as a dissemination information package is created from the preserved data. We recommend using the IDA version of the data for frequent access. In order to be able to fetch a dataset from digital preservation, a user must have fetch rights for the dataset. The roles and rights in the Management interface are described in Finnish on the page Hallintaliittymän rooleista ja niiden asettamisesta.
- Log in to the Management interface (https://manage.fairdata.fi).
- Select the dataset from the list, or find it using its title. A dataset that is to be fetched should be in the state “In digital preservation”.
- Open the dataset by clicking on its title.
- Click on the Create DIP button to start the dissemination process. The process can take a long time, but it is run in the background. You can safely leave the page and return to it later.
- When the data can be fetched, that is, when the dissemination information package has been created, the label of the Create DIP button changes to DIP. Clicking on it downloads the disseminated dataset as a ZIP file.
- The dissemination information package is available for 10 days, after which it expires. After that, a new dissemination information package must be created in order to fetch the dataset.
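To plan a download, the 10-day availability window can be turned into a concrete date. The Python sketch below is illustrative only: the names `DIP_LIFETIME`, `dip_expiry` and `dip_is_available` are hypothetical, and it assumes that the 10 days are counted from the moment the dissemination information package was created, which this guide does not state explicitly.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Availability period stated in this guide.
DIP_LIFETIME = timedelta(days=10)


def dip_expiry(created_at: datetime) -> datetime:
    """Return the assumed expiry time of a package created at `created_at`."""
    return created_at + DIP_LIFETIME


def dip_is_available(created_at: datetime, now: Optional[datetime] = None) -> bool:
    """Return True while the package is assumed to be downloadable."""
    if now is None:
        now = datetime.now(timezone.utc)
    return now < dip_expiry(created_at)


if __name__ == "__main__":
    # Hypothetical creation time; use timezone-aware datetimes throughout.
    created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    print("Fetch before:", dip_expiry(created).isoformat())
```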
The DPS support
Any issues or questions about any step of the process can be addressed to the support address pas-support@csc.fi.