Data management guide: Data preservation

How should the data be preserved and stored?

Take the following issues into account when planning and storing data:

  • Where you store the data and how you process it
  • Permissions (during and after the project)
  • Backups
  • Version control
  • Naming conventions (a dataset should not be named just "dataset")
    • Name the files so that there are no errors or confusion in interpreting them
  • How to build a folder structure and/or database
    • Organise files in clearly named folders so that there are no errors or confusion in interpreting them
  • What file formats you use
    • Use common file formats that can be opened without a specific program (for example, without Office tools)
  • Which material or part of it should be destroyed during or after the project

Additionally, make sure that sharing research data between different actors is as easy as possible. The goal is to keep the research data findable and usable even after the project has ended, if possible.

Data security is an essential part of storing data. It requires, among other things, compliance with the organisation's data security guidelines and technical measures that ensure the data is stored reliably.

Organising and naming files

Systematic organisation and documentation of the data make it easier to find and use the project data during the project and create the conditions for the possible further use of the data.

  • Plan and agree on naming conventions with others before compiling the research data.
  • Name files consistently and clearly.
  • Names should not be too long or too short.
  • A good filename is logically structured and describes the content (e.g. name of the project, name of the material, author, date YYYY-MM-DD, version number).
  • Use abbreviations if necessary. The meaning of abbreviations should be documented so that you know what they mean even after a long pause in the project.
  • Avoid special characters; use only numbers, letters, hyphens (-), and underscores (_) in file names.
  • Do not use spaces between the words.

To make sure the file names are understandable, ask a colleague whether they can tell what a file contains based on the name alone.
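The naming rules above can be sketched in Python. This is only an illustration under stated assumptions: the function names `build_filename` and `is_valid_filename`, the order of the name parts, the underscore separator, and the two-digit version number are hypothetical choices, not conventions prescribed by this guide.

```python
import re
from datetime import date

# Only letters, digits, hyphens and underscores are accepted in a name part.
_ALLOWED = re.compile(r"[A-Za-z0-9_-]+")


def build_filename(project, material, author, day, version, extension):
    """Join the parts into project_material_author_YYYY-MM-DD_vNN.extension.

    The part order and the underscore separator are assumptions made for
    this sketch; agree on your own convention with the project group.
    """
    parts = [project, material, author, day.isoformat(), f"v{version:02d}"]
    for part in parts:
        if not _ALLOWED.fullmatch(part):
            raise ValueError(f"invalid characters in file name part: {part!r}")
    return "_".join(parts) + "." + extension


def is_valid_filename(name):
    """Return True if the name has an extension and no spaces or special characters."""
    stem, dot, extension = name.rpartition(".")
    return (
        bool(dot)
        and bool(_ALLOWED.fullmatch(stem))
        and bool(re.fullmatch(r"[A-Za-z0-9]+", extension))
    )
```

With the hypothetical values `build_filename("greenlab", "survey", "jdoe", date(2024, 3, 5), 2, "csv")`, the result is `greenlab_survey_jdoe_2024-03-05_v02.csv`. A name such as `final version (2).csv` fails `is_valid_filename` because it contains spaces and parentheses.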
 

File format selection

  • It is a good idea to choose the file format early on to avoid unnecessary format changes.
    • Converting data from one format to another rarely succeeds completely, and information may be lost, e.g. text formatting, data content of tables, image resolution or sound quality.
  • The file format should be one that can be used for as long as possible.
  • Common file formats supported by most software:

Folder structure in a project

A well-designed and clearly named folder structure makes it easier to organise and find information. An intuitive structure provides an overview of the project content and ensures that current and future project group members understand what information is available.

Start planning the folder structure early in the project so that the structure will support the needs of the project in the best way possible. Using a logical folder structure across all projects improves the efficiency and consistency of data management.

If necessary, you can use the examples below as a model and choose the subfolders that suit your own project. Keep in mind that you can also create user-specific subfolders in addition to public folders. NOTE: If you have several projects running at the same time, it is good to name the parent folder with the name of the project instead of just the word project.

Folder structure examples (and one cautionary example)

project/
  raw_data/
  documentation/
  processed_data/
    subfolder_1/
    subfolder_2/
  outputs/

project/
  code/                 code needed to go from input files to final results
  data/                 primary data
    raw/                raw data, never edit! 
    meta/
  doc/                  documentation of the study
  intermediate/         output files from intermediate analysis steps
  logs/                 logs from the different analysis steps
  notebooks/            notebooks that document your day-to-day work
  results/              output from workflows and analyses
    figures/
    reports/
    tables/
  scratch/              temporary files that can safely be deleted
  README.txt            file and folder description 
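If you want to create the second example structure automatically, a short script can do it. The sketch below uses only the Python standard library; the function name `create_project_skeleton` and the README placeholder text are hypothetical, and the folder list simply mirrors the example above.

```python
from pathlib import Path

# Subfolders taken from the example structure above.
SUBFOLDERS = [
    "code",
    "data/raw",
    "data/meta",
    "doc",
    "intermediate",
    "logs",
    "notebooks",
    "results/figures",
    "results/reports",
    "results/tables",
    "scratch",
]


def create_project_skeleton(root):
    """Create the example folder structure under root and return the root path.

    Existing folders and an existing README.txt are left untouched, so the
    function can be run again safely.
    """
    root = Path(root)
    for sub in SUBFOLDERS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    readme = root / "README.txt"
    if not readme.exists():
        readme.write_text(
            "Describe the files and folders of this project here.\n",
            encoding="utf-8",
        )
    return root
```

Calling `create_project_skeleton("greenlab_2024")` (a hypothetical project name) would create the folders in the current working directory. Edit `SUBFOLDERS` to keep only the subfolders that suit your own project.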

Cautionary example

[Image: comic "Story told in file names". Source: PHD Comics, Copyright Jorge Cham.]

Secure research data storage, sharing and open data

There are various options available for the staff of Turku University of Applied Sciences for storing, opening, and sharing research data. Some of these are services maintained and supported by Turku UAS, while others are recommended services maintained and supported by CSC (IT Center for Science, Finland).

When choosing a suitable service, consider for example what kind of data you are storing, how much there is of it, and who needs to be able to access and process it.

Data security

Taking care of data security is always important, but it is especially crucial when dealing with sensitive data. Data security is ensured when your actions prevent the accidental or intentional destruction, damage, alteration, or theft of the data, and when you manage access and user rights appropriately.

Using backups is part of data security. Keep more than one copy of the data in different locations so you will not lose all your work if the file you are working on is accidentally or intentionally destroyed.
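The idea of keeping several copies in different locations can be sketched as follows. This is a minimal illustration, not a Turku UAS tool: the function names `sha256_of` and `back_up` are hypothetical, and the script does not replace the backups provided by the organisation's storage services. It copies one file to each backup folder and compares SHA-256 checksums to confirm that every copy matches the original.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up(source, backup_dirs):
    """Copy source into every folder in backup_dirs and verify each copy.

    Raises OSError if a copy does not match the original checksum.
    Returns the list of paths of the verified copies.
    """
    source = Path(source)
    expected = sha256_of(source)
    copies = []
    for directory in backup_dirs:
        directory = Path(directory)
        directory.mkdir(parents=True, exist_ok=True)
        target = directory / source.name
        shutil.copy2(source, target)  # copy2 also preserves timestamps
        if sha256_of(target) != expected:
            raise OSError(f"backup copy differs from the original: {target}")
        copies.append(target)
    return copies
```

For example, `back_up("results/tables/summary.csv", ["D:/backup", "E:/backup"])` would place a verified copy in each folder; the paths here are hypothetical. For sensitive data, use only backup locations that are approved for that type of data.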

More information: FSD: Data Management Guidelines – Physical data storage.

CSC Services for Higher Education Institutions

CSC offers a wide range of digital solutions to support research, development, and teaching activities in higher education institutions. To use these services, you have to register with the My CSC service using your HAKA credentials. Once registered, you can create a project and apply for access rights to suitable digital solutions provided by CSC. You can learn more about how to get started with My CSC by reading the instructions provided by CSC.

CSC provides solutions for different purposes and for various types of data. All CSC services are suitable for processing personal data, but sensitive personal data may only be processed in services specifically designed for handling that type of data. You can find CSC’s storage services below under the tabs CSC data services and CSC services for sensitive data.

If you have questions about selecting or implementing a suitable digital solution, you can contact Turku UAS Data Support at datasupport@turkuamk.fi. CSC also provides support for its services at servicedesk@csc.fi.

Various storage solutions

The tabs below list key storage locations for research data, categorised by different use cases. The services are organised into separate tables based on whether they can be used to process sensitive personal data or sensitive/confidential data. Some of the services are provided by Turku University of Applied Sciences, while others are provided by CSC for all higher education institutions.

Turku UAS storage services for staff

| Service | Purpose of use | Sharing | Backup | Version control |
|---|---|---|---|---|
| Home (Z) | Personal storage space, available only on the Turku UAS network | No | Yes | No |
| Group (R) | Storage space for research groups and projects, available only on the Turku UAS network | With limitations | Yes | No |
| B2DROP | A cloud-based storage space primarily intended for storing data generated and processed in RDI projects. Log in with Haka credentials. To request additional storage space, contact datasupport@turkuamk.fi. | Yes | No | Yes |
| Teams (SharePoint) | A cloud-based storage space for groups, primarily intended for internal group communication and working on shared documents. | Yes | No | Yes |
| OneDrive | Personal storage space primarily intended for storing, working on, and sharing documents. | Yes | No | Yes |

CSC's services for accessing, storing and sharing data

| Service | Purpose of use | More information |
|---|---|---|
| Fairdata IDA | A general storage service for various research materials. The service supports the opening of research materials, but closed research materials can also be stored in the service after the project has ended. | Fairdata IDA |
| Fairdata Qvain | A tool for describing research data. The service can be used to describe data stored in the IDA service, but it can also be used to describe data stored elsewhere. The metadata of the described research data can be viewed in the Fairdata Etsin service. | Qvain |
| Fairdata PAS | A service for long-term storage of research materials for tens and even hundreds of years. The suitability of the material for storage in the service is assessed before a decision is made on long-term storage. | PAS |
| cPouta | A service that allows you to use virtual machines, storage and high-performance computing for various needs. Enables, for example, the development of services and platforms as part of RDI projects. Infrastructure-as-a-service type storage and processing/computing environment. | cPouta |
| Funet Filesender | A browser-based data sharing service that can be used to transfer files up to 300 gigabytes in size. | Funet Filesender |

CSC's services for storing and sharing sensitive data

| Service | Purpose of use | More information |
|---|---|---|
| SD Services | A secure service package for storing, sharing and analyzing sensitive data among RDI project members. Suitable for use while the research is active. | SD Services |
| ePouta | A solution that can be connected to the university's internal network, providing the project with the ability to use virtual machines and storage for sensitive data. An Infrastructure as a Service-type storage and processing/computing environment. | ePouta |

About the guide

This guide covers the Turku UAS instructions on data management.

Usage rights of the guide

This resource is licensed under a Creative Commons Attribution 4.0 International license. The license does not apply to photos or videos unless otherwise stated.