Data management guide: Data processing

What does processing data mean?

The processing of data covers the collection and analysis of data. From the perspective of the research life cycle, you mainly process data during the active phase of the research. You may also need to process the data further at the end of the study, if, for example, you are going to open the data or use it for further research.

Research methods are needed in the active phase of research to produce results. During this phase you can use several different research methods, for example one method for collecting data and another for analysing it.

Data management in the active phase of the project

In the active phase of the research, think about how you will document the progress of the research and the research methods used, and how you could share them openly. Good documentation helps show that the research has been carried out responsibly. It also makes it easier to explain the results and other details of the research, for example to new project employees, and to reuse the data after the research has ended, for example when preparing another project or further research.

Documentation is a prerequisite for reuse. Clear documentation is particularly valuable when the research results are going to be verified by repeating the experiments carried out in the study. You can document your work with notes, photographs, code books, and field and lab notebooks, among other means. Record information on:

  • How and when the data was collected
  • How the data has been processed, by whom and when
  • Instruments and equipment used
  • Calibration of equipment and other technical metadata
  • Variables, key vocabulary, measuring units and abbreviations
  • Code and software used
  • Version control
  • Quality assurance processes
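
Several of the items listed above (how the data has been processed, by whom and when, and with which software) can be recorded in a simple machine-readable log kept next to the data. The following is a minimal sketch, not a prescribed practice. The file format (JSON Lines), the field names and the function names `log_processing_step` and `read_processing_log` are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_processing_step(log_path, who, action, software=None, notes=None):
    """Append one processing step to a JSON Lines log and return the entry."""
    entry = {
        # UTC timestamp, so entries from different machines stay comparable.
        "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
        "who": who,
        "action": action,
        "software": software,
        "notes": notes,
    }
    # Append mode keeps earlier entries untouched.
    with Path(log_path).open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def read_processing_log(log_path):
    """Return all logged steps in the order they were recorded."""
    with Path(log_path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

A call such as `log_processing_step("processing_log.jsonl", "A. Researcher", "Removed duplicate rows", software="Python 3.11")` adds one line to the log. A lab notebook or a version control history can serve the same purpose if that is the practice in your discipline.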

Follow the documentation practices of your own discipline. Create a README file alongside the research data that describes the contents of the data and how it is documented. See the Readme files section below for more information.

Readme files

Research data must always be accompanied by a README file describing and documenting its content. In the file, briefly describe what the data is; you can later build the metadata for the data from this description. Structure the README file logically and include all the necessary information, so that a reader does not need any other knowledge of the discipline or project to understand its content.

Below are two example templates, one for a project-specific README file and one for a data-specific README file.

PROJECT INFORMATION
  Study/project title*:

  Description*: <provide a short description of the study/project>

  Project period:

  Project funder:
  
  Funding decision:

  Organisations:

  Principal Investigator*:

  Storage period*: <provide the end of the project data storage period>

  Link to data management plan:

FILES & FOLDERS
  Folder structure: <default: Code, Data, Doc, Temp. Add user folders if necessary>

  File naming conventions: <e.g. YYYYMMDD_datasetname_vX.X_revXX>

  File formats: <provide a list of all file formats present in this project>

THIS README FILE
  Date: <YYYY-MM-DD>

  Created by: <name>
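
The FILES & FOLDERS part of the project template can also be set up with a short script. The sketch below creates the default folders named in the template and builds file names that follow the `YYYYMMDD_datasetname_vX.X_revXX` pattern. The regular expression is only one possible reading of that pattern (for instance, it assumes a two-digit revision number and a dataset name without underscores or spaces). The function names are hypothetical.

```python
import re
from datetime import date
from pathlib import Path

# Default folders named in the project template.
DEFAULT_FOLDERS = ("Code", "Data", "Doc", "Temp")

# One possible reading of YYYYMMDD_datasetname_vX.X_revXX (an assumption).
NAME_PATTERN = re.compile(r"^(\d{8})_([A-Za-z0-9-]+)_v(\d+)\.(\d+)_rev(\d{2})$")


def create_folders(root, extra=()):
    """Create the default folders (plus any user folders) under root."""
    root = Path(root)
    created = []
    for name in (*DEFAULT_FOLDERS, *extra):
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run repeatedly
        created.append(folder)
    return created


def build_name(day, dataset, major, minor, revision):
    """Build a file stem such as 20240315_soilsamples_v1.0_rev02."""
    stem = f"{day:%Y%m%d}_{dataset}_v{major}.{minor}_rev{revision:02d}"
    if not NAME_PATTERN.match(stem):
        raise ValueError(f"Name does not follow the convention: {stem}")
    return stem


def follows_convention(stem):
    """Return True if a file stem (name without extension) matches the pattern."""
    return NAME_PATTERN.match(stem) is not None
```

For example, `build_name(date(2024, 3, 15), "soilsamples", 1, 0, 2)` returns `20240315_soilsamples_v1.0_rev02`. If your project uses a different convention, adjust the pattern and record the convention in the README file.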

GENERAL INFORMATION
  Dataset title*:

  Description*: <provide description of the dataset, steps used, content and purpose>
  
  Licence and access*:

  Publication date*:

  Key words*:

  Authors*: <name and ORCID>

  Publisher*: <person or organization>

  Storage period*: <provide the end of the research data storage period>

  Persistent identifier: <if any>

FILES & FOLDERS
  Folder structure: <default: Code, Data, Doc, Scratch. Add user folders if necessary>

  File naming conventions: <e.g. YYYYMMDD_datasetname_vX.X_revXX>
  
  File formats: <provide a list of file formats present in this dataset>

DATA COLLECTION
  Institutional catalog ID (if applicable):

  Date of data collection: <provide single date, range, or approximate date: YYYY-MM-DD>

  Link to electronic lab book (codebook) where the following is described:

    Methods used for data collection (including references, documentation, links):
    
    Geographic location of collection (if applicable):

    Experimental & environmental conditions of collection (if applicable):

    Standards and calibration for data collection (if applicable):

    Uncertainty, precision and accuracy of measurements (if applicable):

    Known problems & caveats (sampling, blanks, etc.):

    Codes or symbols used to record missing data with description (if applicable):

    Link to data dictionary (if applicable):

THIS README FILE
  Date: <YYYY-MM-DD>

  Creator: <NAME>
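
Once a README file has been filled in from a template like the ones above, a short script can help spot fields that were left blank. The sketch below assumes that an asterisk before the colon marks a field that must be filled in, and that a value still enclosed in angle brackets is an unedited placeholder. Both are assumptions about how the templates are meant to be read, and the function name is hypothetical.

```python
import re

# Matches a line such as "  Dataset title*: My data":
# name = "Dataset title", value = "My data".
REQUIRED_FIELD = re.compile(r"^\s*(?P<name>[^:*]+)\*:\s*(?P<value>.*)$")


def missing_required_fields(readme_text):
    """Return names of starred fields that are empty or still hold a <placeholder>."""
    missing = []
    for line in readme_text.splitlines():
        match = REQUIRED_FIELD.match(line)
        if not match:
            continue  # headings, blank lines and fields without an asterisk
        value = match.group("value").strip()
        is_placeholder = value.startswith("<") and value.endswith(">")
        if not value or is_placeholder:
            missing.append(match.group("name").strip())
    return missing
```

Reading the file with `Path("README.txt").read_text(encoding="utf-8")` and passing the text to `missing_required_fields` returns a list of field names to complete. An empty list means every starred field has a value.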

What does it mean to open research methods?

The aim of opening research methods is to increase the quality, transparency, and impact of research. A research method can cover the different stages of the research, including the collection of research data, the documentation of the research environment and the analysis and conclusions of the research. The research method can be published, for example, on a peer-reviewed platform.

Generally, the same FAIR principles can be applied to open research methods as to open research data. The findability, accessibility, interoperability and reusability of research methods contribute to the openness and use of research data in new research or innovation projects. In addition, the openness of research methods can be supported, for example, by using an open licence.

Opening research methods requires managing them responsibly. Documentation plays a key role here, because it makes the relevant methodological steps traceable. Versioning and backups also need attention.
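
One way to check that a backup copy of method files (scripts, protocols, notes) still matches the original is to compare checksums. This guide does not prescribe a particular tool, so the following is only an illustrative sketch. The choice of SHA-256 and the function names are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder):
    """Map each file's path (relative to folder) to its checksum."""
    folder = Path(folder)
    return {
        path.relative_to(folder).as_posix(): sha256_of(path)
        for path in sorted(folder.rglob("*"))
        if path.is_file()
    }


def compare_manifests(original, backup):
    """Return files that are missing from, or differ in, the backup."""
    missing = sorted(set(original) - set(backup))
    changed = sorted(
        name for name in original
        if name in backup and original[name] != backup[name]
    )
    return {"missing": missing, "changed": changed}
```

Calling `compare_manifests(build_manifest("methods"), build_manifest("backup/methods"))` (folder names are examples) lists the files that need to be copied again. A version control system covers the versioning side and can be used alongside such a check.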

What are open research infrastructures?

Open research infrastructures support research activities in accordance with the principles of open science and research. A research infrastructure can consist of equipment, databases, materials and services that are essential for research. They enable research, development and innovation work, which is not necessarily limited to the field of higher education, but can also be used in other organisations to create and share innovations.

About the guide

This guide covers the Turku UAS instructions on data management.

More information on research methods and infrastructures

FAIR principles

FAIR principles have been developed in extensive, international research collaboration and they were published in 2016. The FORCE11 Group participated in commenting on the FAIR principles and the group shares information to support their implementation. The Council of the EU has issued a policy on compliance with the principles, and the Finnish Ministry of Education and Culture is committed to complying with them.

Policy for open research data and methods

The policy for open research data and methods was published in two parts. The first policy component on open access to research data was published in 2021 and the second policy component on open access to research methods was published in 2023.

Research infrastructures

The Research Council of Finland's website provides information on research infrastructures and related funding opportunities.

Use open research data

You don't necessarily have to collect all the data yourself. Suitable open data may already be available, for example register data or open data from various data repositories. Examples include:

Etsin - Research dataset search

Aila - Data catalogue of Finnish social science data archive

International registries of data repositories

International research data repositories

Usage rights of the guide

This resource is licensed under a Creative Commons Attribution 4.0 International license. The license does not apply to photos or videos unless otherwise stated.