Open Data

Open Data refers to data resulting from scientific research that are freely accessible to the public and can be used, reused, and shared by anyone for any purpose. Any limitations (e.g. attribution of the author) are defined by the licensing conditions.

Open Data should meet the following criteria:

  • Availability – data are freely downloadable from the internet in a suitable and editable format,
  • Reusability – data are licensed in a way that allows reuse with minimal restrictions,
  • Universality – data are accessible to everyone for further use and sharing, including commercial use.

Benefits of Open Data

  • Enables verification and critical examination of research results,
  • Prevents unnecessary duplication of research,
  • Allows full analysis and use in follow-up projects,
  • Accelerates and streamlines research through data sharing,
  • Encourages new discoveries by combining data from various sources,
  • Increases citation impact and scientific credibility.

Data Management Plan (DMP)

A Data Management Plan (DMP) is a document that describes the handling of research data throughout its lifecycle: data collection, processing, analysis, storage, access, and reuse. It defines what data will be created and how they will be managed, including their availability during and after the research.

Benefits of a DMP:

  • Anticipates potential problems,
  • Reduces the risk of data loss, duplication, or security breaches,
  • Ensures accuracy, completeness, and reliability of data,
  • Facilitates data sharing and improves communication by assigning responsibility to specific individuals,
  • Ensures continuity of long-term processes and supports research integrity in case of staffing changes.

Recommended DMP Tools

  • DMPonline – a simple online tool for creating and managing DMPs, offering templates from various funders and sample plans.
  • Data Stewardship Wizard – an open-source tool that supports DMP planning and offers templates, several export formats, and collaboration options.
  • Argos – an online tool integrated with the OpenAIRE platform.
  • Open Science Framework (OSF) – an open-source web platform for project-wide management.

Publishing Research Data

The basic principle of open data is “As open as possible, as closed as necessary.”
Careful consideration should be given to what data to publish and how. Ideally, all data should be made available except for those restricted by legal constraints (e.g. personal data). Sensitive data can be anonymized, for example, using the online tool Amnesia.
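As a rough illustration of what anonymization can involve, the sketch below removes a direct identifier and generalizes exact ages into ten-year bands, then reports the size of the smallest group of records that share the same quasi-identifier values (the idea behind k-anonymity). All records, field names, and helper functions are hypothetical, and this is not the interface of Amnesia or any other tool. Real datasets usually need a more careful assessment than this.

```python
from collections import Counter

# Hypothetical survey records; every name and value is invented for illustration.
RECORDS = [
    {"name": "A. Novak", "age": 34, "city": "Brno", "answer": "yes"},
    {"name": "B. Svoboda", "age": 37, "city": "Brno", "answer": "no"},
    {"name": "C. Dvorak", "age": 52, "city": "Praha", "answer": "yes"},
    {"name": "D. Cerny", "age": 58, "city": "Praha", "answer": "yes"},
]

# Assumed split of fields: direct identifiers are dropped,
# quasi-identifiers are the ones that could re-identify people in combination.
DIRECT_IDENTIFIERS = {"name"}
QUASI_IDENTIFIERS = ("age", "city")


def generalize(record):
    """Drop direct identifiers and replace the exact age with a 10-year band."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    low = (record["age"] // 10) * 10
    out["age"] = f"{low}-{low + 9}"
    return out


def smallest_group(records):
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
    return min(groups.values())


anonymized = [generalize(r) for r in RECORDS]
k = smallest_group(anonymized)
print(f"Smallest group before: {smallest_group(RECORDS)}, after: {k}")
```

A larger smallest-group size means each person is harder to single out, at the cost of less precise data.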

The timing of publication is up to the author. Because of concerns about misuse or data theft, some researchers choose to publish their data only after the project is completed or once they can no longer extract new insights from them.

To ensure that data are truly open and usable, attention must be paid to:

  • appropriate format,
  • metadata description.

These elements are aligned with the FAIR principles:

  • Findable – data are stored in an online, easily searchable repository, described with metadata, and assigned a unique persistent identifier.
  • Accessible – data are accessible under clearly defined conditions; when data cannot be shared, at least metadata should be publicly available.
  • Interoperable – data use standardized terminology and include references to other data; suitable standards can be found via the DCC Metadata Standards Directory or the Czech Open Data Portal.
  • Reusable – data are well-described, appropriately licensed, and use discipline-relevant standards.
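To make the FAIR criteria above more concrete, the following minimal sketch checks a dataset's descriptive metadata for a few fields before deposit. The field names, the required-field list, and the example record (including the placeholder identifier) are assumptions made for illustration. They do not follow any particular metadata standard or repository schema.

```python
# Hypothetical minimum set of fields, loosely mapped to the FAIR criteria:
# identifier/title/description (Findable), access_conditions (Accessible),
# file_format (Interoperable), license (Reusable).
REQUIRED_FIELDS = (
    "identifier",
    "title",
    "description",
    "access_conditions",
    "file_format",
    "license",
)


def missing_fields(metadata):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not metadata.get(field)]


# Example record; the DOI is a made-up placeholder, not a real dataset.
record = {
    "identifier": "doi:10.1234/example-dataset",
    "title": "Example survey responses",
    "description": "Anonymized responses to a hypothetical questionnaire.",
    "access_conditions": "open",
    "file_format": "text/csv",
    "license": "",  # not chosen yet
}

gaps = missing_fields(record)
if gaps:
    print("Metadata incomplete, missing:", ", ".join(gaps))
else:
    print("All required metadata fields are present.")
```

In practice the required fields would come from the chosen repository or a discipline-specific metadata standard.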

Licensing

An appropriate license is a precondition for publishing and reusing open data. Repositories typically have a default license already set.

Recommended public licenses include:

  • CC-BY 4.0 – allows anyone to use the data for any purpose, with the condition of proper attribution.
  • CC0 – public domain dedication; allows unrestricted use of the data without the legal requirement for attribution (though citation is considered good practice); the author disclaims all warranties and responsibilities.

More information about Creative Commons licenses is available in the Open Science Guidelines (Managing Authority of OP JAK, 2022, p. 9).

Data Repositories and Data Journals

Research data can be stored in a repository or published in a data journal. Types of repositories include:

  • General repositories (e.g., Zenodo),
  • Disciplinary repositories (e.g., Europe PMC, CLARIN-DK-UCPH),
  • Institutional repositories (e.g., ASEP).

A suitable repository can be found through:

  • the Re3data international registry of research data repositories, or
  • the OpenDOAR directory of open access repositories.

When choosing a repository, consider whether it:

  • offers open access,
  • is trusted or certified,
  • assigns a persistent identifier,
  • automatically applies a license,
  • allows versioning of datasets, etc.

More detailed information, including an overview of available repositories, is provided in the Open Science Guidelines (Managing Authority of OP JAK, 2022, p. 31).

Data Journals

Data journals publish peer-reviewed data papers, which describe openly available datasets and highlight their structure, potential, and possible reuse.
Examples include the Research Data Journal for the Humanities and Social Sciences, among others.