Our mission as a supporting facility is to help the broad user community with
their scientific experiments. There are growing efforts to open up science by
making the data acquisition process, as well as the data reduction and
evaluation procedures, more transparent and more widely available. A strong
foundation for this is provided by the FAIR principles, which build on the
Findability, Accessibility, Interoperability, and Reuse of digital assets.
The purpose of data collection at MGML is to store all relevant information
connected with the experiments performed. This includes not only the measured
data but also detailed documentation of the entire course of the measurement:
the electronic logbook, a detailed history of all available sensors during the
measurement, photo and video documentation, user remarks, and user scripts.
In addition, we provide infrastructure for data reduction and evaluation.
By collecting all the data mentioned above, MGML helps scientists perform
reliable and reproducible experiments that follow the FAIR principles.
This approach will ensure that all data produced at MGML follow the FAIR
principles:
- MAKING DATA FINDABLE
Each dataset will be accurately described with metadata, and all
instrument-related data will be sorted into the corresponding folders. Our
technical solution will ensure that all MGML datasets are indexed by data
search engines such as Google Dataset Search. Automatically assigned persistent
identifiers will make the data citable, and our data policy will require all
users to cite their datasets.
- MAKING DATA OPENLY ACCESSIBLE
After an embargo period of five years from the completion of the proposal,
all related data will be made fully open and published under the CC0 license
(public domain). The CC0 license facilitates the discovery, reuse, and citation
of the data without any time limit.
- MAKING DATA INTEROPERABLE
MGML runs several dozen different instruments that produce data in various
formats. All instruments developed in-house use open-source software that
generates standardized and well-described data formats. All commercial
instruments generate standardized data formats that are interchangeable
between researchers and institutions.
- MAKING DATA REUSABLE
The CC0 license will allow unlimited reuse of all datasets produced at MGML.
The five-year embargo period will give researchers enough time to benefit from
exclusive access to their data and to publish all their results properly. The
PI of each proposal can shorten or prolong the embargo period or publish the
data immediately.
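As an illustration of the Findable, Accessible, and Reusable items above, the
following minimal sketch builds a schema.org Dataset description, the kind of
structured metadata that dataset search engines such as Google Dataset Search
read, and computes the end of the embargo period. It is not MGML's actual
implementation. The function names, the example field values, and the choice
to store the embargo end date in `datePublished` are assumptions made for this
example only.

```python
import json
from datetime import date

# Canonical URL of the CC0 1.0 public domain dedication.
CC0_URL = "https://creativecommons.org/publicdomain/zero/1.0/"
DEFAULT_EMBARGO_YEARS = 5  # default embargo length stated in the policy text


def embargo_end(completed: date, years: int = DEFAULT_EMBARGO_YEARS) -> date:
    """Return the date `years` after the completion of the proposal."""
    try:
        return completed.replace(year=completed.year + years)
    except ValueError:
        # 29 February does not exist in a non-leap target year: use 28 February.
        return completed.replace(year=completed.year + years, day=28)


def dataset_record(name, description, identifier, completed,
                   years=DEFAULT_EMBARGO_YEARS):
    """Build a schema.org Dataset description as a JSON-serializable dict.

    Assumption: the embargo end date is exposed as `datePublished`.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,  # persistent identifier assigned to the dataset
        "license": CC0_URL,
        "datePublished": embargo_end(completed, years).isoformat(),
    }


if __name__ == "__main__":
    record = dataset_record(
        name="Example measurement",             # hypothetical values
        description="Placeholder description.",
        identifier="hypothetical-id-0001",
        completed=date(2020, 3, 15),
    )
    print(json.dumps(record, indent=2))
```

Passing `years=0` models a PI who chooses to publish the data immediately, and
a larger or smaller value models a prolonged or shortened embargo.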
In addition to the FAIR principles, there is a supplementary aspect that MGML
will guarantee:
- MAKING DATA TRUSTWORTHY
The technical design of MGML instruments will ensure that the collected raw
data have not been modified by the user or anyone else. Our system will
generate checksums of every dataset, and these will be openly published
immediately after the experiment. It will therefore be possible to check the
integrity of the data at any time. Our technical solution will also guarantee
that no data files have been deleted from the published datasets. The idea
behind trustworthy data is explained in the diagram below.
Diagram showing the advantages of trustworthy data in terms of open science
principles. During the peer review process, the manuscript is reviewed, and
evaluation scripts are checked for correctness. However, there is no way to
check whether the provided raw data are correct. This assurance needs to be
provided by a large research infrastructure.
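The checksum idea described under MAKING DATA TRUSTWORTHY can be sketched as
follows. This is an illustrative example, not the MGML system itself: the text
does not name a hash algorithm, so SHA-256 is an assumption, and the function
names and the manifest layout (relative file path mapped to hex digest) are
hypothetical. A manifest published right after the experiment lets anyone
detect later whether a file was modified, deleted, or added.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash one file in chunks so that large raw-data files fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(dataset_dir: Path) -> dict[str, str]:
    """Map every file under dataset_dir (recursively) to its SHA-256 digest."""
    return {
        path.relative_to(dataset_dir).as_posix(): file_sha256(path)
        for path in sorted(dataset_dir.rglob("*"))
        if path.is_file()
    }


def verify(dataset_dir: Path, manifest: dict[str, str]) -> dict[str, list[str]]:
    """Compare the current files with a previously published manifest."""
    current = build_manifest(dataset_dir)
    return {
        # Same name, different content: the file was changed.
        "modified": sorted(
            name for name in manifest
            if name in current and current[name] != manifest[name]
        ),
        # Listed in the manifest but no longer present: the file was deleted.
        "missing": sorted(name for name in manifest if name not in current),
        # Present now but absent from the manifest: the file was added later.
        "added": sorted(name for name in current if name not in manifest),
    }
```

A dataset passes the check when all three lists returned by `verify` are
empty. The check is only as trustworthy as the published manifest, which is
why the manifest has to be published by the facility rather than by the user.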