Sharing data

Services for sharing and collaborating on research data

To find a research data repository for your data, you can search on the Registry of Research Data Repositories (re3data) platform and filter by country, content type, discipline, etc.

International:

  • Zenodo: A general-purpose open access repository created by OpenAIRE and CERN. It integrates with GitHub and allows researchers to upload files up to 50 GB.

  • Figshare: Online digital repository where researchers can preserve and share their research outputs (figures, datasets, images and videos). Users can make all of their research outputs available in a citable, shareable and discoverable manner.

  • EUDAT: European platform for researchers and practitioners from any research discipline to preserve, find, access, and process data in a trusted environment.

  • Dryad: A general-purpose home for a wide diversity of datatypes, governed by a nonprofit membership organization. A curated resource that makes the data underlying scientific publications discoverable, freely reusable, and citable.

  • The Open Science Framework: Provides free accounts for collaboration around files and other research artifacts. Each account can hold up to 5 GB of files, and the content remains private until you make it public.
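The size limits quoted in the list above can be turned into a quick pre-upload check. The following is a minimal sketch, not an official tool: the names `LIMITS_GB`, `total_size_bytes`, and `services_that_fit` are hypothetical, the limits are simply the figures quoted above (services change them over time), and treating each figure as one total budget is a simplification.

```python
from pathlib import Path

# Limits in GB as quoted in the list above (assumption: they may be outdated,
# and each service defines precisely what its limit applies to).
LIMITS_GB = {"Zenodo": 50, "OSF": 5}

BYTES_PER_GB = 10**9  # decimal gigabytes; an assumption of this sketch


def total_size_bytes(paths):
    """Sum the sizes of the given files in bytes."""
    return sum(Path(p).stat().st_size for p in paths)


def services_that_fit(size_bytes, limits_gb=LIMITS_GB):
    """Return, sorted by name, the services whose quoted limit covers size_bytes."""
    return sorted(
        name
        for name, limit in limits_gb.items()
        if size_bytes <= limit * BYTES_PER_GB
    )


if __name__ == "__main__":
    # Example: a hypothetical 6 GB dataset.
    print(services_that_fit(6 * BYTES_PER_GB))
```

For real files, `services_that_fit(total_size_bytes(["data.csv", "images.zip"]))` would combine the two helpers; the file names here are placeholders.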

Sweden:

Norway:

Denmark:

Finland:


Resources for data management


Licensing of datasets and databases

  • The EU has a Database Directive, which restricts data mining on databases.

  • Its effect is somewhat similar to copyright, which by itself would not apply to data mining.

  • A good license also grants the right to data mine, so this is not a major concern.

When you can use datasets:

  • The license allows it

  • Your country has exceptions for research

  • The data doesn’t come from the EU

License your text, slides, images, and supporting information under a Creative Commons license, and get a DOI using Zenodo, Figshare, OSF, or other services.
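As an illustration of what "license it and get a DOI" involves in practice, here is a minimal sketch that prepares the descriptive metadata for a Zenodo deposit. The field names (`title`, `upload_type`, `description`, `creators`, `license`) follow our understanding of Zenodo's deposit REST API and should be checked against the current Zenodo documentation. The helper `zenodo_metadata`, the license identifier `cc-by-4.0`, and the example values are assumptions made for illustration. No network request is made; actually creating the deposit requires a personal access token and is omitted here.

```python
import json


def zenodo_metadata(title, creators, description,
                    upload_type="presentation", license_id="cc-by-4.0"):
    """Build a metadata payload for a deposit (hypothetical helper).

    license_id is assumed to be an identifier the service recognizes for a
    Creative Commons license; verify the accepted values before use.
    """
    return {
        "metadata": {
            "title": title,
            "upload_type": upload_type,
            "description": description,
            "creators": [{"name": name} for name in creators],
            "license": license_id,
        }
    }


# Placeholder values, not real records.
payload = zenodo_metadata(
    title="Slides: sharing research data",
    creators=["Doe, Jane"],
    description="Course slides released under a Creative Commons license.",
)

if __name__ == "__main__":
    print(json.dumps(payload, indent=2))
```

Keeping the license in the metadata, rather than only in a file inside the upload, makes the terms of reuse visible to anyone who finds the record.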


Licensing and machine learning/AI

Is it data? Is it software? We need to consider the AI solution, the training data, the production data, the AI output, and AI evolutions.

How about ethics? How about liability?

  • EU AI Act

  • Models can be reverse-engineered and training data can be extracted

  • What if the model generates an outcome that is dangerous?

Thanks to E. Glerean for pointing these issues out to us.

Some resources


Further reading