Sharing data

Resources for data management

Licensing of datasets and databases

The EU has a database directive that restricts data mining on databases.
Its effect is somewhat similar to copyright, and it fills a gap because copyright itself would not apply to data mining.
A good license also grants the right to mine the data, so this is usually not a major concern.

When you can use datasets:

The license allows it
Your country has exceptions for research
The data doesn’t come from the EU

License text, slides, images, and supporting information under a Creative Commons license, and get a DOI using Zenodo, Figshare, OSF, or other services.
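
As an illustration of recording the license choice alongside the material, here is a minimal sketch that writes the metadata as JSON. It assumes the field names of the Zenodo deposit metadata (title, upload_type, description, creators, license) and that the result could be saved as a .zenodo.json file in a repository; check both against the current Zenodo documentation before relying on them. All values, and the helper name to_zenodo_json, are hypothetical placeholders.

```python
import json

# Hypothetical metadata for a dataset deposit; every value is a placeholder.
# Field names are assumed to follow the Zenodo deposit metadata schema
# (assumption: verify against the current Zenodo documentation).
metadata = {
    "title": "Example measurement dataset",
    "upload_type": "dataset",
    "description": "Supporting data for an example study.",
    "creators": [{"name": "Doe, Jane", "affiliation": "Example University"}],
    "license": "cc-by-4.0",  # assumed identifier for CC BY 4.0
}


def to_zenodo_json(record):
    """Serialize the metadata record as pretty-printed JSON text."""
    return json.dumps(record, indent=2, sort_keys=True)


zenodo_json = to_zenodo_json(metadata)
print(zenodo_json)
```

Keeping the license in a machine-readable file next to the data means the archive record and the repository are less likely to disagree about the terms of reuse.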

Licensing and machine learning/AI

Is it data? Is it software? We need to consider the AI solution, the training data, the production data, the AI output, and AI evolutions.

How about ethics? How about liability?

EU AI Act
Models can be reverse-engineered and training data can be extracted
What if the model generates an outcome that is dangerous?
(Thanks to E. Glerean for pointing these issues out to us.)

Some resources

Further reading

The Turing Way
Illustrations from the Turing Way book dashes
Reproducibility syllabus
The reproducible research data analysis platform
Good talks on open reproducible research can be found here.
“FAIR is not fair enough”
“A FAIRer future”
“Top 10 FAIR Data & Software Things” are brief guides that can be used by the research community to understand how they can make their research (data and software) more FAIR.
Five recommendations for FAIR software
Publishing research software: an MIT Libraries webpage on why to publish software, where to publish software, and how to make software citable.
Software Quality Checklist
MolSSI Best Practice Guides
Awesome Research Software Registries