How to document your research software

Popular tools and solutions

Questions

  • What tools are out there?

  • What are their pros and cons?

Objectives

  • Choose the right tool for the right reason.


In-code documentation

  • Comments, function docstrings, …

  • Advantages

    • Good for programmers

    • Version controlled alongside code

    • Can be used to auto-generate documentation for functions/classes

  • Disadvantage

    • Probably not enough for users of the code

We will have a closer look at this in the In-code documentation episode.
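
As a small illustration of the points above, here is a sketch of a documented Python function. The function name `kinetic_energy` and its parameters are hypothetical, and the docstring layout follows the NumPy style, which is only one of several common conventions. The standard-library call `inspect.getdoc` hints at how documentation generators can read the docstring back.

```python
import inspect


def kinetic_energy(mass, velocity):
    """Return the kinetic energy of a moving body.

    Parameters
    ----------
    mass : float
        Mass in kilograms.
    velocity : float
        Speed in meters per second.

    Returns
    -------
    float
        Kinetic energy in joules.
    """
    # A comment explains the "why": the factor 0.5 comes from E = (1/2) m v^2.
    return 0.5 * mass * velocity**2


# Documentation generators read the docstring in much the same way.
doc = inspect.getdoc(kinetic_energy)
print(doc.splitlines()[0])
```

The docstring serves programmers reading the source and tools extracting API documentation, but a user would probably still need a tutorial or README on top of it.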


README files

  • Advantages

    • Versioned (goes with the code development)

    • It is often good enough to have a README.md or README.rst along with your code/script

  • If you use README files, use either RST or Markdown

  • A great guide to README files: Make a README

We will have a closer look at this in the Writing good README files episode.


reStructuredText and Markdown

# This is a section in Markdown   This is a section in RST
                                  ========================

## This is a subsection           This is a subsection
                                  --------------------

Nothing special needed for        Nothing special needed for
a normal paragraph.               a normal paragraph.

                                  ::

    This is a code block            This is a code block


**Bold** and *emphasized*.        **Bold** and *emphasized*.

A list:                           A list:
- this is an item                 - this is an item
- another item                    - another item

There is more: images,            There is more: images,
tables, links, ...                tables, links, ...

  • Markdown and reStructuredText are two of the most popular lightweight markup languages.

  • reStructuredText (RST) has more features than Markdown but the choice is a matter of taste.

  • There are (unfortunately) many flavors of Markdown.

  • Motivation to stick to a standard text-based format: it makes it easier to move the documentation to other tools that also expect a standard format as the project/organization grows.

  • We will use MyST-flavored Markdown in the Sphinx and Markdown episode and the Hosting websites/homepages on GitHub Pages example.

  • Nice resource to learn Markdown: Learn Markdown in 60 seconds

  • Pandoc can convert between MD and RST (and many other formats).
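
To make the difference in heading syntax concrete, here is a toy sketch that rewrites a Markdown heading line as the equivalent RST heading. The function `md_heading_to_rst` is hypothetical and handles only the two heading levels shown in the comparison above; for real conversions a dedicated tool such as Pandoc is the better choice.

```python
def md_heading_to_rst(line):
    """Convert one Markdown "#"-style heading line to an RST heading.

    Lines that are not level-1 or level-2 headings are returned unchanged.
    """
    # Underline characters chosen to match the comparison above.
    underline_chars = {1: "=", 2: "-"}
    stripped = line.rstrip()
    # Count the leading "#" characters.
    hashes = len(stripped) - len(stripped.lstrip("#"))
    # A heading needs a space between the "#" characters and the title.
    if hashes in underline_chars and stripped[hashes:hashes + 1] == " ":
        title = stripped[hashes:].strip()
        # RST marks a heading with an underline as long as the title.
        return title + "\n" + underline_chars[hashes] * len(title)
    return line


print(md_heading_to_rst("# This is a section"))
print(md_heading_to_rst("## This is a subsection"))
```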


HTML static site generators

There are many tools that can turn RST or Markdown into beautiful HTML pages:

  • Sphinx ← we will practice this; it is also how this lesson material is built

    • Generate HTML/PDF/LaTeX from RST and Markdown.

    • Very widely used for Python projects, but Sphinx is not limited to Python.

    • Read the Docs hosts public Sphinx documentation for free!

    • Also hostable anywhere else, like GitHub Pages.

    • API documentation possible.

  • Jekyll

    • Generates HTML from Markdown.

    • GitHub supports this without adding extra build steps.

  • pkgdown

    • Popular in the R community

  • MkDocs

  • GitBook

  • Hugo

  • Hexo

  • Zola ← this is what we use for our project website and workshop websites

  • There are many more …
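
For the Sphinx option listed above, a project is configured through a Python file called `conf.py`. The following is a minimal sketch, not the configuration used for this lesson: the project name and author are placeholders, and it assumes that the separate packages `myst-parser` (for Markdown support) and `sphinx_rtd_theme` are installed next to Sphinx.

```python
# conf.py: minimal Sphinx configuration (sketch; all values are placeholders)

project = "my-project"          # hypothetical project name
author = "Jane Researcher"      # hypothetical author

# myst_parser lets Sphinx read Markdown files in addition to RST.
# It is a separate package that has to be installed alongside Sphinx.
extensions = ["myst_parser"]

# Map file extensions to the parser that should handle them.
source_suffix = {
    ".rst": "restructuredtext",
    ".md": "markdown",
}

# Theme provided by Read the Docs (separate package: sphinx_rtd_theme).
html_theme = "sphinx_rtd_theme"
```

With such a file in a hypothetical `doc/` directory, running `sphinx-build doc _build` would typically produce the HTML pages in `_build`.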

GitHub, GitLab, and Bitbucket make it possible to serve HTML pages:

  • GitHub Pages

  • Bitbucket Pages

  • GitLab Pages


Wikis

  • Popular solutions (but many others exist):

    • MediaWiki

    • Dokuwiki

  • Advantage

    • Low barrier to writing and editing

  • Disadvantages

    • Typically disconnected from the source code repository (which hurts reproducibility)

    • Difficult to serve multiple versions

    • Difficult to check out a specific old version

    • Typically needs to be hosted and maintained


LaTeX/PDF

  • Advantage

    • Popular and familiar in the physics and mathematics community

  • Disadvantages

    • PDF format is not ideal for copy-pasting of examples

    • Possible, but not trivial, to automate rebuilding the documentation after every Git push


Doxygen

  • Auto-generates API documentation

  • Documented directly in the source code

  • Popular in the C++ community

  • Has support for C, Fortran, Python, Java, etc., see the Doxygen GitHub repository

  • Many keywords are understood by Doxygen: Doxygen special commands

  • Can be used to also generate higher-level (“human”) documentation

  • Can be deployed to GitHub/GitLab/Bitbucket Pages
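
As a sketch of what "documented directly in the source code" can look like, here is a Python function annotated for Doxygen. It assumes Doxygen's convention that a Python comment block starting with `##` is treated as documentation; `@brief`, `@param`, and `@return` are Doxygen special commands. The function `mean` itself is a hypothetical example.

```python
## @brief Compute the arithmetic mean of a sequence of numbers.
#
#  @param values  Non-empty sequence of numbers.
#  @return        The mean as a float.
def mean(values):
    return sum(values) / len(values)


print(mean([1.0, 2.0, 3.0]))
```

To Python the annotations are ordinary comments, so the code runs unchanged whether or not Doxygen is ever applied to it.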


Other tools

  • Fortran

    • Fortran Documenter (FORD)

  • Julia

    • Franklin: static site generator

    • Documenter.jl

  • Quarto converts Markdown to websites, PDFs, ebooks, and many other formats


Keypoints

  • Some popular solutions make reproducibility and maintenance of multiple code versions difficult.


© Copyright CodeRefinery contributors.
