## Git, GitHub, and what your lab can gain from open science

John Pearson
Dec 13, 2016
## This is not one of *those* talks

Note:

  • not the reproducibility police
  • not to tell you what priorities should be
  • not a git tutorial

Also, this is not (really) about

Git is:

  • a version control system
  • distributed, not central
  • developed for the Linux kernel
  • very fast

Version control:

  • How do we work on the same code at the same time?
  • How do we track/revert changes and fix bugs?
  • How do we handle multiple versions?
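To make these questions concrete, here is a minimal sketch in Python of the idea behind a commit history. It is a toy model, not git's actual implementation: the `Repo` class, its method names, and the file contents are all hypothetical. The only part borrowed from git is the general idea that each snapshot gets an ID derived from a hash of its content and its parent.

```python
import hashlib


class Repo:
    """Toy snapshot store (hypothetical; not git's real data model)."""

    def __init__(self):
        self.commits = {}               # commit id -> (parent id, message, files)
        self.branches = {"main": None}  # branch name -> latest commit id
        self.current = "main"

    def commit(self, files, message):
        """Record a snapshot of `files` on the current branch."""
        parent = self.branches[self.current]
        payload = repr((parent, message, sorted(files.items()))).encode()
        commit_id = hashlib.sha1(payload).hexdigest()[:8]
        self.commits[commit_id] = (parent, message, dict(files))
        self.branches[self.current] = commit_id
        return commit_id

    def log(self):
        """Commit messages on the current branch, newest first."""
        out, cid = [], self.branches[self.current]
        while cid is not None:
            parent, message, _ = self.commits[cid]
            out.append(message)
            cid = parent
        return out

    def checkout(self, commit_id):
        """Return the files exactly as they were at that commit."""
        return dict(self.commits[commit_id][2])

    def branch(self, name):
        """Start a new line of work from the current commit."""
        self.branches[name] = self.branches[self.current]
        self.current = name


repo = Repo()
first = repo.commit({"analysis.py": "print('v1')"}, "first working analysis")
repo.commit({"analysis.py": "print('v2, with a bug')"}, "refactor")

old_files = repo.checkout(first)   # track/revert: recover the version that worked
repo.branch("experiment")          # multiple versions: a separate line of work
repo.commit({"analysis.py": "print('v3')"}, "try a new idea")
```

In this sketch, `repo.log()` on the `experiment` branch lists all three commits, while `main` still ends at "refactor". Real git adds merging, a staging area, and much more.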

Distributed Version control:

  • Everyone has a copy
  • No need for a central "blessed" version
  • No need for internet connection
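The three points above can be sketched in a few lines of Python. This is an illustration only, under strong simplifying assumptions: the `Clone` class is hypothetical, history is a flat list of messages, and merge conflicts are ignored entirely. Real git tracks a graph of commits and handles merges explicitly.

```python
class Clone:
    """Toy stand-in for one person's full copy of a repository (hypothetical)."""

    def __init__(self, history=None):
        self.history = list(history or [])  # ordered list of commit messages

    def commit(self, message):
        # Purely local: no server or network connection is involved.
        self.history.append(message)

    def pull_from(self, other):
        # Copy any commits we do not have yet, directly from a peer.
        for message in other.history:
            if message not in self.history:
                self.history.append(message)


alice = Clone(["initial import"])
bob = Clone(alice.history)          # Bob's clone carries the full history

alice.commit("fix plotting bug")    # both work offline, independently
bob.commit("add preprocessing")

alice.pull_from(bob)                # sync peer to peer; no "blessed" copy needed
bob.pull_from(alice)
```

After the two pulls, both clones hold the same three commits, even though neither one was ever designated as the central copy.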

But git has issues

https://github.com/
  • Cloud-based git service
  • "Social" coding
  • Well-designed user interface
  • Cute mascot

The rest of the talk:

Benefits for you:

  • Managing technical debt
  • Reducing bus factor
  • Communicating your science


+ Coda: Benefits for your students

One more thing: it's free.

I. Technical debt

You wouldn't keep a lab notebook on scraps of paper.

Why do it with your code?

github.com/pearsonlab
an example repository
a larger project

The bottom line:

  • Code is in one place
  • You can choose to keep it private
  • You have the complete history of the project
  • Jupyter notebooks for reproducible analysis
  • Can also release data*

II. Bus factor

This is a story...

about a magical grad student...

Who built this...
and then left the lab...

...with this.

The bottom line:

  • In my lab, if it's not on GH, it doesn't count
  • This goes double for undergrads
  • And procedures go on the wiki

III. Communicating your science

The long tail

Papers live a long time.

Who would be citing you if they could use your work?

pearsonlab.github.io



Coda: For students

GitHub as resume

github.com/jmxpearson

Action steps

  • Come talk to me
  • Adopt GH on a per-project basis
  • There are lots of ways to learn git and GitHub