class: middle, center, title-slide
.footnote[Tim Head, Wild Tree Tech, PyData Zurich, June 2018]
Tim holds a doctorate in experimental physics, has worked at CERN and EPFL, and contributes to the PyData ecosystem.
Wild Tree Tech builds bespoke solutions for clients all around the world, from startups to UN organisations.
- digital products that leverage machine-learning and AI,
- small and large JupyterHub deployments.
Visit http://www.wildtreetech.com.
class: middle, center
--
and cool things that become possible if you can do this.
.footnote[From https://xkcd.com/1987/]
.larger[An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.]
-- Buckheit and Donoho (paraphrasing John Claerbout), WaveLab and Reproducible Research, 1995
- How long till someone can reproduce work?
- How long till they can contribute to someone else's project?
Use the onboarding experience as proxy for how well sharing works.
???
- new team member joins and is given the task to expand on previous work
- sends email to everyone asking for pointers
- gets back various emails with links, documents, slides, and papers
- has to track down the code used last time, which is out of date
- fiddles with the code, realises it doesn't produce the plots in the slides, and decides either to adapt the diverged code or to start from scratch
- after spending time reproducing the results, or giving up and starting from scratch, does the actual work
- Finally, distribute the results through a presentation, email, or Google Doc, perpetuating the cycle
- The computational tools to solve a problem
- An interface to help us create solutions
- A way to communicate our work
- A way to share our work
- A way to pack it all so others can use it
- A way to do all of this relatively easily and accessibly
And many more language specific tools.
- language specific tools work well if you only use one language
- some tools are easier to install via the operating system's package manager
- to specify the complete computational environment a container is the right level of abstraction
.larger[All the technical pieces exist but they require significant expertise to operate and combine.]
class: middle, center
- Prototype on your laptop
- fast iteration cycles
- interactive work
- fast feedback
???
- Build yet another framework that people have to fit into
- Force people to switch tools
- Force people to work in a particular way
- No benefit from using the tool on day zero
class: middle, center
Writing a Dockerfile is hard and tedious. Not my definition of fun.
Most programming languages already have a way to specify dependencies.
Can't we automate creating a Docker image?
It can parse many different files that specify what dependencies to install.
This means that you can keep working the way you have always been working, and benefit from repo2docker from day one.
Supported configuration files:
- requirements.txt
- environment.yml
- apt.txt
- REQUIRE
- install.R
- runtime.txt
- postBuild
- Dockerfile
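To get a feel for what "keep working the way you always have" means, here is a minimal sketch that checks which of the files listed above are present in a directory. It is not repo2docker's own detection logic: the function name `find_config_files` is hypothetical, and the one-line purposes are a reader's summary of what each file is typically used for.

```python
from pathlib import Path
import tempfile

# File names come from the slide above. The descriptions are a summary
# written for this sketch, not text taken from repo2docker itself.
CONFIG_FILES = {
    "requirements.txt": "Python packages installed with pip",
    "environment.yml": "conda environment",
    "apt.txt": "Debian/Ubuntu packages installed with apt",
    "REQUIRE": "Julia packages",
    "install.R": "R script that installs R packages",
    "runtime.txt": "language runtime version",
    "postBuild": "script run after the dependencies are installed",
    "Dockerfile": "hand-written image definition",
}


def find_config_files(repo_dir):
    """Return {filename: purpose} for supported files found in repo_dir.

    Hypothetical helper: it only looks at the top level of the directory.
    """
    root = Path(repo_dir)
    return {
        name: purpose
        for name, purpose in CONFIG_FILES.items()
        if (root / name).is_file()
    }


if __name__ == "__main__":
    # Build a throwaway "repository" with two of the supported files.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "requirements.txt").write_text("numpy\n")
        (Path(tmp) / "apt.txt").write_text("graphviz\n")
        for name, purpose in find_config_files(tmp).items():
            print(f"{name}: {purpose}")
```

A project that already ships a `requirements.txt` or `environment.yml` therefore needs no extra files to get started.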
Hi Tim,
thanks for helping out on our project. To run things
you need to install Docker and then
`pip install jupyter-repo2docker`.
You need to run it with the URL of the GitHub repo
and it will automatically build a docker image for
you, then launch it. It produces a lot of output but
at the end there is a URL that you need to paste into
your browser. It will show a Jupyter notebook.
Let me know how it goes,
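The steps in this email can be sketched in a few lines of Python. This is only an illustration, assuming the command-line entry point is `jupyter-repo2docker` and that it takes the repository URL as its single positional argument. The helper names `missing_prerequisites` and `repo2docker_command` are hypothetical, and the example repository is one of the demo repositories mentioned later in the talk.

```python
import shlex
import shutil

# Executables the email asks the reader to have installed.
PREREQUISITES = ("docker", "jupyter-repo2docker")


def missing_prerequisites(which=shutil.which):
    """Return the prerequisite executables that are not on PATH.

    `which` is injectable so the check can be exercised without
    Docker actually being installed.
    """
    return [exe for exe in PREREQUISITES if which(exe) is None]


def repo2docker_command(repo_url):
    """Build the argument list that builds and launches `repo_url`."""
    return ["jupyter-repo2docker", repo_url]


if __name__ == "__main__":
    cmd = repo2docker_command("https://github.com/binder-examples/r")
    missing = missing_prerequisites()
    if missing:
        print("Please install first:", ", ".join(missing))
    # Print the command rather than running it, since a build needs a
    # running Docker daemon and can take several minutes.
    print("Would run:", shlex.join(cmd))
    # To actually build and launch, one could call:
    #   import subprocess; subprocess.run(cmd, check=True)
```

The URL to paste into the browser appears at the end of the command's output once the build has finished.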
.larger[ Can we make it even easier?
Maybe just a link people can click? ]
You are probably already using something like this.
Flexibly connects users to a computational environment provisioned somewhere else.
.left-column[ .center.width-90[] ] .right-column[
- mature open-source technology, used to serve thousands of users
- BSD license
- commercial support available
- Active developer community (v0.9 was released on Friday!)
- Total of 116 contributors, 44 new ones for this release ]
class: middle, center
.center[
.larger[+
repo2docker
+]
???
Combine repo2docker with JupyterHub to build images for any git repository on demand.
We call it BinderHub.
A public BinderHub operated by the Binder team.
https://github.com/norvig/pytudes/
https://github.com/binder-examples/r
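The "link people can click" for repositories like these can be built from the GitHub URL. The sketch below assumes the public BinderHub lives at `https://mybinder.org` and uses the `/v2/gh/<owner>/<repo>/<ref>` launch path. The function name `binder_launch_url`, the default ref `master`, and the percent-encoding of slashes in the ref are assumptions made for this example.

```python
from urllib.parse import quote, urlparse


def binder_launch_url(github_url, ref="master", hub="https://mybinder.org"):
    """Turn a GitHub repository URL into a Binder launch link.

    Hypothetical helper. `ref` is a branch, tag, or commit; the default
    of "master" is an assumption and may not match every repository.
    """
    parts = urlparse(github_url)
    if parts.netloc != "github.com":
        raise ValueError(f"not a github.com URL: {github_url}")
    # Drop empty segments so a trailing slash does not matter.
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) < 2:
        raise ValueError(f"expected an owner and a repository in: {github_url}")
    owner, repo = segments[0], segments[1]
    # Encode the ref so that a branch name containing "/" stays one segment.
    return f"{hub}/v2/gh/{owner}/{repo}/{quote(ref, safe='')}"


if __name__ == "__main__":
    for url in (
        "https://github.com/norvig/pytudes/",
        "https://github.com/binder-examples/r",
    ):
        print(binder_launch_url(url))
```

Putting such a link in a README is what makes a repository launchable with one click.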
- shout out to all the contributors!
- super hard to find a picture or a list because we are spread over several repositories
- Join us at https://github.com/jupyterhub/binder
- Chat with us https://gitter.im/jupyterhub/binder
- Become part of the community:
  - use binder and the tools around it,
  - help explain binder to people,
  - follow our tutorials (and then help us improve them),
  - help maintain the code,
  - take part in the operations,
  - create your own with the zero2binder guide!
Users from all around the globe
- The computational tools to solve a problem - open source languages
- An interface to help us create solutions - Jupyter, RStudio, editors
- A way to communicate our work - notebooks
- A way to share our work - GitHub, GitLab, ...
- A way to pack it all so others can use it - containers
- A way to do all of this relatively easily and accessibly - BinderHub
Combines the stability and scalability of JupyterHub with on-demand image building.
Anyone who wants can now make their computational project "one click" reproducible:
Based on state-of-the-art cloud orchestration tools.
Can be deployed by anyone: https://binderhub.readthedocs.io/en/latest/
Tooling is useful for anyone from day one, when they are using it at home, with no intention of sharing: repo2docker as virtualenv++
class: middle, center
now we start the exciting part
Can you (this room) move the needle on the number of live binders?
.larger[ Visit http://bit.ly/2t9Bjql to launch your first binder. ]
<iframe src="https://grafana.mybinder.org/d-solo/fZWsQmnmz/pod-activity?refresh=30s&orgId=1&panelId=3" width="100%" height="300" frameborder="0"></iframe>

https://spacy.io/usage/linguistic-features#pos-tagging
- https://github.com/binder-examples/appmode
- https://github.com/SimonBiggs/scriptedforms/blob/master/README.md
class: bottom, center