class: middle, center, title-slide
A multi-user notebook server
.footnote[Tim Head, Wild Tree Tech, April 2018]
class: middle, center
One-day trainings or semester-long courses usually involve "setup instructions"...
???
JupyterHub means that there is no need to waste time on setup.
Everyone has their own computer, USB drives, coffee and tea ...
???
Different team members have different levels of skill and time, different hardware, etc. Sharing, backups, etc. are hard. Everyone is limited to their laptop's compute power.
???
Using JupyterHub means that all work is in a central place, shared, and backed up.
All you need is a browser. Centralised access to data, compute, and software.
???
This means that you could work from an iPad on the beach or ski hut or ...!
class: middle, center
What happens when you run jupyter notebook?
???
A notebook is a document, coding environment, and a web application!
Two tasks:
- authentication of users
- launching jupyter notebook servers
One notebook server for each user.
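The two tasks can be pictured with a toy model. This is only a sketch: `ToyHub`, its methods, and the port numbers are hypothetical names made up for illustration, not JupyterHub's API, and "launching a server" is reduced to handing out a port.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class ToyHub:
    """Hypothetical toy model of the Hub's two tasks (not JupyterHub's API)."""
    passwords: dict                                   # username -> password
    servers: dict = field(default_factory=dict)       # username -> port
    _ports: count = field(default_factory=lambda: count(8001))

    def authenticate(self, username, password):
        """Task 1: establish who the user is; None means login failed."""
        expected = self.passwords.get(username)
        if expected is not None and expected == password:
            return username
        return None

    def server_for(self, username):
        """Task 2: 'launch' a server on first use, then reuse it."""
        if username not in self.servers:
            self.servers[username] = next(self._ports)
        return self.servers[username]

    def login(self, username, password):
        user = self.authenticate(username, password)
        if user is None:
            return None
        return self.server_for(user)


hub = ToyHub(passwords={"ada": "s3cret", "grace": "hopper"})
print(hub.login("ada", "s3cret"))    # 8001
print(hub.login("grace", "hopper"))  # 8002
print(hub.login("ada", "s3cret"))    # 8001 again: one server per user
print(hub.login("ada", "wrong"))     # None
```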
Establishes who a user is.
- PAM (default)
- OAuth (GitHub, Google, GitLab, Globus, your OAuth provider)
- LDAP
- LTI (popular with educational services)
- NullAuthenticator (if you don't need auth)
- write your own
from tornado import gen
from traitlets import Dict
from jupyterhub.auth import Authenticator

class DictionaryAuthenticator(Authenticator):

    passwords = Dict(config=True,
        help="""dict of username:password for authentication"""
    )

    @gen.coroutine
    def authenticate(self, handler, data):
        """
        Check username and password against a dictionary.
        Return username if password correct, else return None.
        """
        password = self.passwords.get(data['username'])
        if password == data['password']:
            return data['username']
More: http://jupyterhub-tutorial.readthedocs.io/en/latest/authenticators.html
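The dictionary check can be tried without a running Hub. The sketch below is a variation on that check, not the code from the slide: `check_login` is a hypothetical standalone function, and the explicit handling of missing values and the use of `hmac.compare_digest` are hardening choices assumed here.

```python
import hmac


def check_login(passwords, data):
    """Return the username if the password matches, else None."""
    username = data.get("username")
    supplied = data.get("password")
    expected = passwords.get(username)
    # Reject unknown users and missing passwords explicitly, so that
    # None == None can never count as a match.
    if expected is None or supplied is None:
        return None
    # compare_digest takes the same time wherever the first mismatch is.
    if hmac.compare_digest(expected.encode(), supplied.encode()):
        return username
    return None


passwords = {"ada": "s3cret"}
print(check_login(passwords, {"username": "ada", "password": "s3cret"}))  # ada
print(check_login(passwords, {"username": "eve", "password": "s3cret"}))  # None
```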
Responsible for starting a user's notebook server.
A Spawner needs to be able to:
- start the process
- poll whether the process is still running
- stop the process
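Those three operations map directly onto a child process. A minimal sketch using only the standard library follows; `ToyProcessSpawner` is a hypothetical stand-in, not JupyterHub's `Spawner` base class, and a sleeping Python process plays the role of the notebook server.

```python
import subprocess
import sys


class ToyProcessSpawner:
    """Hypothetical stand-in showing start / poll / stop."""

    def __init__(self, cmd):
        self.cmd = cmd
        self.proc = None

    def start(self):
        """Start the process and remember its handle."""
        self.proc = subprocess.Popen(self.cmd)
        return self.proc.pid

    def poll(self):
        """Return None while running, else the exit status."""
        if self.proc is None:
            return 0  # never started: report "not running"
        return self.proc.poll()

    def stop(self, timeout=5):
        """Ask the process to exit and wait until it is gone."""
        if self.poll() is not None:
            return
        self.proc.terminate()
        try:
            self.proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            self.proc.kill()
            self.proc.wait()


spawner = ToyProcessSpawner([sys.executable, "-c", "import time; time.sleep(60)"])
spawner.start()
print(spawner.poll())              # None while the child is alive
spawner.stop()
print(spawner.poll() is not None)  # True once it has exited
```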
Some custom spawners:
- DockerSpawner, launch servers in docker containers
- SudoSpawner, JupyterHub does not have to be root
- BatchSpawner, launch user servers on a batch system
- KubeSpawner, spawn user servers on Kubernetes
- or, your custom spawner
More: http://jupyterhub.readthedocs.io/en/latest/reference/spawners.html
class LocalProcessSpawner:

    @gen.coroutine
    def start(self):
        """Start local notebook server"""
        if self.ip:
            self.user.server.ip = self.ip
        self.user.server.port = random_port()
        cmd = []
        env = self.get_env()

        cmd.extend(self.cmd)
        cmd.extend(self.get_args())

        self.proc = Popen(cmd, env=env,
            preexec_fn=self.make_preexec_fn(self.user.name),
        )
        self.pid = self.proc.pid
        return (self.user.server.ip, self.user.server.port)
class LocalProcessSpawner:

    async def poll(self):
        """Poll the spawned process to see if it is still running."""
        # if we started the process, poll with Popen
        if self.proc is not None:
            status = self.proc.poll()
            if status is not None:
                # clear state if the process is done
                self.clear_state()
            return status

        # send signal 0 to check if PID exists
        # this doesn't work on Windows, but that's okay because we don't support Windows.
        alive = await self._signal(0)
        if not alive:
            self.clear_state()
            return 0
        else:
            return None
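The "signal 0" trick from the comments can be shown in isolation. This is a sketch with a hypothetical helper `pid_alive`, assuming a POSIX system; it illustrates the idea and is not JupyterHub's `_signal` implementation.

```python
import os
import subprocess
import sys


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        # Signal 0 delivers nothing; the kernel only performs its
        # existence and permission checks.
        os.kill(pid, 0)
    except ProcessLookupError:
        return False   # no such process
    except PermissionError:
        return True    # exists, but belongs to another user
    return True


child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()                     # child has exited and been reaped
print(pid_alive(os.getpid()))    # True: this process is running
print(pid_alive(child.pid))      # False, unless the PID was already reused
```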
class LocalProcessSpawner:

    async def stop(self):
        """Stop the single-user server process for the current user.

        The coroutine should return when the process is no longer running.
        """
        status = await self.poll()
        if status is not None:
            return
        self.log.debug("Interrupting %i", self.pid)
        await self._signal(signal.SIGINT)
        await self.wait_for_death(self.interrupt_timeout)
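The excerpt ends after the SIGINT step. One common way to handle a process that ignores the interrupt, assumed here rather than taken from the slide, is to escalate to stronger signals. The sketch uses a hypothetical `stop_process` helper on a plain `subprocess.Popen` object and assumes a POSIX system.

```python
import signal
import subprocess
import sys


def stop_process(proc, timeout=2.0):
    """Stop proc, escalating SIGINT -> SIGTERM -> SIGKILL; return exit status."""
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGKILL):
        if proc.poll() is not None:
            break                       # already gone, nothing left to do
        proc.send_signal(sig)
        try:
            proc.wait(timeout=timeout)  # give it a chance to exit
        except subprocess.TimeoutExpired:
            continue                    # still alive: try the next signal
    return proc.wait()


child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
status = stop_process(child)
print(status is not None)   # True: the child is no longer running
```

The exit status depends on which signal ended the process, so the example only checks that the process is gone.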
A spawner can modify the environment of a user's server. For example inject credentials, database paths, or Spark cluster details.
Check out "Using auth_state" for an example of injecting auth credentials into the user's environment.
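The mechanism is the one visible in `start()`: build an environment dictionary, then pass it to `Popen(cmd, env=env)`. A standard-library sketch of that idea follows; `build_env`, the variable name `DATABASE_PATH`, and its value are hypothetical and chosen only for illustration.

```python
import os
import subprocess
import sys


def build_env(extra):
    """Copy the current environment and layer per-user values on top."""
    env = os.environ.copy()
    env.update(extra)
    return env


# Hypothetical value; a real deployment would take this from its own config.
env = build_env({"DATABASE_PATH": "/srv/data/ada.sqlite"})

# The child process stands in for the user's notebook server.
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['DATABASE_PATH'])"],
    env=env, capture_output=True, text=True, check=True,
)
print(result.stdout.strip())   # /srv/data/ada.sqlite
```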
- A Tornado application, async and await, oh yeah!
- BSD licensed
- Extremely customisable
- Battle tested
- Welcoming community with a mix of academics and industry
- Upcoming release (v0.9) in the next few weeks
Local deployment on one machine:
$ conda create -n jhub-demo -c conda-forge \
> python jupyterhub notebook
$ source activate jhub-demo
$ jupyterhub
Good starting point to then customise for your needs.
The Full Monty: https://zero-to-jupyterhub.readthedocs.io/en/latest/
- based on kubernetes
- tested with Google Cloud
- trying out the guide will cost you ~coffee
- (create an account with $300 free credits)
- profit from the experience of the world's largest JupyterHub deployments
- Starting JupyterHub locally
- Admin panel
- Custom authenticator
- EdX course: https://courses.edx.org/courses/course-v1:BerkeleyX+Data8.1x+1T2018/course/
- https://mybinder.org/, JupyterHub++
- builds custom Docker images on demand
- custom frontend + JupyterHub
- served ~128000 sessions to ~85000 users over the last 30 days
Don't tell anyone, but you don't even have to use Jupyter notebooks with JupyterHub: RStudio works just as well.
JupyterLab is coming [blog post]!
class: middle, center
class: middle, final-slide
.center[
Tim Head
Science, code, and people.
Slides: https://github.com/wildtreetech/jupyterhub-talk ]
Talk to me about:
- setting up JupyterHub
- data-science consulting
- in-house training