Jupyter Notebook Best Practices

Concise advice that will help you use Jupyter notebooks more effectively.

Dominik Haitz, Mar 27

Photo by SpaceX on Unsplash

Caveat: The advice in this article refers to the original Jupyter notebook.

While much of the advice can be adapted to JupyterLab, the popular notebook extensions can’t.

1. Structure your Notebook

Give your notebook a title (H1 header) and a meaningful preamble to describe its purpose and contents.

Use headings and documentation in Markdown cells to structure your notebook and explain your workflow steps.

Remember: You’re not only doing this for your colleagues or your successor, but also for your future self :-)

The toc2 extension can automatically create heading numbers and a Table of Contents, both in a sidebar (optionally a floating window) and in a markdown cell.

The highlighting indicates your current position in the document, which helps you stay oriented in long notebooks.

The Collapsible Headings extension allows you to hide entire sections of code, thereby letting you focus on your current workflow stage.

This default template extension creates new notebooks with a default structure and common imports rather than leaving them empty.

It will also repeatedly ask you to change the name from Untitled.ipynb to something meaningful.

The Jupyter snippets extension allows you to conveniently insert often needed code blocks, e.g. your typical import statements.

Using a Jupyter notebook template (which sets up default imports and structure) and the Table of Contents (toc2) extension, which automatically numbers headings.

The Collapsible Headings extension enables hiding of section contents by clicking the grey triangles next to the headings.

2. Refactor & outsource code into modules

After you’ve written plain code in cells to get ahead quickly, make a habit of turning stable code into functions and moving them to a dedicated module.

This makes your notebook more readable and is incredibly helpful when productionizing your workflow.

This:

    df = pd.read_csv(filename)
    df.drop( ...
    df.query( ...
    df.groupby( ...

becomes this:

    def load_and_preprocess_data(filename):
        """DOCSTRING"""
        # do stuff
        # ...
        return df

and finally this:

    import dataprep
    df = dataprep.load_and_preprocess_data(filename)

If you edit a module file, Jupyter’s autoreload extension reloads imported modules:

    %load_ext autoreload
    %autoreload

Use ipytest for testing inside notebooks.
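As a hedged illustration of the refactoring described above, here is a sketch of what a small dataprep module might contain. The column names (city, price, note), the filter expression and the sample data are invented for this example and are not taken from the original workflow. The test function at the end is written in plain pytest style, which is the kind of test ipytest is meant to run inside a notebook.

```python
"""dataprep.py: a hypothetical module holding stable preprocessing code."""
import io

import pandas as pd


def load_and_preprocess_data(source):
    """Load a CSV and return the mean price per city.

    `source` is anything pd.read_csv accepts (a path or a file-like object).
    The columns "city", "price" and "note" are assumptions for this sketch.
    """
    df = pd.read_csv(source)
    df = df.drop(columns=["note"])  # discard a column we do not need
    df = df.query("price > 0")      # keep only rows with a positive price
    # Aggregate to one row per city.
    return df.groupby("city", as_index=False)["price"].mean()


def test_load_and_preprocess_data():
    """Pytest-style check using a tiny in-memory CSV instead of a real file."""
    csv_text = (
        "city,price,note\n"
        "Berlin,10,a\n"
        "Berlin,30,b\n"
        "Munich,-5,c\n"
        "Munich,20,d\n"
    )
    result = load_and_preprocess_data(io.StringIO(csv_text))
    assert result["city"].tolist() == ["Berlin", "Munich"]
    assert result["price"].tolist() == [20.0, 20.0]
```

With the logic in a module, the notebook cell shrinks to an import and a single call, and the test can be rerun whenever the module changes.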

Use a proper IDE, e.g. PyCharm.

Learn about its features for efficient debugging, refactoring and testing.

Stick to the standards of good coding — think Clean Code principles and PEP8.

Use meaningful variable and function names, comment sensibly, modularize your code and don’t be too lazy to refactor.

3. Be curious about productivity hacks

Learn the Jupyter Keyboard Shortcuts.

Print the list and hang it on the wall next to your screen.

Get to know Jupyter extensions: Codefolding, Hide input all, Variable Inspector, Split Cells Notebook, zenmode and many more.

Jupyter Widgets (sliders, buttons, dropdown-menus, …) allow you to build interactive GUIs.

The tqdm library provides a convenient progress bar.
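A minimal sketch of the tqdm progress bar mentioned above. The work function slow_square and the loop size are made up for illustration, and the fallback in the except branch is an assumption added so the snippet still runs where tqdm is not installed.

```python
import time

try:
    from tqdm import tqdm
except ImportError:  # assumption: degrade to a plain loop if tqdm is missing
    def tqdm(iterable, **kwargs):
        return iterable


def slow_square(x):
    """Stand-in for an expensive step (hypothetical)."""
    time.sleep(0.01)
    return x * x


# Wrapping the iterable is enough; tqdm draws and updates the bar itself.
results = [slow_square(x) for x in tqdm(range(50), desc="Processing")]
```

Inside a notebook, importing tqdm from tqdm.notebook instead should give a widget-based bar, provided the widget support is installed.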

4. Embrace reproducibility

Version Control: Learn to use git; there are many great tutorials out there.

Depending on your project and purpose, it might be reasonable to use a git pre-commit hook that removes notebook output.

It will make commits and diffs more readable, but might discard output (plots etc.) you actually want to store.
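What such a hook does can be sketched in a few lines. This is not a complete hook, only the stripping step, and it assumes the notebook is stored in the nbformat 4 JSON layout; the function names are hypothetical. A maintained tool such as nbstripout is probably the better choice in practice.

```python
import json


def strip_outputs(notebook):
    """Clear code-cell outputs and execution counts in place; return the dict.

    Assumes the nbformat 4 layout: a top-level "cells" list in which code
    cells carry "outputs" and "execution_count" keys.
    """
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_file(path):
    """Rewrite one .ipynb file in place without its outputs."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(strip_outputs(notebook), f, indent=1, ensure_ascii=False)
        f.write("\n")
```

Markdown cells and the notebook metadata are left untouched, so only the generated output disappears from the diff.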

Run your notebook in a dedicated (conda) environment.

Store the requirements.txt file in the git repository alongside your notebooks and modules.

This will help you reproduce your workflow as well as facilitate transitioning into a production environment.
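One way to see what belongs in requirements.txt is to record the versions actually installed in the active environment. The following is a minimal sketch, assuming Python 3.8 or later (for importlib.metadata); the helper name and the package list are only examples. For a full listing, pip freeze > requirements.txt is the usual shortcut.

```python
import sys
from importlib import metadata


def pinned_requirements(packages):
    """Return requirements.txt-style lines for the given package names."""
    lines = []
    for name in packages:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            # Keep a visible marker instead of failing on a missing package.
            lines.append(f"# {name}: not installed in this environment")
    return lines


# Example package names; replace them with what your notebook imports.
print(f"# Python {sys.version.split()[0]}")
print("\n".join(pinned_requirements(["pandas", "numpy", "tqdm"])))
```

Running this in a notebook cell shows exactly which versions the results were produced with.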

5. Further reading

Working efficiently with JupyterLab Notebooks

Bringing the best out of Jupyter Notebooks for Data Science

Boosting Your Jupyter Notebook Productivity

… and of course Joel Grus’ famous I Don’t Like Notebooks

Conclusion

Good software engineering practices, structuring and documenting your workflow as well as customizing Jupyter to your personal taste will increase your notebook productivity and sustainability.

I’m happy to hear your own tips and your feedback in the comments.
