How to deploy Jupyter notebooks as components of a Kubeflow ML pipeline (Part 2)

An easy way to run your Jupyter notebook on a Kubernetes cluster

In Part 1, I showed you how to create and deploy a Kubeflow ML pipeline using Docker components. In Part 2, I will show you how to make a Jupyter notebook a component of a Kubeflow ML pipeline. Recall from Part 1 that all it takes for something to be a component is for it to be a self-contained container that takes a few parameters and writes its outputs to files, either on the Kubeflow cluster or on Cloud Storage.

In order to deploy the flights_model notebook as a component, I have a cell at the top of my notebook whose tag is "parameters". When this Docker image is run, it executes the supplied notebook and copies the output notebook (with plots plotted, models trained, etc.) to GCS. A sketch of what that execution step can look like appears at the end of this article.

Launch the notebook component as part of a pipeline

The point of running the notebook as one step of the pipeline is so that it can be orchestrated and reused in other pipelines. As the pipeline runs, the notebook cells' outputs get streamed to the pipeline's logs and show up in Stackdriver. In my GitHub repo, creating and deploying the pipeline is shown in launcher.ipynb; a minimal pipeline sketch also appears at the end of this article.

Try it out

If you haven't done so already, please read and walk through Part 1 on how to create and deploy a Kubeflow ML pipeline using Docker images. Then try out this article on how to deploy a Jupyter notebook as a component in a Kubeflow pipeline:

1. Start a cluster as explained in Part 1.
2. On the cluster, open flights_model.ipynb, change PROJECT and BUCKET to something you own, and run the notebook, making sure it works.
3. Open launcher.ipynb and walk through the steps of running flights_model.ipynb as a Kubeflow pipelines component.

The launcher notebook also includes the ability to launch the flights_model notebook on a Deep Learning VM, but ignore that for now; I'll cover it in Part 3 of this series.

The notebook can be a unit of composability and reusability, but for this to happen, you have to take care to write small, single-purpose notebooks. I will also cover this in Part 3 of this series.

If you work in a large organization where a separate ML Platform team manages your ML infrastructure (i.e., a Kubeflow cluster), this article (Part 2) shows you how to develop in Jupyter notebooks and deploy them to Kubeflow pipelines.
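To make the "execute the notebook and copy the output to GCS" step concrete, here is a minimal sketch of what the container's entrypoint could do. It assumes papermill is installed in the image (papermill is the tool that looks for the cell tagged "parameters"); the file names, project, and bucket values are illustrative placeholders, not the ones from the repo.

```python
# run_notebook.py -- a minimal sketch of a notebook-runner entrypoint (not the repo's actual script).
import subprocess

import papermill as pm

INPUT_NB = "flights_model.ipynb"
OUTPUT_NB = "flights_model_output.ipynb"   # executed copy, with plots rendered and models trained
PROJECT = "your-gcp-project"               # placeholder values injected into the "parameters" cell
BUCKET = "your-gcs-bucket"

# papermill overrides the values in the cell tagged "parameters", then executes every cell.
pm.execute_notebook(
    INPUT_NB,
    OUTPUT_NB,
    parameters={"PROJECT": PROJECT, "BUCKET": BUCKET},
)

# Copy the executed notebook to Cloud Storage so downstream steps (and humans) can inspect it.
subprocess.run(
    ["gsutil", "cp", OUTPUT_NB, f"gs://{BUCKET}/notebooks/{OUTPUT_NB}"],
    check=True,
)
```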
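And here is a minimal sketch of launching that notebook-runner image as one step of a Kubeflow pipeline, using the kfp v1 SDK. The image URI and argument layout are placeholders; the actual pipeline definition lives in launcher.ipynb in the GitHub repo.

```python
# A minimal sketch (kfp v1 SDK) of a pipeline whose single step runs the notebook-runner image.
import kfp
import kfp.dsl as dsl


@dsl.pipeline(
    name="flights-notebook",
    description="Runs flights_model.ipynb as a pipeline component",
)
def flights_pipeline(
    project: str = "your-gcp-project",
    bucket: str = "your-gcs-bucket",
):
    # Each step is a self-contained container: it takes a few parameters and
    # writes its outputs (here, the executed notebook) to Cloud Storage.
    dsl.ContainerOp(
        name="run-flights-notebook",
        image="gcr.io/your-gcp-project/submitnotebook:latest",  # placeholder image URI
        arguments=["flights_model.ipynb", project, bucket],
    )


if __name__ == "__main__":
    # Compile to an archive that can be uploaded to the Kubeflow Pipelines UI.
    kfp.compiler.Compiler().compile(flights_pipeline, "flights_notebook.tar.gz")
```

Because the notebook run is just another container step, it can be chained with other components (data prep, training, deployment) in larger pipelines.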
