AI Tales: Building a Machine Learning Pipeline Using Kubeflow and Minio

Phew…! Kubernetes provides a consistent way to deploy and run your applications, and Kubeflow helps you define your ML pipeline on top of Kubernetes. Kubeflow adds resources to your cluster to assist with a variety of tasks, including training and serving models and running Jupyter Notebooks. It extends the Kubernetes API by adding new Custom Resource Definitions (CRDs) to your cluster, so machine learning workloads can be treated as first-class citizens by Kubernetes.

The Data Challenge

Here is the add-on: Minio. Minio fits amazingly well into the cloud-native environment inside Kubernetes. It's simple, fast, scalable, and S3 compatible, which makes it a natural home for the data your deep learning training jobs need. (We'll see a short code sketch of reading training data from Minio at the end of this walkthrough.)

YAML is Scary!

Kubeflow makes use of ksonnet to help manage deployments. ksonnet acts as another layer on top of kubectl. While Kubernetes is typically managed with static YAML files, ksonnet adds a further abstraction that is closer to standard OOP objects: resources are managed as prototypes with empty parameters, which can be instantiated into components by defining values for those parameters.

Advantages?

This system makes it easier to deploy slightly different resources to different clusters at the same time, which makes it easy to maintain separate environments for staging and production.

Kubeflow has three primary components (Kubeflow CRDs defined via ksonnet):

- Tf-Job: submit TensorFlow jobs
- Tf-Serving: serve trained models
- Kubeflow-Core: the other core components

Let's see how easy it is to train and serve a deep learning application using ksonnet and Kubeflow.

Install Kubeflow using Ksonnet

$ VERSION=v0.2.0-rc.1
$ ks registry add kubeflow github.com/kubeflow/kubeflow/tree/${VERSION}/kubeflow
$ ks pkg install kubeflow/core@${VERSION}
$ ks pkg install kubeflow/tf-serving@${VERSION}
$ ks pkg install kubeflow/tf-job@${VERSION}

# generate the kubeflow-core component from its prototype
$ ks generate core kubeflow-core --name=kubeflow-core --cloud=gke

# apply the component to our cluster
$ ks apply cloud -c kubeflow-core

Running a Deep Learning Training Job

Just set the parameters (the image containing your TensorFlow training code, the number of CPUs and GPUs required, and the number of workers for distributed training) and you're set.

$ ks generate tf-job my-training
$ ks param list
COMPONENT    PARAM        VALUE
=========    =====        =====
my-training  args         "null"
my-training  image        "null"
my-training  image_gpu    "null"
my-training  name         "my-training"
my-training  namespace    "default"
my-training  num_gpus     0
my-training  num_masters  1
my-training  num_ps       0
my-training  num_workers  0

# set the parameters for this job
$ ks param set my-training image $TRAIN_PATH
$ ks param set my-training name "train-"$VERSION_TAG

# apply the component to the cluster
$ ks apply cloud -c my-training

To view the training progress:

$ kubectl get pods
$ kubectl logs $POD_NAME

Now to Serve the Trained Model

# create a ksonnet component from the prototype
$ ks generate tf-serving serve --name=my-training-service

# set the parameters and apply to the cluster
$ ks param set serve modelPath http://minio2:9000
$ ks apply cloud -c serve
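Before we wrap up, here are three short Python sketches of what the pieces we just wired together could look like from the code side. First, the data: training code can pull its dataset straight from Minio using the minio Python client. This is a minimal sketch; the endpoint, credentials, bucket, and object names below are placeholders, not values from a real deployment.

from minio import Minio

# Connect to the Minio server (endpoint and credentials are placeholders).
client = Minio(
    "minio2:9000",
    access_key="MINIO_ACCESS_KEY",
    secret_key="MINIO_SECRET_KEY",
    secure=False,
)

# Download a hypothetical training archive to local disk before training starts.
client.fget_object("datasets", "mnist/train.tfrecords", "/data/train.tfrecords")

Because Minio speaks S3, the same code works against any S3-compatible endpoint, which is exactly what makes it such a convenient staging area for training data.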
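Next, training. The image we pointed tf-job at is just a container with your TensorFlow code; the TFJob operator injects a TF_CONFIG environment variable describing the cluster, so each replica knows its role. A rough, TensorFlow 1.x-style sketch of a distributed-aware entry point (the model itself is omitted):

import json
import os

import tensorflow as tf

# TFJob injects TF_CONFIG describing the cluster layout and this pod's role.
tf_config = json.loads(os.environ.get("TF_CONFIG", "{}"))
cluster = tf_config.get("cluster")
task = tf_config.get("task", {})

if cluster:
    # Join the cluster under the role (master, worker, or ps) assigned to this pod.
    server = tf.train.Server(
        tf.train.ClusterSpec(cluster),
        job_name=task["type"],
        task_index=task["index"],
    )
    if task["type"] == "ps":
        server.join()  # parameter servers just host variables and wait
    # masters/workers would build the graph and run the training loop here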
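Finally, serving. Assuming the deployed TensorFlow Serving build has the REST API enabled (it listens on port 8501 by default), an application could query the served model as below; the service hostname, model name, and input shape are assumptions for illustration only.

import json
import urllib.request

# Hypothetical in-cluster address of the serving component and model name.
url = "http://my-training-service:8501/v1/models/my-training:predict"

# One dummy instance; the real input shape depends on your model.
body = json.dumps({"instances": [[0.0] * 784]}).encode("utf-8")

req = urllib.request.Request(
    url, data=body, headers={"Content-Type": "application/json"})
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["predictions"])

If a request like this comes back with predictions, the whole loop (data in Minio, training via TFJob, serving via tf-serving) is closed.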
"null"my-training name "my-training"my-training namespace "default"my-training num_gpus 0my-training num_masters 1my-training num_ps 0my-training num_workers 0//set the parameters for this job$ ks param set train image $TRAIN_PATH$ ks param set train name "train-"$VERSION_TAG// Apply the container to the cluster:$ ks apply cloud -c trainTo the view the training progress$ kubectl get pods$ kubectl logs $POD_NAMENow to serve the trained model//create a ksonnet component from the prototypeks generate tf-serving serve –name=my-training-service//set the parameters and apply to the clusterks param set serve modelPath http://minio2:9000ks apply cloud -c serveThank you for being with us so far :)We witnessed the challenges of building a production grade Machine learning application.We learn how Kubeflow along with Minio addresses these issues .The sequel of blogs will contain hands on end to end examples containing training, serving and building applications on top of popular ML/DL algorithms.See you all soon, till then… Happy coding :)About: We are Kredaro, An super early stage startup focussed on Data driven SRE, High performance + Cost effective Analytics, ML/AI at Scale..Let’ s talk, Ping us at hello@kredaro.com.. More details
