Using the Operator Lifecycle Manager to deploy Prometheus on OpenShift

You just process a template, run an image… (at least to a certain extent).

The real problems usually surface after what we call day-1. Day-2 operations are not obvious on day-1, but you can imagine a few: upgrades, changes in the size or even the morphology of the deployment, etc.

In this article, I explain step by step how to deploy the Prometheus Operator and how to monitor applications which are in a different namespace.

TL;DR

I provide a step-by-step guide to deploy Prometheus on OpenShift using the Operator Lifecycle Manager (currently in Tech Preview in OpenShift).

I also stress how to check whether your configuration is correct and how to set up the operator to monitor applications across namespaces.

What is the Prometheus Operator?

Operators were introduced by CoreOS as a class of software that operates other software, putting operational knowledge collected by humans into software.

For further information around the Operator Framework please go here.

The Prometheus Operator serves to make running Prometheus on top of Kubernetes as easy as possible, while preserving Kubernetes-native configuration options.

The Operator Lifecycle Manager

The Operator Framework (currently in Technology Preview phase) installs the Operator Lifecycle Manager (OLM), which aids cluster administrators in installing, upgrading, and granting access to Operators running on their OpenShift Container Platform cluster.

The OpenShift Container Platform web console has also been updated so that cluster administrators can install Operators and grant specific projects access to the catalog of Operators available on the cluster.

The operator we need for this lab is one of the Red Hat supported Operators, which means you don’t need to install the operator itself; you only need to subscribe to it and use it.

As with any other operator, there is a set of objects (CRDs) we need to create to tell the operator how we want to install and operate Prometheus.

These are the objects we’ll need to create:

- Prometheus
- ServiceMonitor
- AlertManager (we won’t use it in this lab)

The next image shows how they’re related.

For further details please go here.

(Image borrowed from the getting started guide.)

Prerequisites

In order to follow this lab you’ll need:

- An OpenShift cluster to play with
- A user who has been granted the cluster-admin role

The next command lines show how to grant a user the cluster-admin role and create a project where we’ll install Prometheus.

oc adm policy add-cluster-role-to-user cluster-admin <user_name>
oc new-project monitoring

End result of the lab

The aim of this lab is to deploy the following architecture.

Objects as YAML descriptors

As you can see, we need to define a Prometheus server ‘linked’ to a set of ServiceMonitors through a serviceMonitorSelector rule; in this case we’re interested in ServiceMonitors carrying the label k8s-app, no matter which value it contains.

Additionally, we’ll define a ServiceMonitor containing the required label k8s-app, which in turn will trigger the scanning of Services according to the rule defined in its selector section (matching the label team with value backend).

Finally, the port property in the endpoints section of our ServiceMonitor should match the port name defined in our target Service objects.
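To make the wiring explicit, here is a condensed sketch (not complete manifests) of the labels and names that must line up, using the same values we will use later in this lab:

# Prometheus: selects ServiceMonitors that carry a label with key "k8s-app"
kind: Prometheus
spec:
  serviceMonitorSelector:
    matchExpressions:
    - key: k8s-app
      operator: Exists
---
# ServiceMonitor: carries the k8s-app label and selects Services labelled team=backend
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: backend-monitor
spec:
  selector:
    matchLabels:
      team: backend
  endpoints:
  - port: web          # must match the port *name* in the Service
---
# Service: labelled team=backend and exposing a port named "web"
kind: Service
metadata:
  labels:
    team: backend
spec:
  ports:
  - name: web
    port: 8080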

Create a Prometheus subscription

Please go to the OpenShift Web console. Then go to the Cluster Console and open the Operators ➡ Catalog Resources menu on the left.

There we’ll create a Subscription, which is the way we manage the Prometheus Operator itself, not the Prometheus servers.

Make sure that the monitoring project we created before is selected before proceeding!

Now it’s time to create the Prometheus Operator subscription.

Please scroll down and click the Create Subscription button next to the Prometheus Operator.

Now you should be presented with a default/example subscription descriptor. Pay attention to the namespace: it should be monitoring.

Once you have checked the namespace, please click on Create.

Example subscription:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  generateName: prometheus-
  namespace: monitoring
spec:
  source: rh-operators
  name: prometheus
  startingCSV: prometheusoperator.0.22.2
  channel: preview

If everything goes as expected you should see something similar to this.
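If you prefer the command line, you can also confirm that the Subscription was created and resolved; a quick check (assuming the OLM tech-preview resource names) is:

$ oc get subscription -n monitoring
$ oc get csv -n monitoring

The second command should eventually list prometheusoperator.0.22.2 with a Succeeded status.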

Back in the web console, you should see the upgrade status as Up to date. If that is the case, click on the link pointed to by the arrow, which should take you to the Cluster Service Versions area (menu on the left).

You should be able to see the description of the Prometheus Operator and links to its documentation, along with a set of Create New commands, marked with a red arrow.

Good job, you have deployed the operator in the monitoring project. In fact, if you go to the OpenShift Application Console, in the monitoring project you should see one instance of the prometheus-operator, as in the next picture.
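If you prefer the CLI over the Application Console, a quick way to verify the operator deployment (a sketch, assuming the project and names used so far) is:

$ oc get deployment,pods -n monitoring

You should see a prometheus-operator deployment with one pod in Running state.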

Now let’s proceed with the deployment of the Prometheus server.

Deployment of the Prometheus Server

Go back to the Cluster Console, click on the Create New button and choose Prometheus. The next screen shows an example descriptor of a Prometheus server; go ahead and change metadata ➡ name to server as in the image and click Create.

Pay attention to section spec➡serviceMonitorSelector.

That is where we define the match expression to select which ServiceMonitors we’re interested in.

In this case we want ServiceMonitors that have a label whose key is k8s-app, no matter its value (operator Exists).

Also pay attention to spec ➡ replicas: if you go to the OpenShift Application Console you’ll find a StatefulSet called prometheus-server with exactly 2 replicas.

apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: server
  labels:
    prometheus: k8s
  namespace: monitoring
spec:
  replicas: 2
  version: v2.3.2
  serviceAccountName: prometheus-k8s
  securityContext: {}
  serviceMonitorSelector:
    matchExpressions:
    - key: k8s-app
      operator: Exists
  ruleSelector:
    matchLabels:
      role: prometheus-rulefiles
      prometheus: k8s
  alerting:
    alertmanagers:
    - namespace: monitoring
      name: alertmanager-main
      port: web

Deploying a test application with monitoring enabled

We’ve borrowed the following example from the Getting Started Guide of the Prometheus Operator. Please follow the next steps to deploy a test application (3 pods) that exposes Prometheus metrics, along with a Service that balances requests to the pods.

Let’s create a project for our application.

oc new-project monitored-apps

Let’s deploy the test application.

$ cat << EOF | oc create -n "monitored-apps" -f -
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: example-app
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: example-app
        image: fabxc/instrumented_app
        ports:
        - name: web
          containerPort: 8080
EOF

Let’s check the status of those 3 pods.

$ oc get pod -n "monitored-apps"
NAME                        READY   STATUS    RESTARTS   AGE
example-app-94c8bc8-jq5cr   1/1     Running   0          30s
example-app-94c8bc8-phfrv   1/1     Running   0          30s
example-app-94c8bc8-vfgr7   1/1     Running   0          30s

Now let’s create a Service object to balance requests to these pods.

Pay attention to spec ➡ ports ➡ name: as we explained before, it should match the value of spec ➡ endpoints ➡ port in the ServiceMonitor.

$ cat << EOF | oc create -n "monitored-apps" -f -
kind: Service
apiVersion: v1
metadata:
  name: example-app
  labels:
    app: example-app
    team: backend
spec:
  selector:
    app: example-app
  ports:
  - name: web
    port: 8080
EOF

Let’s create a ServiceMonitor to scan our test Service

Please go to the Cluster Console, to the Operators ➡ Cluster Service Versions area.

Click on Create New and select Service Monitor.

Remember that the project should be monitoring.

The next descriptor will deploy a ServiceMonitor which is compliant with the rule we defined in our Prometheus object, namely: having a label named k8s-app.

Attention: we are creating the ServiceMonitor in the same namespace as the Prometheus object.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: backend-monitor
  labels:
    k8s-app: backend-monitor
  namespace: monitoring
spec:
  namespaceSelector:
    any: true
  selector:
    matchLabels:
      team: backend
  endpoints:
  - interval: 30s
    port: web

TIP: namespaceSelector could also define exactly which namespaces you want to discover targets from:

spec:
  namespaceSelector:
    matchNames:
    - monitored-apps

Checking configuration

So far we have created a Prometheus server and a ServiceMonitor that points to a Service.

Now we should check if everything is fine or not.
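A quick first sanity check from the CLI is to confirm that the objects exist and that the labels we rely on are actually in place; a sketch, assuming the names used above:

$ oc get prometheus,servicemonitor -n monitoring --show-labels
$ oc get svc example-app -n monitored-apps --show-labels

The ServiceMonitor should carry the k8s-app label and the Service the team=backend label.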

We can also check the Prometheus server logs, but before we do that we need to locate one of the pods; the next command will help us here.

$ oc get pods -n monitoring
NAME                                   READY   STATUS    RESTARTS   AGE
prometheus-operator-7fccbd7c74-48m6v   1/1     Running   0          16h
prometheus-server-0                    3/3     Running   1          3h
prometheus-server-1                    3/3     Running   1          3h

Now that we know the name of the pods we’re looking for, we can read the logs.

The next command gets us the logs of the prometheus container in one of the target pods.

$ oc logs prometheus-server-0 -c prometheus -n monitoring
level=error ts=2019-02-12T10:57:12.739199828Z caller=main.go:218 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:289: Failed to list *v1.Pod: pods is forbidden: User "system:serviceaccount:monitoring:prometheus-k8s" cannot list pods in the namespace "monitored-apps": no RBAC policy matched"
level=error ts=2019-02-12T10:57:12.739190937Z caller=main.go:218 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:288: Failed to list *v1.Service: services is forbidden: User "system:serviceaccount:monitoring:prometheus-k8s" cannot list services in the namespace "monitored-apps": no RBAC policy matched"
level=error ts=2019-02-12T10:57:12.73929972Z caller=main.go:218 component=k8s_client_runtime err="github.com/prometheus/prometheus/discovery/kubernetes/kubernetes.go:287: Failed to list *v1.Endpoints: endpoints is forbidden: User "system:serviceaccount:monitoring:prometheus-k8s" cannot list endpoints in the namespace "monitored-apps": no RBAC policy matched"

Well… something is not ok… apparently the problem has to do with permissions over the namespace monitored-apps.

system:serviceaccount:monitoring:prometheus-k8s cannot list endpoints in the namespace "monitored-apps"

So what we have to do is grant the required permissions (view) to the Service Account created by the operator and used by Prometheus.

We could grant a cluster role to the service account so that it can monitor any namespace, as in the next command.

oc adm policy add-cluster-role-to-user view system:serviceaccount:monitoring:prometheus-k8s

Or we can add permissions on a per-namespace basis, as in the next one.

oc adm policy add-role-to-user view system:serviceaccount:monitoring:prometheus-k8s -n monitored-apps

Once you run one of the two commands, the error logs should stop appearing.
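Before re-checking the logs, you can also ask OpenShift who is allowed to list endpoints in the application namespace; the prometheus-k8s service account should now show up in the answer. For example:

$ oc adm policy who-can list endpoints -n monitored-apps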

Further checking would involve using the Prometheus console; in order to do that we first need to expose the Service, as in the next command.

$ oc get svc -n monitoring
NAME                  TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
prometheus-operated   ClusterIP   None         <none>        9090/TCP   13h

$ oc expose svc/prometheus-operated -n monitoring
route.route.openshift.io/prometheus-operated exposed

Now please open the URL returned by the next command and navigate to Status ➡ Targets.

$ oc get route -n monitoring
NAME                  HOST/PORT                                                                    PATH   SERVICES              PORT   TERMINATION   WILDCARD
prometheus-operated   prometheus-operated-monitoring.apps.serverless-8d48.openshiftworkshop.com           prometheus-operated   web                  None

You should see something like this.

There are three targets, one per pod.

Now if you navigate to Status ➡ Configuration you should be able to see that there’s a scrape_config entry per ServiceMonitor object. In our case we only have one, called backend-monitor, and the generated scrape_config is named monitoring/backend-monitor/0.

See it in action

Now that we’re sure that our target service is being monitored, we can go and see some graphs.

To do so, navigate to Graph and start typing codelab in the Expression text field. Then choose one of the available metrics, for instance codelab_api_http_requests_in_progress, and click on the Graph tab.
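If you prefer typing an expression directly, here are a couple of example queries based on the metric mentioned above (label names such as pod depend on the generated scrape configuration, so treat these as a sketch):

codelab_api_http_requests_in_progress
sum by (pod) (codelab_api_http_requests_in_progress)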

Congratulations, you’ve deployed Prometheus using the Operator Lifecycle Manager, deployed a service in a different namespace, tracked down a configuration error, fixed it, and finally checked that everything works… hopefully ;-)

Wrap up

This simple guide doesn’t go deep into the configuration of Prometheus, but it shows how easy it is to set up Prometheus using the Operator Lifecycle Manager. Remember, though, that the real payoff starts with day-2 operations, so what’s easy today should be easy tomorrow too.
