Infinispan monitoring in Kubernetes with Prometheus and Grafana
Infinispan is a Java-based open source cache and data grid solution that is used extensively in different frameworks like J2EE, Spring Boot and Quarkus, among others. It is compatible with different Java caching models, like JSR-107 and Spring Data.
It is very simple to set up a high availability cluster with different replication/distribution configurations, and thanks to the Kubernetes operator pattern, it is even easier on a Kubernetes distribution.
Prometheus is the de facto standard for metric scraping and short-term storage to identify potential issues. It is a graduated project of the Cloud Native Computing Foundation (CNCF).
Grafana is a very popular graphing tool that can create nice and easy-to-use dashboards. It is created and maintained by Grafana Labs.
We can see the architecture that we are going to test in the following picture.
Kubernetes initial setup
For this example, I am going to use a minikube installation on my Fedora machine. For DNS I use a dnsmasq domain configuration, but the ingress-dns addon can be used interchangeably. The ingress DNS wildcard selected from now on will be minikube.cloud. My dnsmasq.conf (172.17.0.2 comes from the minikube ip command) is as follows.
address=/.minikube.cloud/172.17.0.2
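To double-check that the wildcard resolves (assuming dnsmasq is already configured as your local resolver), you can run something like the following; it should print the minikube IP.
dig +short infinispan-minimal.minikube.cloud
172.17.0.2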
Deploying the Operator Lifecycle Manager
The Operator Lifecycle Manager (OLM) is part of an infrastructure that bundles an operator deployment, its CRDs and dependencies (usually RBAC objects) in a simple and easy way, and it is what powers OperatorHub.
At the time of writing this post, the current version is 0.15.1. To deploy it, execute the following with a cluster-admin user.
curl -sL https://github.com/operator-framework/operator-lifecycle-manager/releases/download/0.15.1/install.sh | bash -s 0.15.1
Or just execute minikube addons enable olm.
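As a quick check before continuing (assuming the default olm namespace), you can verify that the OLM pods (olm-operator, catalog-operator and packageserver) are running.
kubectl get pods -n olm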
Deploying the Infinispan Operator
We are using the Kubernetes operator. To deploy it, just create an infinispan-operator.yaml with the following content (from now on, replace the infinispan namespace if you choose a different one for this test).
apiVersion: v1
kind: Namespace
metadata:
  name: infinispan
---
apiVersion: operators.coreos.com/v1
kind: OperatorGroup
metadata:
  name: operatorgroup
  namespace: infinispan
spec:
  targetNamespaces:
    - infinispan
---
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: infinispan
  namespace: infinispan
spec:
  channel: stable
  name: infinispan
  source: operatorhubio-catalog
  sourceNamespace: olm
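Apply the manifest and wait for the operator to be installed. As a sanity check (the exact ClusterServiceVersion name may vary with the operator version), the CSV in the infinispan namespace should reach the Succeeded phase and the operator pod should be running.
kubectl create -f infinispan-operator.yaml
kubectl get csv -n infinispan
kubectl get pods -n infinispan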
Deploying an Infinispan Cluster using the operator
Note: This article uses Infinispan operator 1.1.2. Some important updates are expected for 2.0 that may change some details. Please check the version deployed on your installation.
The process described here is based on the official documentation for deploying the operator: the Infinispan operator documentation. If anything else is needed, refer to it as a reference.
The Infinispan operator ensures the cluster is secured at least with credentials. We are setting our own, but if you prefer the operator to create random ones, just don't set up the secret and it will be automatically populated in the deployment.
The Infinispan image looks for credentials structured in a file called identities.yaml. We can encode this file in base64 and pass it as data, or just add it as is in stringData. Then we can create the Infinispan Custom Resource (CR) instance to create the cluster. Create a file called infinispan-cluster.yaml.
apiVersion: v1
kind: Secret
metadata:
  name: infinispan-basic-auth
  namespace: infinispan
stringData:
  identities.yaml: |-
    credentials:
      - username: developer
        password: dIRs5cAAsHIeeRIL
      - username: operator
        password: uMBo9CmEdEduYk24
---
apiVersion: infinispan.org/v1
kind: Infinispan
metadata:
  name: infinispan-minimal
  namespace: infinispan
spec:
  image: infinispan/server:11.0
  replicas: 3
  service:
    type: DataGrid
  security:
    endpointSecretName: infinispan-basic-auth
kubectl create -f infinispan-cluster.yaml -n infinispan
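Before moving on, it is worth waiting until the cluster is fully formed. A minimal check, assuming your operator version exposes the wellFormed condition on the Infinispan CR (otherwise, just watch the pods until all three replicas are ready):
kubectl wait --for condition=wellFormed --timeout=240s infinispan/infinispan-minimal -n infinispan
kubectl get pods -n infinispan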
To check that the identities are correct, or to recover them in the future, we can execute the following.
kubectl get secret infinispan-basic-auth -n infinispan -o jsonpath="{.data.identities\.yaml}" | base64 -d
I recommend using the Hot Rod protocol to communicate from clients to servers, as it is optimized for the best performance: for example, clients keep the topology updated for faster store/retrieval in the case of distributed caches. However, to simplify this example, I am going to use the REST interface. I could do this with containers inside Kubernetes, or expose the service through load balancers or NodePorts defined in the operator, but as I am used to creating ingresses for that, I will do the same here. Create a file called infinispan-ingress.yaml.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: infinispan-minimal
  namespace: infinispan
  annotations:
    kubernetes.io/ingress.class: nginx
    nginx.ingress.kubernetes.io/affinity: cookie
spec:
  rules:
    - host: infinispan-minimal.minikube.cloud
      http:
        paths:
          - path: /
            backend:
              serviceName: infinispan-minimal
              servicePort: 11222
And create the ingress based on that file.
kubectl create -f infinispan-ingress.yaml
To double-check that the cluster is up and running, you may execute some REST API commands.
curl -X GET -u developer:dIRs5cAAsHIeeRIL -H "Content-Type: application/json" http://infinispan-minimal.minikube.cloud/rest/v2/server
The result should be the version details.
{"version":"Infinispan 'Corona Extra' 11.0.0.Final"}
Or you can go to http://infinispan-minimal.minikube.cloud/console/, and with the credentials above, you will see the Infinispan console.
Creating a Cache and adding some data
After deploying the cluster, Infinispan has no caches configured (except some internal ones, whose names are prefixed with ___). We are creating a sample cache named “fruits”, with the following configuration in default.json.
{
  "distributed-cache": {
    "mode": "SYNC",
    "owners": 1,
    "partition-handling": {
      "when-split": "ALLOW_READ_WRITES",
      "merge-policy": "REMOVE_ALL"
    },
    "transaction": {
      "mode": "NONE"
    },
    "memory": {
      "off-heap": {
        "size": 96468992,
        "eviction": "MEMORY",
        "strategy": "REMOVE"
      }
    },
    "statistics": true
  }
}
Then, create the cache.
curl -X POST -u developer:dIRs5cAAsHIeeRIL -H "Content-Type: application/json" http://infinispan-minimal.minikube.cloud/rest/v2/caches/fruits --data-binary "@default.json"
We can query the cache list.
curl -X GET -u developer:dIRs5cAAsHIeeRIL -H "Content-Type: application/json" http://infinispan-minimal.minikube.cloud/rest/v2/caches/
You can get some output like this.
["___protobuf_metadata","fruits","___script_cache"]
We can add some records to the cache, for example.
curl -X POST -u developer:dIRs5cAAsHIeeRIL -H "Content-Type: application/json" http://infinispan-minimal.minikube.cloud/rest/v2/caches/fruits/0 --data '{ "banana": 5 }'
And then we can query the API, or just go to the console, to see if the entry is there.
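For example, to read the entry back by its key over REST:
curl -X GET -u developer:dIRs5cAAsHIeeRIL http://infinispan-minimal.minikube.cloud/rest/v2/caches/fruits/0
{ "banana": 5 }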
Adding Prometheus Server using the operator
There are different alternatives to deploy and configure a Prometheus server and its rules to scrape the targets. I have chosen the CoreOS Prometheus operator, as it can be installed with OLM like I did previously with Infinispan.
As it is installed cluster-wide, the target namespace (operators) is already set up by the OLM installation. We only need one file, prometheus-operator.yaml (you may use https://operatorhub.io/install/prometheus.yaml if you prefer, but I like to change the name from my-prometheus to infinispan-prometheus).
apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: infinispan-prometheus
  namespace: operators
spec:
  channel: beta
  name: prometheus
  source: operatorhubio-catalog
  sourceNamespace: olm
kubectl create -f prometheus-operator.yaml
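As before, you can check that the subscription resolved correctly by looking at the ClusterServiceVersion and the operator pod in the operators namespace (exact names depend on the channel version).
kubectl get csv -n operators
kubectl get pods -n operators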
Now we need to create a Prometheus instance to scrape our Infinispan metrics.
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  labels:
    prometheus: k8s
  name: infinispan
  namespace: operators
spec:
  alerting:
    alertmanagers: []
  replicas: 1
  ruleSelector: {}
  securityContext: {}
  serviceAccountName: prometheus-k8s
  serviceMonitorSelector: {}
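Save this manifest, for example as prometheus-instance.yaml (the file name is my own choice here), and create it. The operator should then spin up a prometheus-infinispan-0 pod backed by the prometheus-operated service.
kubectl create -f prometheus-instance.yaml
kubectl get pods -n operators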
Good! Now we create a ServiceMonitor for our Infinispan cluster. We also need to add the credentials to access the secured endpoint. Just create a file called prometheus-service-monitor.yaml with the following content.
apiVersion: v1
stringData:
  username: developer
  password: dIRs5cAAsHIeeRIL
kind: Secret
metadata:
  name: prom-infinispan-basic-auth
  namespace: operators
type: Opaque
---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    k8s-app: prometheus
  name: infinispan-monitoring
  namespace: operators
spec:
  endpoints:
    - basicAuth:
        password:
          key: password
          name: prom-infinispan-basic-auth
        username:
          key: username
          name: prom-infinispan-basic-auth
      interval: 30s
      scheme: http
      targetPort: 11222
  namespaceSelector:
    matchNames:
      - infinispan
  selector:
    matchLabels:
      app: infinispan-service
      clusterName: infinispan-minimal
And create the objects in the cluster.
kubectl create -f prometheus-service-monitor.yaml
I recommend creating an ingress to the Prometheus server, so we can check that the endpoints are correctly scraped and that we are getting all the metrics we want. Use the following prometheus-ingress.yaml.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: prometheus
  namespace: operators
  annotations:
    kubernetes.io/ingress.class: nginx
spec:
  rules:
    - host: prometheus.minikube.cloud
      http:
        paths:
          - path: /
            backend:
              serviceName: prometheus-operated
              servicePort: web
And create it.
kubectl create -f prometheus-ingress.yaml
So you should see the targets as in the following picture.
And then you can try to query a metric from the new cache.
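If you prefer the command line, you can also hit the raw metrics endpoint that Prometheus scrapes, or use the Prometheus HTTP API; the exact Infinispan metric names depend on the server version, so grep is your friend here.
curl -u developer:dIRs5cAAsHIeeRIL http://infinispan-minimal.minikube.cloud/metrics | grep fruits
curl -s 'http://prometheus.minikube.cloud/api/v1/query?query=up'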
Deploying Grafana
To deploy Grafana on the Kubernetes cluster, we will use the official Helm chart from the stable repository. We can add the repo to Helm, or just use the remote bundle. We are using a values.yaml file like the one sketched below.
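A minimal sketch of such a values.yaml, assuming the Prometheus instance is reachable through the prometheus-operated service on port 9090 in the operators namespace (adjust the URL to your cluster):
datasources:
  datasources.yaml:
    apiVersion: 1
    datasources:
      - name: Prometheus
        type: prometheus
        access: proxy
        url: http://prometheus-operated.operators.svc:9090
        isDefault: true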
helm install infinispan-grafana grafana -f values.yaml --repo https://kubernetes-charts.storage.googleapis.com
With these values, you get the Prometheus datasource already configured. You may create your own dashboards, or just import the two samples that I created: a General one and a Detailed one templated for a cache.
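To log in and import them, you will need the admin password generated by the chart. It is stored in the release secret (the secret name follows the chart's naming convention, so it may differ if you pick another release name).
kubectl get secret infinispan-grafana -o jsonpath="{.data.admin-password}" | base64 -d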
With this setup, the General dashboard will look something like the following picture.
Code
All the code used in this post can be downloaded from this GitHub repo.