# Increase Storage Size of an OCP Logging Elasticsearch instance
This page describes how to increase the underlying storage size of the OpenShift Cluster Logging Elasticsearch instance.
> **Warning:** This is a disruptive operation! During the resize, the Elasticsearch cluster will experience reduced performance and new logs will be delayed.
## Starting situation
- You already have an OpenShift 4 cluster with Cluster Logging enabled.
- The Cluster Logging instance is managed and its log store is of type `elasticsearch`.
- You have admin-level access to the cluster.
- You want to increase the storage size of the Elasticsearch cluster.
## Prerequisites
- `kubectl`
- `curl`
- `jq`
- `yq` YAML processor (version 4 or higher)
- `commodore`, see Running Commodore
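Optionally, you can verify that the tools are available before you start. A minimal sanity check, assuming everything is on your PATH:

```bash
kubectl version --client
curl --version | head -n1
jq --version
yq --version        # should report v4.x or newer
commodore --version # if this fails, see Running Commodore
```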
## Prepare local environment
- Configure API access:

  ```bash
  export COMMODORE_API_URL=https://api.syn.vshn.net (1)

  # Set Project Syn cluster and tenant ID
  export CLUSTER_ID=<lieutenant-cluster-id> # Looks like: c-cluster-id-1234
  export TENANT_ID=$(curl -sH "Authorization: Bearer $(commodore fetch-token)" \
    ${COMMODORE_API_URL}/clusters/${CLUSTER_ID} | jq -r .tenant)
  ```

  (1) Replace with the API URL of the desired Lieutenant instance.
- Create a local directory to work in and compile the cluster catalog:

  ```bash
  export WORK_DIR=/path/to/work/dir
  mkdir -p "${WORK_DIR}"
  pushd "${WORK_DIR}"
  commodore catalog compile "${CLUSTER_ID}"
  ```

  We strongly recommend creating an empty directory, unless you already have a work directory for the cluster you're about to work on. This guide runs Commodore in the directory created in this step.
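Before proceeding, you can confirm that the variables resolved and the tenant repository was checked out. This is an optional check, not part of the original procedure:

```bash
echo "Cluster: ${CLUSTER_ID}"
echo "Tenant:  ${TENANT_ID}"           # should print the tenant ID, not 'null'
ls "inventory/classes/${TENANT_ID}/"   # tenant repo should be present after compilation
```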
## Increase PVC sizes
### Update Catalog
- Set the desired size:

  ```bash
  STORAGE_SIZE=250Gi (1)
  ```

  (1) Replace with the desired PVC size.
- Update the Commodore catalog:

  ```bash
  pushd "inventory/classes/${TENANT_ID}/"

  yq eval -i ".parameters.openshift4_logging.clusterLogging.logStore.elasticsearch.storage.size = \"${STORAGE_SIZE}\"" \
    ${CLUSTER_ID}.yml

  git commit -a -m "Set Elasticsearch backing storage to \"${STORAGE_SIZE}\" on ${CLUSTER_ID}"
  git push
  popd
  ```
- Compile the catalog:

  ```bash
  commodore catalog compile ${CLUSTER_ID} --push -i
  ```
### Increase PVC sizes
> **Note:** The Elasticsearch operator can't modify PVC storage sizes, so the steps below are performed manually.
- Patch the PVCs:

  ```bash
  pvcs=$(kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    get pvc \
    -l logging-cluster=elasticsearch \
    -o=name)

  while IFS= read -r pvc; do
    kubectl \
      --as=cluster-admin \
      -n openshift-logging \
      patch $pvc \
      --patch "$(yq eval -n ".spec.resources.requests.storage = \"${STORAGE_SIZE}\"")"
  done <<< "$pvcs"
  ```
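  Optionally, confirm the new requested size was recorded. Note that the volume only grows if the storage class has `allowVolumeExpansion: true`:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    get pvc \
    -l logging-cluster=elasticsearch \
    -o custom-columns='NAME:.metadata.name,REQUESTED:.spec.resources.requests.storage,ACTUAL:.status.capacity.storage'
  ```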
### Restart Elasticsearch Deployments
- Stop the operator from managing the cluster:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    patch clusterloggings/instance \
    --type=merge \
    -p '{"spec":{"managementState":"Unmanaged"}}'
  ```
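  Optionally, verify the management state:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    get clusterloggings/instance \
    -o jsonpath='{.spec.managementState}{"\n"}'
  # Expected output: Unmanaged
  ```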
- Scale down the Fluentd pods to stop sending logs to Elasticsearch:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    patch daemonset collector \
    -p '{"spec":{"template":{"spec":{"nodeSelector":{"logging-infra-fluentd": "false"}}}}}'
  ```
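  To avoid flushing while logs are still arriving, you can wait for the collector pods to terminate first. This is an optional check; the label selector is an assumption, adjust it to match your collector pods:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    get pods \
    -l component=collector \
    --watch
  ```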
- Perform a flush on all shards to ensure there are no pending operations waiting to be written to disk prior to shutting down:

  ```bash
  es_pod=$(kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    get pods \
    -l component=elasticsearch \
    -o name | head -n1)

  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    exec "${es_pod}" \
    -c elasticsearch \
    -- es_util --query="_flush/synced" -XPOST
  ```
  Example output:

  ```json
  {"_shards":{"total":4,"successful":4,"failed":0},".security":{"total":2,"successful":2,"failed":0},".kibana_1":{"total":2,"successful":2,"failed":0}}
  ```
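  If you prefer a machine-checkable result, you can pipe the response through `jq`. Re-running the synced flush is generally safe:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    exec "${es_pod}" \
    -c elasticsearch \
    -- es_util --query="_flush/synced" -XPOST | jq '._shards.failed'
  # Expected output: 0
  ```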
- Prevent shard balancing when purposely bringing down nodes:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    exec "${es_pod}" \
    -c elasticsearch \
    -- es_util --query="_cluster/settings" -XPUT \
    -d '{ "persistent": { "cluster.routing.allocation.enable" : "primaries" } }'
  ```
  Example output:

  ```json
  {"acknowledged":true,"persistent":{"cluster":{"routing":{"allocation":{"enable":"primaries"}}}},"transient":{}}
  ```
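  Optionally, read the setting back to confirm it was applied:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    exec "${es_pod}" \
    -c elasticsearch \
    -- es_util --query="_cluster/settings" | jq '.persistent.cluster.routing.allocation.enable'
  # Expected output: "primaries"
  ```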
- Find the Elasticsearch deployments:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    get deploy \
    -l component=elasticsearch
  ```

  Sample output:

  ```
  NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
  elasticsearch-cdm-7ya69va8-1   1/1     1            1           68d
  elasticsearch-cdm-7ya69va8-2   1/1     1            1           68d
  elasticsearch-cdm-7ya69va8-3   1/1     1            1           68d
  ```
- For each deployment, do the following (a scripted alternative is sketched at the end of this step):

  - Restart Elasticsearch:

    ```bash
    ES_DEPLOYMENT=elasticsearch-cdm-7ya69va8-1 (1)

    kubectl \
      --as=cluster-admin \
      -n openshift-logging \
      scale deploy/${ES_DEPLOYMENT} \
      --replicas=0

    # Verify the pod is removed
    kubectl \
      --as=cluster-admin \
      -n openshift-logging \
      get pods \
      | grep "${ES_DEPLOYMENT}-"

    kubectl \
      --as=cluster-admin \
      -n openshift-logging \
      scale deploy/${ES_DEPLOYMENT} \
      --replicas=1

    # Wait for the pod to become ready
    kubectl \
      --as=cluster-admin \
      -n openshift-logging \
      get pods \
      --watch
    ```

    (1) Replace with a deployment name found in the previous step.

  - Wait until the cluster becomes healthy again. Make sure the status is `green` or `yellow` before proceeding:

    ```bash
    es_pod=$(kubectl \
      --as=cluster-admin \
      -n openshift-logging \
      get pods \
      -l component=elasticsearch \
      -o name | head -n1)

    kubectl \
      --as=cluster-admin \
      -n openshift-logging \
      exec "${es_pod}" \
      -c elasticsearch \
      -- es_util '--query=_cluster/health?pretty=true' | jq '.status'
    ```
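  Alternatively, the restart-and-wait cycle can be scripted. The following is a minimal sketch, assuming the labels and the `es_util` health check shown above; review it before running, since it restarts nodes unattended:

  ```bash
  # Sketch: restart each Elasticsearch deployment in turn and wait for the
  # cluster to report green or yellow before moving on.
  for deploy in $(kubectl --as=cluster-admin -n openshift-logging \
      get deploy -l component=elasticsearch -o name); do
    kubectl --as=cluster-admin -n openshift-logging scale "${deploy}" --replicas=0
    kubectl --as=cluster-admin -n openshift-logging rollout status "${deploy}"
    kubectl --as=cluster-admin -n openshift-logging scale "${deploy}" --replicas=1
    kubectl --as=cluster-admin -n openshift-logging rollout status "${deploy}"

    # Poll cluster health through the first running Elasticsearch pod.
    while true; do
      es_pod=$(kubectl --as=cluster-admin -n openshift-logging \
        get pods -l component=elasticsearch -o name | head -n1)
      status=$(kubectl --as=cluster-admin -n openshift-logging \
        exec "${es_pod}" -c elasticsearch \
        -- es_util '--query=_cluster/health?pretty=true' 2>/dev/null | jq -r '.status')
      if [ "${status}" = "green" ] || [ "${status}" = "yellow" ]; then
        break
      fi
      sleep 10
    done
  done
  ```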
- Re-enable shard balancing:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    exec "${es_pod}" \
    -c elasticsearch \
    -- es_util --query="_cluster/settings" -XPUT \
    -d '{ "persistent": { "cluster.routing.allocation.enable" : "all" } }'
  ```
- Re-enable the operator:

  ```bash
  kubectl \
    --as=cluster-admin \
    -n openshift-logging \
    patch clusterloggings/instance \
    --type=merge \
    -p '{"spec":{"managementState":"Managed"}}'
  ```
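With the instance back in `Managed` state, the operator should reconcile the collector daemonset and the Elasticsearch resources on its own (this reconciliation behaviour is an assumption; verify it on your cluster). As a final optional check, confirm that the collector pods are scheduled again and that the cluster reports a healthy status:

```bash
kubectl --as=cluster-admin -n openshift-logging get pods

kubectl --as=cluster-admin -n openshift-logging \
  exec "${es_pod}" -c elasticsearch \
  -- es_util '--query=_cluster/health?pretty=true' | jq '.status'
# Expected output: "green"
```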