Advanced Gluster Operations

Temporarily disable monitoring for volumes

nagios-plugins-glusterfs skips checks if a file in /var/lib/glusterd/monitoring-inhibit has an mtime in the future.
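In other words, any file in that directory whose mtime lies in the future suppresses the checks. A minimal sketch of that logic (assumed behaviour, the actual plugin may differ in detail):

now=$(date +%s)
for f in /var/lib/glusterd/monitoring-inhibit/*; do
    [ -e "$f" ] || continue
    # a file with an mtime in the future inhibits the checks
    if [ "$(stat -c %Y "$f")" -gt "$now" ]; then
        echo "OK: checks inhibited by $f"
        exit 0
    fi
done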

ansible mungg_gluster_server -m file -a 'path=/var/lib/glusterd/monitoring-inhibit state=directory'
ansible mungg_gluster_server -m shell -a 'touch -d "now + 1 hour" /var/lib/glusterd/monitoring-inhibit/manual'
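To verify that the inhibit file exists and has a timestamp in the future:

ansible mungg_gluster_server -m shell -a 'stat -c "%y %n" /var/lib/glusterd/monitoring-inhibit/manual'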

Remove manual inhibit:

ansible mungg_gluster_server -m file -a 'path=/var/lib/glusterd/monitoring-inhibit/manual state=absent'

Gluster client logs

You can find the Gluster client logs on OpenShift nodes here:

ls -l /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/
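Each mounted volume gets its own log file below that directory. To find logs written to within the last hour (depth and naming may differ per version, adjust the glob as needed):

find /var/lib/origin/openshift.local.volumes/plugins/kubernetes.io/glusterfs/ -name '*.log' -mmin -60 -ls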

Gluster FUSE client state dump

Upstream documentation:

# Determine PID of the Gluster FUSE client
# (there is one glusterfs process per mount, pick the one for the mount in question)
glpid=$(pgrep -f glusterfs | head -n1)

kill -USR1 "$glpid"

less "/var/run/gluster/glusterdump.$glpid.dump."*

Gluster volume list is not accurate

Run puppet from infra node on all Gluster servers:

ansible mungg_gluster_server -m puppet
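To cross-check the result, diff the live volume list against the volumes defined in the inventory (inventory file and key path as in the jq example further down; run each half where the respective data is available):

gluster volume list | sort > /tmp/volumes-live
yaml2json infra2.yaml | jq -r '.["profile_openshift3::ansible_master::host_groups"]["mungg_gluster_server"]["vars"]["mungg_gluster_volumes"] | keys[]' | sort > /tmp/volumes-inventory
diff /tmp/volumes-live /tmp/volumes-inventory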

Replace a brick

Do this only if there is no other way to get back to stable operation.

gluster --mode=script volume remove-brick $name replica 2 $storage_node:/data/$name/brick force

Check if you have enough storage:

df -h /data/$name/

If you have enough storage:

mv /data/$name/brick /data/$name/brick-old

If you don’t have enough storage:

rm -rf /data/$name/brick

Add a new brick:

gluster --mode=script volume add-brick $name replica 3 $storage_node:/data/$name/brick
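After the new brick has been added, the self-heal daemon copies the data onto it again. Watch the progress with:

gluster volume heal $name info

If the heal does not start on its own, gluster volume heal $name full triggers a full heal.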

Gluster options not managed by mungg-gluster-volume

Check for gluster volume options not managed by mungg-gluster-volume:

gluster volume list | xargs -I{} gluster volume info {} | egrep -v 'Type|Volume ID|Status|Snapshot Count|Transport-type|Brick' | egrep -v 'performance.io-cache|performance.open-behind|performance.quick-read|performance.read-ahead|performance.readdir-ahead|performance.stat-prefetch|performance.strict-o-direct|performance.client-io-threads|client.event-threads|server.event-threads|performance.write-behind'

Gluster options not managed by mungg-gluster-volume plus global settings

gluster volume list | xargs -I{} gluster volume info {} | egrep -v 'Type|Volume ID|Status|Snapshot Count|Transport-type|Brick' | egrep -v 'performance.io-cache|performance.open-behind|performance.quick-read|performance.read-ahead|performance.readdir-ahead|performance.stat-prefetch|performance.strict-o-direct|performance.client-io-threads|client.event-threads|server.event-threads|performance.write-behind' | egrep -v 'transport.address-family|storage.fips-mode-rchecksum|nfs.disable|cluster.max-bricks-per-process|cluster.brick-multiplex|features.quota|features.inode-quota'

Example:

Volume Name: gluster-pv26
Options Reconfigured:
cluster.self-heal-daemon: enable
storage.build-pgfid: off
network.ping-timeout: 4

Find PVCs that likely contain a database

This relies on PVC names only; it could be improved by searching for paths/files on the volume, but for now it is good enough.

oc get pvc --all-namespaces | grep -vE 'logging-es|gluster-database|gp2|aws-efs' | awk '{print $4" "$1"/"$2}' | grep -E 'alertmanager|cassandra|database|db|logging|mariadb|mongo|mongodb|mongodbstatefulset|mysql|mysqldb|pgsql|postgres|postgresql|prometheus|rabbitmq|redis|solr|tsdb|vault|consul'
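A rough sketch of the file-based approach mentioned above, run on a storage node (the marker files are assumptions for a few common databases, extend as needed):

for d in /data/*/brick; do
    # marker files for MySQL/MariaDB, PostgreSQL, MongoDB and Redis
    find "$d" -maxdepth 2 \( -name ibdata1 -o -name PG_VERSION -o -name mongod.lock -o -name dump.rdb \) 2>/dev/null | grep -q . && echo "$d looks like a database volume"
done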

Find Gluster volumes in the inventory file with a specific storage class

yaml2json infra2.yaml | jq -r '.["profile_openshift3::ansible_master::host_groups"]["mungg_gluster_server"]["vars"]["mungg_gluster_volumes"] | to_entries[] | select(.value.storage_class=="gluster-database" or .value.storage_class=="bulk-gluster-database") | .key' | sort

Delete Gluster Volume

In normal cases, use mungg-gluster-volumes. If you have to do it manually, be aware of what you are doing and make sure you delete the correct PV!
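To check which Gluster volume a PV points to before deleting it (for glusterfs PVs the volume name appears as the path field in the spec):

oc get pv $name -o yaml | grep -A3 'glusterfs:'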

On the master:

oc delete pv $name

On one storage node:

gluster --mode=script volume stop $name
gluster --mode=script volume delete $name

On all storage nodes:

vim /etc/fstab    # remove the entry for /data/$name
umount /data/$name
rmdir /data/$name
lvremove -f vgxyz/$name
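Afterwards, check that nothing is left behind (all three commands should produce no output):

grep "$name" /etc/fstab
mount | grep "$name"
lvs | grep "$name"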

Check for configured gluster volume options on all gluster clusters

gluster volume get always returns a value, even for options that are still at their default; gluster volume info only shows options that have been set explicitly.
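For example (volume and option name just for illustration):

gluster volume get gluster-pv26 cluster.self-heal-daemon           # always prints a value, default or not
gluster volume info gluster-pv26 | grep cluster.self-heal-daemon   # only prints output if the option was set explicitly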

ansible -i ansible-gluster-inventory gluster -m shell -b -a "gluster volume list | xargs -I{} gluster volume info {} | egrep 'features.lock-heal|features.grace-timeout'"

Check for gluster volumes with a specific volume option set

for v in $(gluster volume list); do
  gluster volume info "$v" | grep -q "cluster.self-heal-daemon" && echo "$v"
done

Reset a specific volume option on all gluster volumes where it has been set

for v in $(gluster volume list); do
  gluster volume info "$v" | grep -q "cluster.self-heal-daemon" && gluster volume reset "$v" "cluster.self-heal-daemon"
done