Cirrus: Fix elevator workaround multi-cloud support

In order to support execution on various non-GCP cloud environments, the BFQ scheduler workaround needs updating. Previously it assumed the root disk was always `/dev/sda`. With the addition of new clouds (AWS) and different environment types, the assumption is not always valid. Update the workaround to take care in looking up the block device where '/' comes from.

Also update the scheduler to 'none', as all modern clouds already have highly optimized underlying storage configurations. There's no reason to complicate I/O paths further by hard-coding specific scheduler(s) for all environment types.

Signed-off-by: Chris Evich <cevich@redhat.com>
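
The device lookup described in the message can be sketched roughly as follows. This is a minimal illustration, not the committed code: the function names `disk_for_mountpoint` and `sched_node_for_disk` are hypothetical, and the sketch assumes `findmnt` and `lsblk` from util-linux are installed. It prints nothing when the mountpoint is not backed by a block device (for example an overlay root inside a container).

```shell
# Hypothetical helpers; these names do not appear in setup_environment.sh.

# Print the parent disk (e.g. /dev/nvme0n1) of the partition backing the
# mountpoint given as $1. Prints nothing if it cannot be determined.
disk_for_mountpoint() {
    # --nofsroot keeps btrfs from appending the subvolume to the source name
    part=$(findmnt --canonicalize --noheadings --nofsroot \
                   --output source --mountpoint "$1" 2>/dev/null) || return 0
    [ -b "$part" ] || return 0   # not a block device (overlay, tmpfs, ...)
    lsblk --noheadings --output pkname --paths "$part" 2>/dev/null
    return 0
}

# Map a disk path to its sysfs I/O scheduler node.
sched_node_for_disk() {
    echo "/sys/block/$(basename "$1")/queue/scheduler"
}

sched_node_for_disk /dev/sda       # /sys/block/sda/queue/scheduler
sched_node_for_disk /dev/nvme0n1   # /sys/block/nvme0n1/queue/scheduler

# The result depends on the host, so no expected output is shown here.
rootdisk=$(disk_for_mountpoint /)
if [ -n "$rootdisk" ]; then
    echo "root disk: $rootdisk"
fi
```

Deriving the sysfs path from the discovered disk, rather than hard-coding `sda`, is what lets the same logic cover NVMe-style device names.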
author: Chris Evich <cevich@redhat.com> 2022-06-13 12:12:13 -0400
committer: Chris Evich <cevich@redhat.com> 2022-07-01 11:25:47 -0400
commit: f58d7dbdab7076d1ff83375fdbc010445261c0cd (patch)
tree: fd9ca0b4542e31a9927006c27732f5338adc1642
parent: b00e65aa9c071428579a55f91a92f3702765ed85 (diff)
download: podman-f58d7dbdab7076d1ff83375fdbc010445261c0cd.tar.gz
podman-f58d7dbdab7076d1ff83375fdbc010445261c0cd.tar.bz2
podman-f58d7dbdab7076d1ff83375fdbc010445261c0cd.zip
1 files changed, 22 insertions, 5 deletions
diff --git a/contrib/cirrus/setup_environment.sh b/contrib/cirrus/setup_environment.sh
index 9bd35bd06..fa0a40991 100755
--- a/contrib/cirrus/setup_environment.sh
+++ b/contrib/cirrus/setup_environment.sh
@@ -99,11 +99,28 @@ esac
 if ((CONTAINER==0)); then  # Not yet running inside a container
     # Discovered reemergence of BFQ scheduler bug in kernel 5.8.12-200
     # which causes a kernel panic when system is under heavy I/O load.
-    # Previously discovered in F32beta and confirmed fixed. It's been
-    # observed in F31 kernels as well.  Deploy workaround for all VMs
-    # to ensure a more stable I/O scheduler (elevator).
-    echo "mq-deadline" > /sys/block/sda/queue/scheduler
-    warn "I/O scheduler: $(cat /sys/block/sda/queue/scheduler)"
+    # Disable the I/O scheduler (a.k.a. elevator) for all environments,
+    # leaving optimization up to underlying storage infrastructure.
+    testfs="/"  # mountpoint that experiences the most I/O during testing
+    msg "Querying block device owning partition hosting the '$testfs' filesystem"
+    # Need --nofsroot b/c btrfs appends subvolume label to `source` name
+    testdev=$(findmnt --canonicalize --noheadings --nofsroot \
+              --output source --mountpoint $testfs)
+    msg "    found partition: '$testdev'"
+    testdisk=$(lsblk --noheadings --output pkname --paths $testdev)
+    msg "    found block dev: '$testdisk'"
+    testsched="/sys/block/$(basename $testdisk)/queue/scheduler"
+    if [[ -n "$testdev" ]] && [[ -n "$testdisk" ]] && [[ -e "$testsched" ]]; then
+        msg "    Found active I/O scheduler: $(cat $testsched)"
+        if [[ ! "$(<$testsched)" =~ \[none\]  ]]; then
+            msg "    Disabling elevator for '$testsched'"
+            echo "none" > "$testsched"
+        else
+            msg "    Elevator already disabled"
+        fi
+    else
+        warn "Sys node for elevator doesn't exist: '$testsched'"
+    fi
 fi
 
 # Which distribution are we testing on.
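
The hunk above tests for `[none]` because the kernel marks the active scheduler in the sysfs `scheduler` file with square brackets. A small sketch of extracting that name, exercised against a temporary file rather than the real sysfs node, might look like this. The helper name `active_sched` is hypothetical and not part of the commit.

```shell
# Hypothetical helper: print the active scheduler from a sysfs-style
# scheduler file, where the active entry is wrapped in [brackets].
# Prints nothing if no bracketed entry is present.
active_sched() {
    line=$(cat "$1") || return 1
    case "$line" in
        *\[*\]*)
            rest=${line#*\[}        # drop everything through the first '['
            echo "${rest%%\]*}"     # drop from the first ']' onward
            ;;
    esac
}

# Demonstrate against a temporary file instead of /sys/block/*/queue/scheduler.
demo=$(mktemp)
echo "mq-deadline kyber [bfq] none" > "$demo"
active_sched "$demo"    # prints: bfq
rm -f "$demo"
```

Comparing the extracted name against `none` is equivalent to the regex match in the diff, and it makes the current scheduler available for logging.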