containers/podman.git - forked from: https://github.com/containers/podman.git

	Commit message (Collapse)	Author	Age
*	Fix race condition in kill test leading to hang	Ed Santiago	2019-12-28
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	When you open a FIFO for reading, but there's no writer, you hang. This is just one of those obscure UNIXisms we all know but just forget all too often. My last PR was guilty of introducing such a condition; I caught it by accident while testing other stuff. In short, the signal container was doing 'echo DONE' as its last step, and we (BATS) were reading the FIFO to check for it; but if the container exited before we opened the FIFO for read, the open would hang. This is not a hang that we can catch in the test: it would hang the entire job forever. CI would presumably time out eventually, but with no useful indication of the cause of the error. Solution: use 'exec' to open the FIFO early and keep it open, and use 'read -u FD' instead of 'read <$fifo': the former reads from an open FD, the latter forces a new open() each time. There is a shorter, more maintainable solution -- see #4755 -- but that suffers from the same hanging problem in the (unlikely) case where the signal-handling container exits, e.g. if signal handling is broken in podman. The test would hang, with no helpful indicator. Although this PR is a little more advanced scripting, I have commented the relevant code well and believe the maintenance cost is worth the risk of undebuggable hangs. There is still a hang risk: if 'podman logs -f' fails and exits immediately, the 'exec' will hang. I can't think of a non-racy way to prevent that, and choose to live with that risk. Tested by temporarily including 9 (SIGKILL) in the signals list. The read timeout triggers, and the end user has a fair chance of tracking down the root cause. Signed-off-by: Ed Santiago <santiago@redhat.com>
*	signal parsing - better input validation	Ed Santiago	2019-12-26
	The helper function we use for signal name mapping does not check for negative numbers nor invalid (too-high) ones. This can yield unexpected error messages: # podman kill -s -1 foo ERRO[0000] unknown signal "18446744073709551615" This PR introduces a small wrapper for it that: 1) Strips off a leading dash, allowing '-1' or '-HUP' as valid inputs; and 2) Rejects numbers <1 or >64 (SIGRTMAX) Also adds a test suite checking signal handling as well as ensuring that invalid signals are rejected by the command line. Fixes: #4746 Signed-off-by: Ed Santiago <santiago@redhat.com>