| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix a race in journald driver. Following the logs implies streaming
until the container is dead. Streaming happened in one goroutine,
waiting for the container to exit/die and signaling that event happened
in another goroutine.
The nature of having two goroutines running simultaneously is pretty
much the core of the race condition. When the streaming goroutines
received the signal that the container has exitted, the routine may not
have read and written all of the container's logs.
Fix this race by reading both, the logs and the events, of the container
and stop streaming when the died/exited event has been read. The died
event is guaranteed to be after all logs in the journal which guarantees
not only consistencty but also a deterministic behavior.
Note that the journald log driver now requires the journald event
backend to be set.
Fixes: #10323
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial version of libimage changed the order of layers which has
now been restored to remain backwards compatible.
Further changes:
* Fix a bug in the journald logging which requires to strip trailing
new lines from the message. The system tests did not pass due to
empty new lines. Triggered by changing the default logger to
journald in containers/common.
* Fix another bug in the journald logging which embedded the container
ID inside the message rather than the specifid field. That surfaced
in a preceeding whitespace of each log line which broke the system
tests.
* Alter the system tests to make sure that the k8s-file and the
journald logging drivers are executed.
* A number of e2e tests have been changed to force the k8s-file driver
to make them pass when running inside a root container.
* Increase the timeout in a kill test which seems to take longer now.
Reasons are unknown. Tests passed earlier and no signal-related
changes happend. It may be CI VM flake since some system tests but
other flaked.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
- images test: add test for 'table' and '\t' formatting
- image mount test: check output from 'umount', test
repeat umount (NOP), and test invalid-umount
- kill test: remove kludgy workaround for crun signal bug
ref: #5004 -- code is no longer needed (fingers crossed),
and the workaround involved pulling an expensive image.
- selinux test: add new tests for shared context in:
* pods , w/ and w/o infra container (ref: #7902)
* containers with namespace sharing: --ipc, --pid, --net
- selinux test: new test for --pid=host (disabled pending
propagation of container-selinux-2.146, ref: #7939)
Signed-off-by: Ed Santiago <santiago@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
Invert the branch logic to match the comment. Docker seems to wait for
the container while Podman does not.
Enable the remote-disabled system test as well.
Fixes: #7135
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
podman-remote is in better shape now. Let's see what needs
to be done to reenable remote system tests.
- logs test: skip multilog, it doesn't work remote
- diff test: use -l only when local, not with remote
- many other tests: skip_if_remote, with 'FIXME: pending #xxxx'
where xxxx is a filed issue.
Unrelated: added new helper to skip_if_remote and _if_rootless,
where we check if the source message includes "remote"/"rootless"
and insert it if missing. This is a minor usability enhancement
to make it easier to understand at-a-glance why a skip triggers.
Signed-off-by: Ed Santiago <santiago@redhat.com>
|
|
|
|
| |
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
9f69c4eca (part of the f31 pr, #3091) semi-broke the kill test,
there's now an ugly warning:
setup(): removing stray images quay.io/libpod/fedora-minimal:latest 7bb5a60e8a78
The comments also didn't actually explain the problem
being addressed, and included a misleading reference
to busybox.
Here we switch to using fedora-minimal only with podman-remote,
clean it up (rmi) when finished, and include an explanation in
the comments about why this is needed; making it clear that
this workaround can be removed once we get rid of podman-remote.
We also reformat back to 80 columns.
Signed-off-by: Ed Santiago <santiago@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
It's possible/likely the container image for the test will need to be
pulled as part of the `run` command. Due to the way BATS handles
output, messages regarding image-pull could be misinterpreted as the
container's CID. Force the CID to be obtained by only the last line of
output.
Signed-off-by: Chris Evich <cevich@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Recommended as part of:
https://github.com/containers/libpod/issues/5004
and
https://github.com/containers/crun/issues/230
Signed-off-by: Chris Evich <cevich@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Add pkg/signal to deal with parts of signal processing and translating
signals from string to numeric representations. The code has been
copied from docker/docker (and attributed with the copyright) but been
reduced to only what libpod needs (on Linux).
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When you open a FIFO for reading, but there's no writer, you hang.
This is just one of those obscure UNIXisms we all know but just
forget all too often.
My last PR was guilty of introducing such a condition; I caught
it by accident while testing other stuff. In short, the signal
container was doing 'echo DONE' as its last step, and we (BATS)
were reading the FIFO to check for it; but if the container
exited before we opened the FIFO for read, the open would hang.
This is not a hang that we can catch in the test: it would hang
the entire job forever. CI would presumably time out eventually,
but with no useful indication of the cause of the error.
Solution: use 'exec' to open the FIFO early and keep it open,
and use 'read -u FD' instead of 'read <$fifo': the former
reads from an open FD, the latter forces a new open() each time.
There is a shorter, more maintainable solution -- see #4755 -- but
that suffers from the same hanging problem in the (unlikely) case
where the signal-handling container exits, e.g. if signal handling
is broken in podman. The test would hang, with no helpful indicator.
Although this PR is a little more advanced scripting, I have
commented the relevant code well and believe the maintenance
cost is worth the risk of undebuggable hangs.
There is still a hang risk: if 'podman logs -f' fails and exits
immediately, the 'exec' will hang. I can't think of a non-racy
way to prevent that, and choose to live with that risk.
Tested by temporarily including 9 (SIGKILL) in the signals list.
The read timeout triggers, and the end user has a fair chance
of tracking down the root cause.
Signed-off-by: Ed Santiago <santiago@redhat.com>
|
|
The helper function we use for signal name mapping does not
check for negative numbers nor invalid (too-high) ones. This
can yield unexpected error messages:
# podman kill -s -1 foo
ERRO[0000] unknown signal "18446744073709551615"
This PR introduces a small wrapper for it that:
1) Strips off a leading dash, allowing '-1' or '-HUP'
as valid inputs; and
2) Rejects numbers <1 or >64 (SIGRTMAX)
Also adds a test suite checking signal handling as well as
ensuring that invalid signals are rejected by the command line.
Fixes: #4746
Signed-off-by: Ed Santiago <santiago@redhat.com>
|