| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
the exit file
If the container exit code needs to be retained, it cannot be retained
in tmpfs, because libpod runs in a memcg itself so it can't leave
traces with a daemon-less design.
This wasn't a memleak detectable by kmemleak for example. The kernel
never lost track of the memory and there was no erroneous refcounting
either. The reference count dependencies however are not easy to track
because when a refcount is increased, there's no way to tell who's
still holding the reference. In this case it was a single page of
tmpfs pagecache holding a refcount that kept pinned a whole hierarchy
of dying memcg, slab kmem, cgropups, unrechable kernfs nodes and the
respective dentries and inodes. Such a problem wouldn't happen if the
exit file was stored in a regular filesystem because the pagecache
could be reclaimed in such case under memory pressure. The tmpfs page
can be swapped out, but that's not enough to release the memcg with
CONFIG_MEMCG_SWAP_ENABLED=y.
No amount of more aggressive kernel slab shrinking could have solved
this. Not even assigning slab kmem of dying cgroups to alive cgroup
would fully solve this. The only way to free the memory of a dying
cgroup when a struct page still references it, would be to loop over
all "struct page" in the kernel to find which one is associated with
the dying cgroup which is a O(N) operation (where N is the number of
pages and can reach billions). Linking all the tmpfs pages to the
memcg would cost less during memcg offlining, but it would waste lots
of memory and CPU globally. So this can't be optimized in the kernel.
A cronjob running this command can act as workaround and will allow
all slab cache to be released, not just the single tmpfs pages.
rm -f /run/libpod/exits/*
This patch solved the memleak with a reproducer, booting with
cgroup.memory=nokmem and with selinux disabled. The reason memcg kmem
and selinux were disabled for testing of this fix, is because kmem
greatly decreases the kernel effectiveness in reusing partial slab
objects. cgroup.memory=nokmem is strongly recommended at least for
workstation usage. selinux needs to be further analyzed because it
causes further slab allocations.
The upstream podman commit used for testing is
1fe2965e4f672674f7b66648e9973a0ed5434bb4 (v1.4.4).
The upstream kernel commit used for testing is
f16fea666898dbdd7812ce94068c76da3e3fcf1e (v5.2-rc6).
Reported-by: Michele Baldessari <michele@redhat.com>
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
<Applied with small tweaks to comments>
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
|\
| |
| | |
Build fix for 32-bit systems.
|
| |
| |
| |
| |
| |
| | |
* Fixes #3664.
Signed-off-by: Pete Johanson <peter@peterjohanson.com>
|
|\ \
| |/
|/| |
Update libpod.conf to be more friendly to NixOS
|
| |
| |
| |
| |
| |
| |
| |
| | |
NixOS links the current system state to `/run/current-system`, so we
have to add these paths to the configuration files as well to work out
of the box.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
|
|\ \
| | |
| | | |
Fix commit --changes env=X=Y
|
| |/
| |
| |
| | |
Signed-off-by: Jhon Honce <jhonce@redhat.com>
|
|\ \
| |/
|/|
| |
| | |
wking/fatal-requested-hook-directory-does-not-exist
libpod/container_internal: Make all errors loading explicitly configured hook dirs fatal
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
hook dirs fatal
Remove this IsNotExist out which was added along with the rest of this
block in f6a2b6bf2b (hooks: Add pre-create hooks for runtime-config
manipulation, 2018-11-19, #1830). Besides the obvious "hook directory
does not exist", it was swallowing the less-obvious "hook command does
not exist". And either way, folks are likely going to want non-zero
podman exits when we fail to load a hook directory they explicitly
pointed us towards.
Signed-off-by: W. Trevor King <wking@tremily.us>
|
|\ \
| | |
| | | |
podman: support --userns=ns|container
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
allow to join the user namespace of another container.
Closes: https://github.com/containers/libpod/issues/3629
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|/ /
| |
| |
| |
| |
| |
| | |
We now return an empty string for the `Comment` field if an OCI v1 image
contains no history.
Signed-off-by: Sascha Grunert <sgrunert@suse.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
We should not be fuzzy matching on volume names. Docker doesn't
do it, and it doesn't make much sense. Everything requires exact
matches for names - only IDs allow partial matches.
Fixes #3635
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
|\ \
| | |
| | | |
Fix a segfault on Podman no-store commands with refresh
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
When a command (like `ps`) requests no store be created, but also
requires a refresh be performed, we have to ignore its request
and initialize the store anyways to prevent segfaults. This work
was done in #3532, but that missed one thing - initializing a
storage service. Without the storage service, Podman will still
segfault. Fix that oversight here.
Fixes #3625
Signed-off-by: Matthew Heon <mheon@redhat.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Peter Hunt <pehunt@redhat.com>
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| | |
There's no way to get the error if we successfully get an exit code (as it's just printed to stderr instead).
instead of relying on the error to be passed to podman, and edit based on the error code, process it on the varlink side instead
Also move error codes to define package
Signed-off-by: Peter Hunt <pehunt@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
a PR slipped through without running the new linter. this cleans things
up for the master branch.
Signed-off-by: baude <bbaude@redhat.com>
|
|\ \
| | |
| | | |
golangci-lint phase 4
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
clean up some final linter issues and add a make target for
golangci-lint. in addition, begin running the tests are part of the
gating tasks in cirrus ci.
we cannot fully shift over to the new linter until we fix the image on
the openshift side. for short term, we will use both
Signed-off-by: baude <bbaude@redhat.com>
|
|/ /
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This includes:
Implement exec -i and fix some typos in description of -i docs
pass failed runtime status to caller
Add resize handling for a terminal connection
Customize exec systemd-cgroup slice
fix healthcheck
fix top
add --detach-keys
Implement podman-remote exec (jhonce)
* Cleanup some orphaned code (jhonce)
adapt remote exec for conmon exec (pehunt)
Fix healthcheck and exec to match docs
Introduce two new OCIRuntime errors to more comprehensively describe situations in which the runtime can error
Use these different errors in branching for exit code in healthcheck and exec
Set conmon to use new api version
Signed-off-by: Jhon Honce <jhonce@redhat.com>
Signed-off-by: Peter Hunt <pehunt@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
this is the third round of preparing to use the golangci-lint on our
code base.
Signed-off-by: baude <bbaude@redhat.com>
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Currently the pull message on failure is UGLY. This patch removes a lot of the noice
when pulling an image from multiple registries to make the user experience better.
Our current messages are way too verbose and need to be dampened down. Still has
verbose mode if you turn on log-level=debug.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
| |
| |
| |
| |
| |
| |
| | |
When removing --all images prune images only attempt to remove read/write images,
ignore read/only images
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|\ \
| | |
| | | |
Include changes to the container's root file-system in the checkpoint archive
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The newly added functionality to include the container's root
file-system changes into the checkpoint archive can now be explicitly
disabled. Either during checkpoint or during restore.
If a container changes a lot of files during its runtime it might be
more effective to migrated the root file-system changes in some other
way and to not needlessly increase the size of the checkpoint archive.
If a checkpoint archive does not contain the root file-system changes
information it will automatically be skipped. If the root file-system
changes are part of the checkpoint archive it is also possible to tell
Podman to ignore these changes.
Signed-off-by: Adrian Reber <areber@redhat.com>
|
| | |
| | |
| | |
| | | |
Signed-off-by: Adrian Reber <areber@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
One of the last limitations when migrating a container using Podman's
'podman container checkpoint --export=/path/to/archive.tar.gz' was
that it was necessary to manually handle changes to the container's root
file-system. The recommendation was to mount everything as --tmpfs where
the root file-system was changed.
This extends the checkpoint export functionality to also include all
changes to the root file-system in the checkpoint archive. The
checkpoint archive now includes a tarstream of the result from 'podman
diff'. This tarstream will be applied to the restored container before
restoring the container.
With this any container can now be migrated, even it there are changes
to the root file-system.
There was some discussion before implementing this to base the root
file-system migration on 'podman commit', but it seemed wrong to do
a 'podman commit' before the migration as that would change the parent
layer the restored container is referencing. Probably not really a
problem, but it would have meant that a migrated container will always
reference another storage top layer than it used to reference during
initial creation.
Signed-off-by: Adrian Reber <areber@redhat.com>
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
The newly added function GetDiffTarStream() mirrors the GetDiff()
function. It tries to get the correct layer ID from getLayerID()
and it filters out containerMounts from the tarstream. Thus the
behavior is the same as GetDiff(), but it returns a tarstream.
This also adds the function ApplyDiffTarStream() to apply the tarstream
generated by GetDiffTarStream().
These functions are targeted to support container migration with
root file-system changes.
Signed-off-by: Adrian Reber <areber@redhat.com>
|
|\ \ \
| | | |
| | | | |
Remove exec PID files after use to prevent memory leaks
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We have another patch running to do the same for exit files, with
a much more in-depth explanation of why it's necessary. Suffice
to say that persistent files in tmpfs tied to container CGroups
lead to significant memory allocations that last for the lifetime
of the file.
Based on a patch by Andrea Arcangeli (aarcange@redhat.com).
Signed-off-by: Matthew Heon <mheon@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
We can infer no-new-privileges. For now, manually populate
seccomp (can't infer what file we sourced from) and
SELinux/Apparmor (hard to tell if they're enabled or not).
Signed-off-by: Matthew Heon <mheon@redhat.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
Our previous method (just read the PID that we spawned) doesn't
work - Conmon double-forks to daemonize, so we end up with a PID
pointing to the first process, which dies almost immediately.
Reading from the PID file gets us the real PID.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
When we first began writing Podman, we ran into a major issue
when implementing Inspect. Libpod deliberately does not tie its
internal data structures to Docker, and stores most information
about containers encoded within the OCI spec. However, Podman
must present a CLI compatible with Docker, which means it must
expose all the information in 'docker inspect' - most of which is
not contained in the OCI spec or libpod's Config struct.
Our solution at the time was the create artifact. We JSON'd the
complete CreateConfig (a parsed form of the CLI arguments to
'podman run') and stored it with the container, restoring it when
we needed to run commands that required the extra info.
Over the past month, I've been looking more at Inspect, and
refactored large portions of it into Libpod - generating them
from what we know about the OCI config and libpod's (now much
expanded, versus previously) container configuration. This path
comes close to completing the process, moving the last part of
inspect into libpod and removing the need for the create
artifact.
This improves libpod's compatability with non-Podman containers.
We no longer require an arbitrarily-formatted JSON blob to be
present to run inspect.
Fixes: #3500
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
An image with "HEALTHCHECK CMD ['']" is valid but as there is no command
defined the healthcheck will fail. Reject such a configuration.
Fixes #3507
Signed-off-by: Stefan Becker <chemobejk@gmail.com>
|
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
- remove duplicate check, already called in HealthCheck()
- reject zero-length command list and empty command string as errorneous
- support all Docker command list keywords: NONE, CMD or CMD-SHELL
- use Docker default "/bin/sh -c" for CMD-SHELL
Fixes #3507
Signed-off-by: Stefan Becker <chemobejk@gmail.com>
|
|\ \ \ \
| | | | |
| | | | | |
Ensure we have a valid store when we refresh
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Fixes #3520
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
| |/ / /
|/| | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
with debug output.
Added \n char to specific standard output
Signed-off-by: dom finn <dom.finn00@gmail.com>
|
|\ \ \ \
| | | | |
| | | | | |
Fix a bug where ctrs could not be removed from pods
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
Using pod removal worked, but container removal was missing the
most critical step - the actual removal. Must have been
accidentally removed during a refactor.
Fixes #3556
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
|\ \ \ \ \
| |_|_|/ /
|/| | | | |
golangci-lint pass number 2
|
| | | | |
| | | | |
| | | | |
| | | | |
| | | | |
| | | | | |
clean up and prepare to migrate to the golangci-linter
Signed-off-by: baude <bbaude@redhat.com>
|
|\ \ \ \ \
| |/ / / /
|/| | | | |
Correctly set FinishedTime for checkpointed container
|
| | |/ /
| |/| |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | |
| | | | |
During 'podman container checkpoint' the finished time was not set. This
resulted in a strange container status after checkpointing:
Exited (0) 292 years ago
During checkpointing FinishedTime is now set to time.now().
Signed-off-by: Adrian Reber <areber@redhat.com>
|
|\ \ \ \
| |_|/ /
|/| | | |
first pass of corrections for golangci-lint
|
| |/ /
| | |
| | |
| | | |
Signed-off-by: baude <bbaude@redhat.com>
|
|/ /
| |
| |
| |
| |
| | |
fix a regression introduced by 1d36501f961889f554daf3c696fe95443ef211b6
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|\ \
| | |
| | | |
healthcheck: support rootless mode
|
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | |
| | | |
now that dbus authentication works fine from a user namespace (systemd
241 works fine), we can enable rootless healthchecks.
It uses "systemd-run --user" for creating the healthcheck timer and
communicates with the user instance of systemd listening at
$XDG_RUNTIME_DIR/systemd/private.
Closes: https://github.com/containers/libpod/issues/3523
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|