Commit message (Author, Age)
* container stop: kill conmon (Valentin Rothberg, 2019-08-05)
| | | | | | | | | | | | | Old versions of conmon have a bug where they create the exit file before closing open file descriptors causing a race condition when restarting containers with open ports since we cannot bind the ports as they're not yet closed by conmon. Killing the old conmon PID is ~okay since it forces the FDs of old conmons to be closed, while it's a NOP for newer versions which should have exited already. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
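The cleanup step described in this commit could be sketched in Go roughly as follows. This is an illustration only, not the code from the commit: the function name `killStaleConmon` and its `pid` parameter are hypothetical, and the sketch assumes a Unix-like system where `syscall.Kill` is available. A process that has already exited (the case for newer conmon versions) is treated as success.

```go
package main

import (
	"errors"
	"syscall"
)

// killStaleConmon (hypothetical helper) sends SIGKILL to a leftover conmon
// process so that any file descriptors it still holds are closed.
// If the process is already gone, the call is a no-op.
func killStaleConmon(pid int) error {
	// Guard against pid <= 0: kill(2) gives those values a special meaning
	// (process groups or every process), which is never wanted here.
	if pid <= 0 {
		return nil
	}
	err := syscall.Kill(pid, syscall.SIGKILL)
	if err == nil || errors.Is(err, syscall.ESRCH) {
		// ESRCH means "no such process": conmon has already exited.
		return nil
	}
	return err
}
```

Ignoring `ESRCH` is what makes the call harmless for conmon versions that exit on their own before the container is restarted.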
* Merge pull request #3720 from baude/honorconfiginuser (OpenShift Merge Robot, 2019-08-05)
|\ | | | | honor libpod.conf in /usr/share/containers
| * honor libpod.conf in /usr/share/containers (baude, 2019-08-04)
|/ | | | | | | | | | | we should be looking for the libpod.conf file in /usr/share/containers and not in /usr/local. packages of podman should drop the default libpod.conf in /usr/share. the override remains /etc/containers/ as well. Fixes: #3702 Signed-off-by: baude <bbaude@redhat.com>
* Merge pull request #3717 from rhatdan/errors (OpenShift Merge Robot, 2019-08-04)
|\ | | | | Don't log errors to the screen when XDG_RUNTIME_DIR is not set
| * Don't log errors to the screen when XDG_RUNTIME_DIR is not set (Daniel J Walsh, 2019-08-04)
|/ | | | | | | | Drop errors to debug level when trying to set up the runtime tmpdir. If the tool cannot set up a runtime dir, it will error out with a correct message; there is no need to put errors on the screen when the tool actually succeeds. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Merge pull request #3707 from haircommander/no-errorf (OpenShift Merge Robot, 2019-08-03)
|\ | | | | Add handling for empty LogDriver
| * Add handling for empty LogDriver (Peter Hunt, 2019-08-02)
|/ | | | | | | | There are two cases in which logdriver can be empty: it wasn't set by libpod, or the user passed --log-driver "". The latter case is an odd one, and the former is very possible and already handled for LogPath. Instead of printing an error for an entirely reasonable codepath, let's suppress the error. Signed-off-by: Peter Hunt <pehunt@redhat.com>
* Merge pull request #3695 from edsantiago/bats_hang_fix (OpenShift Merge Robot, 2019-08-02)
|\ | | | | System tests: resolve hang in rawhide rootless
| * System tests: resolve hang in rawhide rootless (Ed Santiago, 2019-08-01)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fedora CI tests are failing on rawhide under kernel 5.3.0-0.rc1.git3.1.fc31 (rhbz#1736758). But there's another insidious failure, a 4-hour hang in the rootless tests on the same CI system. The culprit line is in the podman build test, but it's actually BATS itself that hangs, not the build command -- which suggests that it's the usual FD 3 problem (see BATS README). It would seem that podman is forking a process that inherits fd 3 but that process is not getting cleaned up when podman crashes upon encountering the kernel bug. Today it's podman build, tomorrow it might be something else. Let's just run all podman invocations in run_podman with a non-bats FD 3. Signed-off-by: Ed Santiago <santiago@redhat.com>
* | Merge pull request #3692 from haircommander/play-caps (OpenShift Merge Robot, 2019-08-02)
|\ \ | | | | | | Add Capability support to play kube
| * | Add capability functionality to play kube (Peter Hunt, 2019-08-01)
| | | | | | | | | | | | | | | | | | | | | Take capabilities written in a kube and add them to a container. Adapt the test suite and write cap-add/drop tests. Signed-off-by: Peter Hunt <pehunt@redhat.com>
| * | Deduplicate capabilities in generate kube (Peter Hunt, 2019-08-01)
| | | | | | | | | | | | | | | | | | Capabilities that were added and dropped were duplicated several times. Fix this. Signed-off-by: Peter Hunt <pehunt@redhat.com>
* | | Merge pull request #3676 from fzoske/fix-typo (Valentin Rothberg, 2019-08-02)
|\ \ \ | | | | | | | | Fix typo
| * | | Fix typo (Fabian Zoske, 2019-08-01)
| | | | | | | | | | | | | | | | Signed-off-by: Fabian Zoske <git@fzoske.de>
* | | | Merge pull request #3551 from mheon/fix_memory_leak (OpenShift Merge Robot, 2019-08-02)
|\ \ \ \ | |_|_|/ |/| | | Fix memory leak with exit files
| * | | Use "none" instead of "null" for the null eventer (Matthew Heon, 2019-08-01)
| | | | | | | | | | | | | | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Pass on events-backend config to cleanup processes (Matthew Heon, 2019-08-01)
| | | | | | | | | | | | | | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Ensure we generate a 'stopped' event on force-remove (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When forcibly removing a container, we are initiating an explicit stop of the container, which is not reflected in 'podman events'. Swap to using our standard 'stop()' function instead of a custom one for force-remove, and move the event into the internal stop function (so internal calls also register it). This does add one more database save() to `podman remove`. This should not be a terribly serious performance hit, and does have the desirable side effect of making things generally safer. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Fix Dockerfile - a dependency's name was changed (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | System events are valid, don't error on them (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | | | | | | | | | The logfile driver was not aware that system events existed. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Do not use an events backend when restoring images (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Expose Null eventer and allow its use in the Podman CLI (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We need this specifically for tests, but others may find it useful if they don't explicitly need events and don't want the performance implications of using them. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Force tests to use file backend for events (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Podman-in-podman (and possibly ubuntu) have "issues" with journald. Let's just use file instead to be safe. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Add a flag to set events logger type (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Fix test suite (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | Retrieve exit codes for containers via events (Matthew Heon, 2019-07-31)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As we previously removed our exit code retrieval code to stop a memory leak, we need a new way of doing this. Fortunately, events is able to do the job for us. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * | | podman: fix memleak caused by renaming and not deleting the exit file (Matthew Heon, 2019-07-31)
| | | |
| | | | If the container exit code needs to be retained, it cannot be retained in tmpfs, because libpod runs in a memcg itself so it can't leave traces with a daemon-less design.
| | | |
| | | | This wasn't a memleak detectable by kmemleak for example. The kernel never lost track of the memory and there was no erroneous refcounting either. The reference count dependencies however are not easy to track because when a refcount is increased, there's no way to tell who's still holding the reference. In this case it was a single page of tmpfs pagecache holding a refcount that kept pinned a whole hierarchy of dying memcg, slab kmem, cgroups, unreachable kernfs nodes and the respective dentries and inodes. Such a problem wouldn't happen if the exit file was stored in a regular filesystem because the pagecache could be reclaimed in such case under memory pressure. The tmpfs page can be swapped out, but that's not enough to release the memcg with CONFIG_MEMCG_SWAP_ENABLED=y.
| | | |
| | | | No amount of more aggressive kernel slab shrinking could have solved this. Not even assigning slab kmem of dying cgroups to alive cgroup would fully solve this. The only way to free the memory of a dying cgroup when a struct page still references it would be to loop over all "struct page" in the kernel to find which one is associated with the dying cgroup, which is an O(N) operation (where N is the number of pages and can reach billions). Linking all the tmpfs pages to the memcg would cost less during memcg offlining, but it would waste lots of memory and CPU globally. So this can't be optimized in the kernel.
| | | |
| | | | A cronjob running this command can act as a workaround and will allow all slab cache to be released, not just the single tmpfs pages.
| | | |
| | | |     rm -f /run/libpod/exits/*
| | | |
| | | | This patch solved the memleak with a reproducer, booting with cgroup.memory=nokmem and with selinux disabled. The reason memcg kmem and selinux were disabled for testing of this fix is that kmem greatly decreases the kernel effectiveness in reusing partial slab objects. cgroup.memory=nokmem is strongly recommended at least for workstation usage. selinux needs to be further analyzed because it causes further slab allocations.
| | | |
| | | | The upstream podman commit used for testing is 1fe2965e4f672674f7b66648e9973a0ed5434bb4 (v1.4.4). The upstream kernel commit used for testing is f16fea666898dbdd7812ce94068c76da3e3fcf1e (v5.2-rc6).
| | | |
| | | | Reported-by: Michele Baldessari <michele@redhat.com>
| | | | Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
| | | | <Applied with small tweaks to comments>
| | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* | | | Merge pull request #3693 from QiWang19/search (OpenShift Merge Robot, 2019-08-02)
|\ \ \ \ | | | | | | | | | | fix search output limit
| * | | | fix search output limit (Qi Wang, 2019-08-01)
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Closes https://bugzilla.redhat.com/show_bug.cgi?id=1732280. From the bug report: podman search returns 25 results even when `--limit` is larger than 25 (maxQueries); the reporter wants podman to return `--limit` results. This PR fixes the number of results returned: if --limit is not set, return MIN(maxQueries, len(res)); if --limit is set, return MIN(option, len(res)). Signed-off-by: Qi Wang <qiwan@redhat.com>
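The two rules stated in this commit message can be written as one small function. The sketch below is illustrative rather than the code from the PR: the function name `limitResults` is hypothetical, and representing "--limit not set" as `0` is an assumption made here for simplicity. The constant 25 is the `maxQueries` value quoted in the message.

```go
package main

// maxQueries is the default cap mentioned in the commit message.
const maxQueries = 25

// limitResults (hypothetical name) returns how many search results to print.
// limit == 0 stands for "--limit was not set" (an assumption of this sketch).
func limitResults(limit, found int) int {
	resultCap := maxQueries
	if limit > 0 {
		// An explicit --limit replaces the default cap, even when it is larger.
		resultCap = limit
	}
	if found < resultCap {
		return found
	}
	return resultCap
}
```

For example, with 40 matching results, no `--limit` yields 25 and `--limit 30` yields 30; with only 10 matching results, both cases yield 10.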
* | | | Merge pull request #3458 from rhatdan/volume (OpenShift Merge Robot, 2019-08-01)
|\ \ \ \ | | | | | | | | | | Use buildah/pkg/parse volume parsing rather than internal version
| * | | | Use buildah/pkg/parse volume parsing rather than internal version (Daniel J Walsh, 2019-08-01)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We share this code with buildah, so we should eliminate the podman version. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* | | | | Merge pull request #3688 from mheon/print_pod (OpenShift Merge Robot, 2019-08-01)
|\ \ \ \ \ | |_|_|_|/ |/| | | | Print Pod ID in `podman inspect` output
| * | | | Print Pod ID in `podman inspect` output (Matthew Heon, 2019-08-01)
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | Somehow this managed to slip through the cracks, but this is definitely something inspect should print. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* | | | Merge pull request #3686 from vrothberg/rawhide-builds (OpenShift Merge Robot, 2019-08-01)
|\ \ \ \ | | | | | | | | | | go build: use `-mod=vendor` for go >= 1.11.x
| * | | | go build: use `-mod=vendor` for go >= 1.11.x (Valentin Rothberg, 2019-08-01)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Go 1.13.x isn't sensitive to the GO111MODULE environment variable causing builds to not use the vendored sources in ./vendor. Force builds of module-supporting go versions to use the vendored sources by setting -mod=vendor. Verified in a fedora:rawhide container. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
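The rule "pass `-mod=vendor` for go >= 1.11" can be illustrated with a small version check. The commit applies this in the project's build setup; the Go sketch below only demonstrates the decision itself. The function names `supportsModules` and `vendorFlags` are hypothetical, and the input is assumed to look like the strings reported by `go version` or `runtime.Version()`, such as `go1.13` or `go1.11rc1`.

```go
package main

import (
	"strconv"
	"strings"
)

// leadingDigits returns the run of decimal digits at the start of s,
// so that a component such as "11rc1" is read as "11".
func leadingDigits(s string) string {
	end := 0
	for end < len(s) && s[end] >= '0' && s[end] <= '9' {
		end++
	}
	return s[:end]
}

// supportsModules (hypothetical name) reports whether a version string
// such as "go1.13" denotes Go 1.11 or newer. Unparseable input yields false.
func supportsModules(version string) bool {
	parts := strings.SplitN(strings.TrimPrefix(version, "go"), ".", 3)
	if len(parts) < 2 {
		return false
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return false
	}
	minor, err := strconv.Atoi(leadingDigits(parts[1]))
	if err != nil {
		return false
	}
	return major > 1 || (major == 1 && minor >= 11)
}

// vendorFlags returns the extra build flag for module-aware toolchains.
func vendorFlags(version string) []string {
	if supportsModules(version) {
		return []string{"-mod=vendor"}
	}
	return nil
}
```

Passing the flag explicitly means the build does not depend on how a given toolchain interprets `GO111MODULE`.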
* | | | | Merge pull request #3341 from rhatdan/exit (OpenShift Merge Robot, 2019-08-01)
|\ \ \ \ \ | |/ / / / |/| | | | Add new exit codes to rm & rmi for running containers & dependencies
| * | | | Add new exit codes to rm & rmi for running containers & dependencies (Daniel J Walsh, 2019-08-01)
| |/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This enables programs and scripts wrapping the podman command to handle 'podman rm' and 'podman rmi' failures caused by paused or running containers or due to images having other child images or dependent containers. These errors are common enough that it makes sense to have a more machine readable way of detecting them than parsing the standard error output. Signed-off-by: Ondrej Zoder <ozoder@redhat.com> Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* | | | Merge pull request #3675 from rhatdan/storage (OpenShift Merge Robot, 2019-08-01)
|\ \ \ \ | | | | | | | | | | Vendor in containers/storage v1.12.16
| * | | | github.com/containers/storage v1.12.13 (Daniel J Walsh, 2019-08-01)
| | |/ / | |/| | | | | | | | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* | | | Merge pull request #3677 from giuseppe/systemd-cgroupsv2 (OpenShift Merge Robot, 2019-08-01)
|\ \ \ \ | | | | | | | | | | systemd, cgroupsv2: not bind mount /sys/fs/cgroup/systemd
| * | | | systemd, cgroupsv2: not bind mount /sys/fs/cgroup/systemd (Giuseppe Scrivano, 2019-08-01)
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | when running on a cgroups v2 system, do not bind mount the named hierarchy /sys/fs/cgroup/systemd as it doesn't exist anymore. Instead bind mount the entire /sys/fs/cgroup. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
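The choice this commit describes can be sketched as a detection step plus a path selection. This is a hedged illustration, not the code from the PR: the function names `isCgroupsV2` and `systemdCgroupSource` are hypothetical, and detecting cgroups v2 by checking the filesystem type of `/sys/fs/cgroup` is one common technique assumed here (it relies on Linux and on `syscall.Statfs`).

```go
package main

import "syscall"

// cgroup2SuperMagic is the cgroup2 filesystem magic number
// (CGROUP2_SUPER_MAGIC in the kernel's linux/magic.h).
const cgroup2SuperMagic = 0x63677270

// isCgroupsV2 (hypothetical name) reports whether /sys/fs/cgroup is a
// cgroup2 mount, i.e. the system uses the unified hierarchy.
func isCgroupsV2() (bool, error) {
	var st syscall.Statfs_t
	if err := syscall.Statfs("/sys/fs/cgroup", &st); err != nil {
		return false, err
	}
	return st.Type == cgroup2SuperMagic, nil
}

// systemdCgroupSource (hypothetical name) picks the path to bind mount
// into a systemd container: the whole hierarchy on cgroups v2, where the
// named "systemd" hierarchy no longer exists, and that named hierarchy
// on cgroups v1.
func systemdCgroupSource(cgroupsV2 bool) string {
	if cgroupsV2 {
		return "/sys/fs/cgroup"
	}
	return "/sys/fs/cgroup/systemd"
}
```

Keeping the path selection separate from the detection makes the selection easy to test without depending on the host's cgroup setup.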
* | | | Merge pull request #3681 from vrothberg/tests-check-errors (Daniel J Walsh, 2019-08-01)
|\ \ \ \ | | | | | | | | | | e2e test: check exit codes for pull, save, inspect
| * | | | e2e test: check exit codes for pull, save, inspect (Valentin Rothberg, 2019-07-31)
| | |_|/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | Check the exit codes of pull, save and inspect to avoid masking those errors. We've hit a case where a corrupted/broken image has been pulled which then surfaced for some tests later. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* | | | Merge pull request #3671 from openSUSE/runtime-path-discovery (OpenShift Merge Robot, 2019-08-01)
|\ \ \ \ | |_|_|/ |/| | | Add runtime and conmon path discovery
| * | | Add runtime and conmon path discovery (Sascha Grunert, 2019-08-01)
| | |/ | |/| | | | | | | | | | | | | | | | | | | The `$PATH` environment variable will now be used as a fallback if no valid runtime or conmon path matches. The debug logs have been updated to state the executable used. Signed-off-by: Sascha Grunert <sgrunert@suse.com>
* | | Merge pull request #3573 from rhatdan/vendor (Daniel J Walsh, 2019-08-01)
|\ \ \ | |/ / |/| | Vendor in latest buildah code
| * | Vendor in buildah 1.9.2 (Daniel J Walsh, 2019-07-30)
| |/ | | | | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* | Merge pull request #3682 from cevich/fix_release_rerun (OpenShift Merge Robot, 2019-07-31)
|\ \ | |/ |/| Cirrus: Fix re-run of release task into no-op.
| * Cirrus: Fix release dependencies (Chris Evich, 2019-07-31)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The release-task ***must*** always execute last, in order to guarantee a consistent cache of release archives from dependent tasks. It accomplishes this by verifying that its task number matches one less than the total number of tasks. Previous to this commit, a YAML anchor/alias was used to avoid duplication of the dependency list between 'success' and 'release'. However, it's been observed that this opens the possibility for 'release' and 'success' tasks to race when running on a PR. Because YAML anchor/aliases cannot be used to modify lists, duplication is required to make 'release' actually depend upon 'success'. This duplication will introduce an additional maintenance burden, though when adding a new task it's already very easy to forget to update the 'depends_on' list. Assist both cases by adding unit tests that verify the ``.cirrus.yml`` dependency contents and structure. Signed-off-by: Chris Evich <cevich@redhat.com>
| * Cirrus: Fix re-run of release task into no-op. (Chris Evich, 2019-07-31)
|/ | | | | | | | | | This task depends upon other tasks caching their binaries. If for whatever reason the `release` task is re-run and/or is out-of-order with its dependents, the state of the cache will be undefined. Previously this would result in an error and a failure of the release task. This commit alters this behavior to issue a warning instead. Signed-off-by: Chris Evich <cevich@redhat.com>