summaryrefslogtreecommitdiff
path: root/libpod/runtime_pod_linux.go
Commit message (Collapse)AuthorAge
* Introduce graph-based pod container removalMatthew Heon2022-09-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Originally, during pod removal, we locked every container in the pod at once, did a number of validity checks to ensure everything was safe, and then removed all the containers in the pod. A deadlock was recently discovered with this approach. In brief, we cannot lock the entire pod (or much more than a single container at a time) without causing a deadlock. As such, we converted to an approach where we just looped over each container in the pod, removing them individually. Unfortunately, this removed a lot of the validity checking of the earlier approach, allowing for a lot of unintended bad things. Infra containers could be removed while containers in the pod still depended on them, for example. There's no easy way to do validity checks while in a simple loop, so I implemented a version of our graph-traversal logic that currently handles pod start. This version acts in the reverse order of startup: startup starts from containers which depend on nothing and moves outwards, while removal acts on containers which have nothing depend on them and moves inwards. By doing graph traversal, we can guarantee that nothing is removed while something that depends on it still exists - so the infra container should be the last thing in a pod that is removed, for example. In the (unlikely) case that a graph of the pod's containers cannot be built (most likely impossible without database editing) the old method of pod removal has been retained to ensure that even misbehaving pods can be forcibly evicted from the state. I'm fairly confident that this resolves the problem, but there are a lot of assumptions around dependency structure built into the original pod removal code and I am not 100% sure I have captured all of them. Fixes #15526 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Fix stuttersDaniel J Walsh2022-09-10
| | | | | | | | | | | | | | Podman adds an Error: to every error message. So starting an error message with "error" ends up being reported to the user as Error: error ... This patch removes the stutter. Also ioutil.ReadFile errors report the Path, so wrapping the err message with the path causes a stutter. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Merge pull request #14976 from giuseppe/do-not-lock-containers-pod-rmOpenShift Merge Robot2022-07-22
|\ | | | | libpod: do not lock all containers on pod rm
| * libpod: do not lock all containers on pod rmGiuseppe Scrivano2022-07-21
| | | | | | | | | | | | | | | | | | | | | | | | do not attempt to lock all containers on pod rm since it can cause deadlocks when other podman cleanup processes are attempting to lock the same containers in a different order. [NO NEW TESTS NEEDED] Closes: https://github.com/containers/podman/issues/14929 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | resource limits for podsCharlie Doern2022-07-21
|/ | | | | | | | | | | | | | | | | | added the following flags and handling for podman pod create --memory-swap --cpuset-mems --device-read-bps --device-write-bps --blkio-weight --blkio-weight-device --cpu-shares given the new backend for systemd in c/common, all of these can now be exposed to pod create. most of the heavy lifting (nearly all) is done within c/common. However, some rewiring needed to be done here as well! Signed-off-by: Charlie Doern <cdoern@redhat.com>
* libpod/runtime: switch to golang native error wrappingSascha Grunert2022-07-04
| | | | | | | | | We now use the golang error wrapping format specifier `%w` instead of the deprecated github.com/pkg/errors package. [NO NEW TESTS NEEDED] Signed-off-by: Sascha Grunert <sgrunert@redhat.com>
* only create crgoup when not rootless if using cgroupfsCharlie Doern2022-06-28
| | | | | | | [NO NEW TESTS NEEDED] now that podman's cgroup config tries to initialize controllers, cgroupfs errors out on pod creation we need to mimic the behavior that used to exist and only create the cgroup when running as rootful Signed-off-by: Charlie Doern <cdoern@redhat.com>
* Merge pull request #14713 from Luap99/volume-pluginopenshift-ci[bot]2022-06-27
|\ | | | | add podman volume reload to sync volume plugins
| * add podman volume reload to sync volume pluginsPaul Holzinger2022-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Libpod requires that all volumes are stored in the libpod db. Because volume plugins can be created outside of podman, it will not show all available plugins. This podman volume reload command allows users to sync the libpod db with their external volume plugins. All new volumes from the plugin are also created in the libpod db and when a volume from the db no longer exists it will be removed if possible. There are some problems: - naming conflicts, in this case we only use the first volume we found. This is not deterministic. - race conditions, we have no control over the volume plugins. It is possible that the volumes changed while we run this command. Fixes #14207 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | podman cgroup enhancementcdoern2022-06-24
|/ | | | | | | | | | | currently, setting any sort of resource limit in a pod does nothing. With the newly refactored creation process in c/common, podman ca now set resources at a pod level meaning that resource related flags can now be exposed to podman pod create. cgroupfs and systemd are both supported with varying completion. cgroupfs is a much simpler process and one that is virtually complete for all resource types, the flags now just need to be added. systemd on the other hand has to be handeled via the dbus api meaning that the limits need to be passed as recognized properties to systemd. The properties added so far are the ones that podman pod create supports as well as `cpuset-mems` as this will be the next flag I work on. Signed-off-by: Charlie Doern <cdoern@redhat.com>
* play kube: service containerValentin Rothberg2022-05-12
| | | | | | | | | | | | | | | | | | | | Add the notion of a "service container" to play kube. A service container is started before the pods in play kube and is (reverse) linked to them. The service container is stopped/removed *after* all pods it is associated with are stopped/removed. In other words, a service container tracks the entire life cycle of a service started via `podman play kube`. This is required to enable `play kube` in a systemd unit file. The service container is only used when the `--service-container` flag is set on the CLI. This flag has been marked as hidden as it is not meant to be used outside the context of `play kube`. It is further not supported on the remote client. The wiring with systemd will be done in a later commit. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
* libpod: unlock containers when removing podGiuseppe Scrivano2022-04-29
| | | | | | | | | | | | | | | | | It solves a race where a container cleanup process launched because of the container process exiting normally would hang. It also solves a problem when running as rootless on cgroup v1 since it is not possible to force pids.max = 1 on conmon to limit spawning the cleanup process. Partially copied from https://github.com/containers/podman/pull/13403 Related to: https://github.com/containers/podman/issues/14057 [NO NEW TESTS NEEDED] it doesn't add any new functionality Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Merge pull request #13398 from giuseppe/fix-warning-pod-create-rmOpenShift Merge Robot2022-03-22
|\ | | | | libpod: drop warning if cgroup doesn't exist
| * libpod: drop warning if cgroup doesn't existGiuseppe Scrivano2022-03-02
| | | | | | | | | | | | | | | | | | | | do not print a warning on cgroup removal if it doesn't exist. Closes: https://github.com/containers/podman/issues/13382 [NO NEW TESTS NEEDED] Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | libpod: pods do not use cgroups if --cgroups=disabledGiuseppe Scrivano2022-03-03
|/ | | | | | | | | do not attempt to use cgroups with pods if the cgroups are disabled. A similar check is already in place for containers. Closes: https://github.com/containers/podman/issues/13411 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Remove the runtime lockMatthew Heon2022-02-22
| | | | | | | | | | | | | | | | This primarily served to protect us against shutting down the Libpod runtime while operations (like creating a container) were happening. However, it was very inconsistently implemented (a lot of our longer-lived functions, like pulling images, just didn't implement it at all...) and I'm not sure how much we really care about this very-specific error case? Removing it also removes a lot of potential deadlocks, which is nice. [NO NEW TESTS NEEDED] Signed-off-by: Matthew Heon <mheon@redhat.com>
* bump go module to version 4Valentin Rothberg2022-01-18
| | | | | | | | | | | | | Automated for .go files via gomove [1]: `gomove github.com/containers/podman/v3 github.com/containers/podman/v4` Remaining files via vgrep [2]: `vgrep github.com/containers/podman/v3` [1] https://github.com/KSubedi/gomove [2] https://github.com/vrothberg/vgrep Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Standardize on capatalized CgroupsDaniel J Walsh2022-01-14
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Update vendor or containers/common moving pkg/cgroups thereDaniel J Walsh2021-12-07
| | | | | | | [NO NEW TESTS NEEDED] This is just moving pkg/cgroups out so existing tests should be fine. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* pod/container create: resolve conflicts of generated namesValentin Rothberg2021-11-08
| | | | | | | | | | | Address the TOCTOU when generating random names by having at most 10 attempts to assign a random name when creating a pod or container. [NO TESTS NEEDED] since I do not know a way to force a conflict with randomly generated names in a reasonable time frame. Fixes: #11735 Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Remove infra ID from DB before removing containersMatthew Heon2021-10-20
| | | | | | | | | | | | | | | If we interrupt pod removal between removing containers and removing the whole pod, the infra ID was still in the DB, and most pod operations would try to retrieve the infra container (and would this fail). Clear the infra ID from the DB just before we remove all containers to prevent this. Fixes #12034 [NO NEW TESTS NEEDED] This is a very narrow race and I have no idea how to repro it. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Merge pull request #11851 from cdoern/podRmOpenShift Merge Robot2021-10-20
|\ | | | | Pod Rm Infra Handling Improvements
| * Pod Rm Infra Improvementscdoern2021-10-18
| | | | | | | | | | | | | | | | Made changes so that if the pod contains all exited containers and only infra is running, remove the pod. resolves #11713 Signed-off-by: cdoern <cdoern@redhat.com>
* | Add --time out for podman * rm -f commandsDaniel J Walsh2021-10-04
|/ | | | | | | | | Add --time flag to podman container rm Add --time flag to podman pod rm Add --time flag to podman volume rm Add --time flag to podman network rm Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* standardize logrus messages to upper caseDaniel J Walsh2021-09-22
| | | | | | | | Remove ERROR: Error stutter from logrus messages also. [ NO TESTS NEEDED] This is just code cleanup. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* InfraContainer Reworkcdoern2021-08-26
| | | | | | | | | | InfraContainer should go through the same creation process as regular containers. This change was from the cmd level down, involving new container CLI opts and specgen creating functions. What now happens is that both container and pod cli options are populated in cmd and used to create a podSpecgen and a containerSpecgen. The process then goes as follows FillOutSpecGen (infra) -> MapSpec (podOpts -> infraOpts) -> PodCreate -> MakePod -> createPodOptions -> NewPod -> CompleteSpec (infra) -> MakeContainer -> NewContainer -> newContainer -> AddInfra (to pod state) Signed-off-by: cdoern <cdoern@redhat.com>
* podman pod create --pid flagcdoern2021-07-15
| | | | | | | | added support for --pid flag. User can specify ns:file, pod, private, or host. container returns an error since you cannot point the ns of the pods infra container to a container outside of the pod. Signed-off-by: cdoern <cdoern@redhat.com>
* cgroup: fix rootless --cgroup-parent with podsGiuseppe Scrivano2021-05-06
| | | | | | | | | | extend to pods the existing check whether the cgroup is usable when running as rootless with cgroupfs. commit 17ce567c6827abdcd517699bc07e82ccf48f7619 introduced the regression. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* bump go module to v3Valentin Rothberg2021-02-22
| | | | | | | | | We missed bumping the go module, so let's do it now :) * Automated go code with github.com/sirkon/go-imports-rename * Manually via `vgrep podman/v2` the rest Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Drop default log-level from error to warnDaniel J Walsh2020-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our users are missing certain warning messages that would make debugging issues with Podman easier. For example if you do a podman build with a Containerfile that contains the SHELL directive, the Derective is silently ignored. If you run with the log-level warn you get a warning message explainging what happened. $ podman build --no-cache -f /tmp/Containerfile1 /tmp/ STEP 1: FROM ubi8 STEP 2: SHELL ["/bin/bash", "-c"] STEP 3: COMMIT --> 7a207be102a 7a207be102aa8993eceb32802e6ceb9d2603ceed9dee0fee341df63e6300882e $ podman --log-level=warn build --no-cache -f /tmp/Containerfile1 /tmp/ STEP 1: FROM ubi8 STEP 2: SHELL ["/bin/bash", "-c"] STEP 3: COMMIT WARN[0000] SHELL is not supported for OCI image format, [/bin/bash -c] will be ignored. Must use `docker` format --> 7bd96fd25b9 7bd96fd25b9f755d8a045e31187e406cf889dcf3799357ec906e90767613e95f These messages will no longer be lost, when we default to WARNing level. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Switch all references to github.com/containers/libpod -> podmanDaniel J Walsh2020-07-28
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* move go module to v2Valentin Rothberg2020-07-06
| | | | | | | | | | | | | | | With the advent of Podman 2.0.0 we crossed the magical barrier of go modules. While we were able to continue importing all packages inside of the project, the project could not be vendored anymore from the outside. Move the go module to new major version and change all imports to `github.com/containers/libpod/v2`. The renaming of the imports was done via `gomove` [1]. [1] https://github.com/KSubedi/gomove Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Turn on More lintersDaniel J Walsh2020-06-15
| | | | | | | | | - misspell - prealloc - unparam - nakedret Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Fix errors found in coverity scanDaniel J Walsh2020-05-01
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Add support for containers.confDaniel J Walsh2020-03-27
| | | | | | | vendor in c/common config pkg for containers.conf Signed-off-by: Qi Wang qiwan@redhat.com Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Only modify conmon cgroup if we have running containersMatthew Heon2020-02-06
| | | | | | | | | | | | | If there are no running containers - for example, if the pod was just created - the cgroup in question may not exist (under certain circumstances that we're not 100% sure about). However, regardless, we don't need to set a PID limit, as nothing will be making cleanup processes (no running conmon processes), so not changing the cgroup is safe regardless. Fixes #5072 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Deprecate & remove IsCtrSpecific in favor of IsAnonMatthew Heon2020-01-29
| | | | | | | | | | | | | | | | | | | In Podman 1.6.3, we added support for anonymous volumes - fixing our old, broken support for named volumes that were created with containers. Unfortunately, this reused the database field we used for the old implementation, and toggled volume removal on for `podman run --rm` - so now, we were removing *named* volumes created with older versions of Podman. We can't modify these old volumes in the DB, so the next-safest thing to do is swap to a new field to indicate volumes should be removed. Problem: Volumes created with 1.6.3 and up until this lands, even anonymous volumes, will not be removed. However, this is safer than removing too many volumes, as we were doing before. Fixes #5009 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* make lint: enable gocriticValentin Rothberg2020-01-13
| | | | | | | `gocritic` is a powerful linter that helps in preventing certain kinds of errors as well as enforcing a coding style. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Remove volumes after containers in pod removeMatthew Heon2019-12-17
| | | | | | | | | | | | | | | | | | | | | | | When trying to reproduce #4704 I noticed that the named volumes from the Postgres containers in the reproducer weren't being removed by `podman pod rm -f` saying that the container they were attached to was still in use. This was rather odd, considering they were only in use by one container, and that container was in the process of being removed with the pod. After a bit of tracing, I realized that the cause is the ordering of container removal when we remove a pod. Normally, it's done in removeContainer() before volume removal (which is the last thing in that function). However, when we are removing a pod, we remove containers all at once, after removeContainer has already finished - meaning the container still exists when we try to remove its volumes, and thus the volume can't be removed. Solution: collect a list of all named volumes in use by the pod, and remove them all at once after every container in the pod is gone. This ensures that there are no dependency issues. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* add libpod/configValentin Rothberg2019-10-31
| | | | | | | | | | | | Refactor the `RuntimeConfig` along with related code from libpod into libpod/config. Note that this is a first step of consolidating code into more coherent packages to make the code more maintainable and less prone to regressions on the long runs. Some libpod definitions were moved to `libpod/define` to resolve circular dependencies. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Fix error message on podman stats on cgroups v1 rootless environmentsDaniel J Walsh2019-08-19
| | | | | | | podman stats does not work in rootless environments with cgroups V1. Fix error message and document this fact. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Allow customizing pod hostnameChen Zhiwei2019-08-18
| | | | | | | * set hostname in pod yaml file * set --hostname in pod create command Signed-off-by: Chen Zhiwei <zhiweik@gmail.com>
* golangci-lint round #3baude2019-07-21
| | | | | | | this is the third round of preparing to use the golangci-lint on our code base. Signed-off-by: baude <bbaude@redhat.com>
* Ensure locks are freed when ctr/pod creation failsMatthew Heon2019-07-02
| | | | | | | | If we don't do this, we can leak locks on every failure, and that is very, very bad - can render Podman unusable without a 'system renumber' being run. Signed-off-by: Matthew Heon <mheon@redhat.com>
* stats: fix cgroup path for rootless containersGiuseppe Scrivano2019-06-26
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* libpod: use pkg/cgroups instead of containerd/cgroupsGiuseppe Scrivano2019-06-26
| | | | | | use the new implementation for dealing with cgroups. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* remove libpod from mainbaude2019-06-25
| | | | | | | | | | | | | the compilation demands of having libpod in main is a burden for the remote client compilations. to combat this, we should move the use of libpod structs, vars, constants, and functions into the adapter code where it will only be compiled by the local client. this should result in cleaner code organization and smaller binaries. it should also help if we ever need to compile the remote client on non-Linux operating systems natively (not cross-compiled). Signed-off-by: baude <bbaude@redhat.com>
* When removing pods, free their locksMatthew Heon2019-05-14
| | | | | | | Without this we leak allocated locks, which is definitely not a good thing. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* When removing a pod with CGroupfs, set pids limit to 0Matthew Heon2019-05-12
| | | | | | | | | | | | | When using CGroupfs, we see races during pod removal between removing the CGroup and the cleanup process starting (in the CGroup, thus preventing removal). The simplest way to avoid this is to prevent the forking of the cleanup process. Conveniently, we can do this via the CGroup that we already created for Conmon - we just need to update the PID limit to 0, which completely inhibits new forks. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Use standard remove functions for removing pod ctrsMatthew Heon2019-05-10
| | | | | | | Instead of rewriting the logic, reuse the standard logic we use for removing containers, which is much better tested. Signed-off-by: Matthew Heon <matthew.heon@pm.me>