summaryrefslogtreecommitdiff
path: root/libpod/container_internal.go
Commit message (Collapse)AuthorAge
* Initial implementation of volume pluginsMatthew Heon2021-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements support for mounting and unmounting volumes backed by volume plugins. Support for actually retrieving plugins requires a pull request to land in containers.conf and then that to be vendored, and as such is not yet ready. Given this, this code is only compile tested. However, the code for everything past retrieving the plugin has been written - there is support for creating, removing, mounting, and unmounting volumes, which should allow full functionality once the c/common PR is merged. A major change is the signature of the MountPoint function for volumes, which now, by necessity, returns an error. Named volumes managed by a plugin do not have a mountpoint we control; instead, it is managed entirely by the plugin. As such, we need to cache the path in the DB, and calls to retrieve it now need to access the DB (and may fail as such). Notably absent is support for SELinux relabelling and chowning these volumes. Given that we don't manage the mountpoint for these volumes, I am extremely reluctant to try and modify it - we could easily break the plugin trying to chown or relabel it. Also, we had no less than *5* separate implementations of inspecting a volume floating around in pkg/infra/abi and pkg/api/handlers/libpod. And none of them used volume.Inspect(), the only correct way of inspecting volumes. Remove them all and consolidate to using the correct way. Compat API is likely still doing things the wrong way, but that is an issue for another day. Fixes #4304 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Merge pull request #8906 from vrothberg/fix-8501OpenShift Merge Robot2021-01-14
|\ | | | | container stop: release lock before calling the runtime
| * container stop: release lock before calling the runtimeValentin Rothberg2021-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Podman defers stopping the container to the runtime, which can take some time. Keeping the lock while waiting for the runtime to complete the stop procedure, prevents other commands from acquiring the lock as shown in #8501. To improve the user experience, release the lock before invoking the runtime, and re-acquire the lock when the runtime is finished. Also introduce an intermediate "stopping" to properly distinguish from "stopped" containers etc. Fixes: #8501 Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* | Fxes /etc/hosts duplicated every time after container restarted in a podzhangguanzhang2021-01-13
| | | | | | | | Signed-off-by: zhangguanzhang <zhangguanzhang@qq.com>
* | add pre checkpointunknown2021-01-10
|/ | | | Signed-off-by: Zhuohan Chen <chen_zhuohan@163.com>
* SpellingJosh Soref2020-12-22
| | | | Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* Drop default log-level from error to warnDaniel J Walsh2020-12-03
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our users are missing certain warning messages that would make debugging issues with Podman easier. For example if you do a podman build with a Containerfile that contains the SHELL directive, the Derective is silently ignored. If you run with the log-level warn you get a warning message explainging what happened. $ podman build --no-cache -f /tmp/Containerfile1 /tmp/ STEP 1: FROM ubi8 STEP 2: SHELL ["/bin/bash", "-c"] STEP 3: COMMIT --> 7a207be102a 7a207be102aa8993eceb32802e6ceb9d2603ceed9dee0fee341df63e6300882e $ podman --log-level=warn build --no-cache -f /tmp/Containerfile1 /tmp/ STEP 1: FROM ubi8 STEP 2: SHELL ["/bin/bash", "-c"] STEP 3: COMMIT WARN[0000] SHELL is not supported for OCI image format, [/bin/bash -c] will be ignored. Must use `docker` format --> 7bd96fd25b9 7bd96fd25b9f755d8a045e31187e406cf889dcf3799357ec906e90767613e95f These messages will no longer be lost, when we default to WARNing level. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Merge pull request #8263 from rhatdan/restartOpenShift Merge Robot2020-11-23
|\ | | | | Allow containers to --restart on-failure with --rm
| * Allow containers to --restart on-failure with --rmDaniel J Walsh2020-11-20
| | | | | | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* | Make c.networks() list include the default networkMatthew Heon2020-11-20
|/ | | | | | | | | | | | | | This makes things a lot more clear - if we are actually joining a CNI network, we are guaranteed to get a non-zero length list of networks. We do, however, need to know if the network we are joining is the default network for inspecting containers as it determines how we populate the response struct. To handle this, add a bool to indicate that the network listed was the default network, and only the default network. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Merge pull request #8298 from mheon/db_network_connectOpenShift Merge Robot2020-11-12
|\ | | | | Add support for network connect / disconnect to DB
| * Add support for network connect / disconnect to DBMatthew Heon2020-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Convert the existing network aliases set/remove code to network connect and disconnect. We can no longer modify aliases for an existing network, but we can add and remove entire networks. As part of this, we need to add a new function to retrieve current aliases the container is connected to (we had a table for this as of the first aliases PR, but it was not externally exposed). At the same time, remove all deconflicting logic for aliases. Docker does absolutely no checks of this nature, and allows two containers to have the same aliases, aliases that conflict with container names, etc - it's just left to DNS to return all the IP addresses, and presumably we round-robin from there? Most tests for the existing code had to be removed because of this. Convert all uses of the old container config.Networks field, which previously included all networks in the container, to use the new DB table. This ensures we actually get an up-to-date list of in-use networks. Also, add network aliases to the output of `podman inspect`. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* | Ensure we do not double-lock the same volume in createMatthew Heon2020-11-11
|/ | | | | | | | | | | | | When making containers, we want to lock all named volumes we are adding the container to, to ensure they aren't removed from under us while we are working. Unfortunately, this code did not account for a container having the same volume mounted in multiple places so it could deadlock. Add a map to ensure that we don't lock the same name more than once to resolve this. Fixes #8221 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Stop excessive wrapping of errorsDaniel J Walsh2020-10-30
| | | | | | | | | | | | Most of the builtin golang functions like os.Stat and os.Open report errors including the file system object path. We should not wrap these errors and put the file path in a second time, causing stuttering of errors when they get presented to the user. This patch tries to cleanup a bunch of these errors. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* new "image" mount typeValentin Rothberg2020-10-29
| | | | | | | | | | | | | | Add a new "image" mount type to `--mount`. The source of the mount is the name or ID of an image. The destination is the path inside the container. Image mounts further support an optional `rw,readwrite` parameter which if set to "true" will yield the mount writable inside the container. Note that no changes are propagated to the image mount on the host (which in any case is read only). Mounts are overlay mounts. To support read-only overlay mounts, vendor a non-release version of Buildah. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Populate /etc/hosts file when run in a user namespaceDaniel J Walsh2020-10-07
| | | | | | | | | We do not populate the hostname field with the IP Address when running within a user namespace. Fixes https://github.com/containers/podman/issues/7490 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Remove excessive error wrappingKir Kolyshkin2020-10-05
| | | | | | | | | | | | | | | | | In case os.Open[File], os.Mkdir[All], ioutil.ReadFile and the like fails, the error message already contains the file name and the operation that fails, so there is no need to wrap the error with something like "open %s failed". While at it - replace a few places with os.Open, ioutil.ReadAll with ioutil.ReadFile. - replace errors.Wrapf with errors.Wrap for cases where there are no %-style arguments. Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
* Fix up errors found by codespellDaniel J Walsh2020-09-11
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Merge pull request #7578 from giuseppe/join-userns-reuse-mappingsOpenShift Merge Robot2020-09-10
|\ | | | | libpod: read mappings when joining a container userns
| * libpod: read mappings when joining a container usernsGiuseppe Scrivano2020-09-10
| | | | | | | | | | | | | | | | | | when joining an existing container user namespace, read the existing mappings so the storage can be created with the correct ownership. Closes: https://github.com/containers/podman/issues/7547 Signed-off-by: Giuseppe Scrivano <giuseppe@scrivano.org>
* | rootless: support `podman network create` (CNI-in-slirp4netns)Akihiro Suda2020-09-09
|/ | | | | | | | | | | | | | | | | Usage: ``` $ podman network create foo $ podman run -d --name web --hostname web --network foo nginx:alpine $ podman run --rm --network foo alpine wget -O - http://web.dns.podman Connecting to web.dns.podman (10.88.4.6:80) ... <h1>Welcome to nginx!</h1> ... ``` See contrib/rootless-cni-infra for the design. Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
* fix apiv2 will create containers with incorrect commandszhangguanzhang2020-08-24
| | | | Signed-off-by: zhangguanzhang <zhangguanzhang@qq.com>
* volumes: do not recurse when chowningGiuseppe Scrivano2020-07-31
| | | | | | | | | keep the file ownership when chowning and honor the user namespace mappings. Closes: https://github.com/containers/podman/issues/7130 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Switch all references to github.com/containers/libpod -> podmanDaniel J Walsh2020-07-28
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* When chowning we should not follow symbolic linkDaniel J Walsh2020-07-27
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Add support for overlay volume mounts in podman.Qi Wang2020-07-20
| | | | | | | | Add support -v for overlay volume mounts in podman. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com> Signed-off-by: Qi Wang <qiwan@redhat.com>
* Preserve passwd on container restartMatthew Heon2020-07-15
| | | | | | | | | | | | | | | We added code to create a `/etc/passwd` file that we bind-mount into the container in some cases (most notably, `--userns=keep-id` containers). This, unfortunately, was not persistent, so user-added users would be dropped on container restart. Changing where we store the file should fix this. Further, we want to ensure that lookups of users in the container use the right /etc/passwd if we replaced it. There was already logic to do this, but it only worked for user-added mounts; it's easy enough to alter it to use our mounts as well. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Remove all instances of named return "err" from LibpodMatthew Heon2020-07-09
| | | | | | | | | | | | | | | | | This was inspired by https://github.com/cri-o/cri-o/pull/3934 and much of the logic for it is contained there. However, in brief, a named return called "err" can cause lots of code confusion and encourages using the wrong err variable in defer statements, which can make them work incorrectly. Using a separate name which is not used elsewhere makes it very clear what the defer should be doing. As part of this, remove a large number of named returns that were not used anywhere. Most of them were once needed, but are no longer necessary after previous refactors (but were accidentally retained). Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Implement --sdnotify cmdline option to control sd-notify behaviorJoseph Gooch2020-07-06
| | | | | | | | | | | | | | | | | | | --sdnotify container|conmon|ignore With "conmon", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI runtime doesn't pass it into the container. We also advertise "ready" when the OCI runtime finishes to advertise the service as ready. With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI runtime passes it into the container for initialization, and let the container advertise further metadata. This is the default, which is closest to the behavior podman has done in the past. The "ignore" option removes NOTIFY_SOCKET from the environment, so neither podman nor any child processes will talk to systemd. This removes the need for hardcoded CID and PID files in the command line, and the PIDFile directive, as the pid is advertised directly through sd-notify. Signed-off-by: Joseph Gooch <mrwizard@dok.org>
* move go module to v2Valentin Rothberg2020-07-06
| | | | | | | | | | | | | | | With the advent of Podman 2.0.0 we crossed the magical barrier of go modules. While we were able to continue importing all packages inside of the project, the project could not be vendored anymore from the outside. Move the go module to new major version and change all imports to `github.com/containers/libpod/v2`. The renaming of the imports was done via `gomove` [1]. [1] https://github.com/KSubedi/gomove Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* container: move volume chown after spec generationGiuseppe Scrivano2020-06-29
| | | | | | | | | move the chown for newly created volumes after the spec generation so the correct UID/GID are known. Closes: https://github.com/containers/libpod/issues/5698 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* libpod: volume copyup honors namespace mappingsGiuseppe Scrivano2020-06-29
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* libpod: specify mappings to the storageGiuseppe Scrivano2020-06-24
| | | | | | | | | | | | specify the mappings in the container configuration to the storage when creating the container so that the correct mappings can be configured. Regression introduced with Podman 2.0. Closes: https://github.com/containers/libpod/issues/6735 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Turn on More lintersDaniel J Walsh2020-06-15
| | | | | | | | | - misspell - prealloc - unparam - nakedret Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Ensure Conmon is alive before waiting for exit fileMatthew Heon2020-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This came out of a conversation with Valentin about systemd-managed Podman. He discovered that unit files did not properly handle cases where Conmon was dead - the ExecStopPost `podman rm --force` line was not actually removing the container, but interestingly, adding a `podman cleanup --rm` line would remove it. Both of these commands do the same thing (minus the `podman cleanup --rm` command not force-removing running containers). Without a running Conmon instance, the container process is still running (assuming you killed Conmon with SIGKILL and it had no chance to kill the container it managed), but you can still kill the container itself with `podman stop` - Conmon is not involved, only the OCI Runtime. (`podman rm --force` and `podman stop` use the same code to kill the container). The problem comes when we want to get the container's exit code - we expect Conmon to make us an exit file, which it's obviously not going to do, being dead. The first `podman rm` would fail because of this, but importantly, it would (after failing to retrieve the exit code correctly) set container status to Exited, so that the second `podman cleanup` process would succeed. To make sure the first `podman rm --force` succeeds, we need to catch the case where Conmon is already dead, and instead of waiting for an exit file that will never come, immediately set the Stopped state and remove an error that can be caught and handled. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Fix remote integration for healthchecksBrent Baude2020-05-20
| | | | | | the one remaining test that is still skipped do to missing exec function Signed-off-by: Brent Baude <bbaude@redhat.com>
* Fix lintMatthew Heon2020-05-14
| | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Remove exec sessions on container restartMatthew Heon2020-05-14
| | | | | | | | | | | | | With APIv2, we cannot guarantee that exec sessions will be removed cleanly on exit (Docker does not include an API for removing exec sessions, instead using a timer-based reaper which we cannot easily replicate). This is part 1 of a 2-part approach to providing a solution to this. This ensures that exec sessions will be reaped, at the very least, on container restart, which takes care of any that were not properly removed during the run of a container. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Cleanup OCI runtime before storageMatthew Heon2020-05-14
| | | | | | | | | Some runtimes (e.g. Kata containers) seem to object to having us unmount storage before the container is removed from the runtime. This is an easy fix (change the order of operations in cleanup) and seems to make more sense than the way we were doing things. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Fix SELinux functions names to not be repetitiveDaniel J Walsh2020-04-23
| | | | | | | Since functions are now in an selinux subpackage, they should not start with SELinux Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Move selinux labeling support from pkg/util to pkg/selinuxDaniel J Walsh2020-04-22
| | | | | | | The goal here is to make the package less heavy and not overload the pkg/util. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Add support for selecting kvm and systemd labelsDaniel J Walsh2020-04-15
| | | | | | | | | | | | In order to better support kata containers and systemd containers container-selinux has added new types. Podman should execute the container with an SELinux process label to match the container type. Traditional Container process : container_t KVM Container Process: containre_kvm_t PID 1 Init process: container_init_t Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* userns: support --userns=autoGiuseppe Scrivano2020-04-06
| | | | | | | automatically pick an empty range and create an user namespace for the container. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Add support for containers.confDaniel J Walsh2020-03-27
| | | | | | | vendor in c/common config pkg for containers.conf Signed-off-by: Qi Wang qiwan@redhat.com Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Attempt manual removal of CNI IP allocations on refreshMatthew Heon2020-03-19
| | | | | | | | | | | | | | We previously attempted to work within CNI to do this, without success. So let's do it manually, instead. We know where the files should live, so we can remove them ourselves instead. This solves issues around sudden reboots where containers do not have time to fully tear themselves down, and leave IP address allocations which, for various reasons, are not stored in tmpfs and persist through reboot. Fixes #5433 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add structure for new exec session tracking to DBMatthew Heon2020-03-18
| | | | | | | | | | | | | | | | | | | | | | | As part of the rework of exec sessions, we need to address them independently of containers. In the new API, we need to be able to fetch them by their ID, regardless of what container they are associated with. Unfortunately, our existing exec sessions are tied to individual containers; there's no way to tell what container a session belongs to and retrieve it without getting every exec session for every container. This adds a pointer to the container an exec session is associated with to the database. The sessions themselves are still stored in the container. Exec-related APIs have been restructured to work with the new database representation. The originally monolithic API has been split into a number of smaller calls to allow more fine-grained control of lifecycle. Support for legacy exec sessions has been retained, but in a deprecated fashion; we should remove this in a few releases. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Revert "exec: get the exit code from sync pipe instead of file"Matthew Heon2020-03-09
| | | | | | | | | This reverts commit 4b72f9e4013411208751df2a92ab9f322d4da5b2. Continues what began with revert of d3d97a25e8c87cf741b2e24ac01ef84962137106 in previous commit. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Fix spelling mistakes in code found by codespellDaniel J Walsh2020-03-07
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* exec: get the exit code from sync pipe instead of filePeter Hunt2020-03-03
| | | | | | | | Before, we were getting the exit code from the file, in which we waited an arbitrary amount of time (5 seconds) for the file, and segfaulted if we didn't find it. instead, we should be a bit more certain conmon has sent the exit code. Luckily, it sends the exit code along the sync pipe fd, so we can read it from there Adapt the ExecContainer interface to pass along a channel to get the pid and exit code from conmon, to be able to read both from the pipe Signed-off-by: Peter Hunt <pehunt@redhat.com>
* Add basic deadlock detection for container start/removeMatthew Heon2020-02-24
| | | | | | | | | | | | | | | We can easily tell if we're going to deadlock by comparing lock IDs before actually taking the lock. Add a few checks for this in common places where deadlocks might occur. This does not yet cover pod operations, where detection is more difficult (and costly) due to the number of locks being involved being higher than 2. Also, add some error wrapping on the Podman side, so we can tell people to use `system renumber` when it occurs. Signed-off-by: Matthew Heon <matthew.heon@pm.me>