summaryrefslogtreecommitdiff
path: root/libpod/container_internal_linux.go
Commit message (Collapse)AuthorAge
* libpod: fix check for slirp4netns netnsGiuseppe Scrivano2020-06-11
| | | | | | | | | | fix the check for c.state.NetNS == nil. Its value is changed in the first code block, so the condition is always true in the second one and we end up running slirp4netns twice. Closes: https://github.com/containers/libpod/issues/6538 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* container: do not set hostname when joining utsGiuseppe Scrivano2020-06-10
| | | | | | | do not set the hostname when joining an UTS namespace, as it could be owned by a different userns. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* container: make resolv.conf and hosts accessible in usernsGiuseppe Scrivano2020-06-10
| | | | | | | | when running in a new userns, make sure the resolv.conf and hosts files bind mounted from another container are accessible to root in the userns. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* check --user range for rootless containersQi Wang2020-06-02
| | | | | | Check --user range if it's a uid for rootless containers. Returns error if it is out of the range. From https://github.com/containers/libpod/issues/6431#issuecomment-636124686 Signed-off-by: Qi Wang <qiwan@redhat.com>
* Fix mountpont in SecretMountsWithUIDGIDDaniel J Walsh2020-05-19
| | | | | | | In FIPS Mode we expect to work off of the Mountpath not the Rundir path. This is causing FIPS Mode checks to fail. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* libpod: set hostname from joined containerGiuseppe Scrivano2020-04-27
| | | | | | | when joining a UTS namespace, take the hostname from the destination container. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Update podman to use containers.confDaniel J Walsh2020-04-20
| | | | | | | | Add more default options parsing Switch to using --time as opposed to --timeout to better match Docker. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* userns: support --userns=autoGiuseppe Scrivano2020-04-06
| | | | | | | automatically pick an empty range and create an user namespace for the container. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Add support for containers.confDaniel J Walsh2020-03-27
| | | | | | | vendor in c/common config pkg for containers.conf Signed-off-by: Qi Wang qiwan@redhat.com Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Attempt manual removal of CNI IP allocations on refreshMatthew Heon2020-03-19
| | | | | | | | | | | | | | We previously attempted to work within CNI to do this, without success. So let's do it manually, instead. We know where the files should live, so we can remove them ourselves instead. This solves issues around sudden reboots where containers do not have time to fully tear themselves down, and leave IP address allocations which, for various reasons, are not stored in tmpfs and persist through reboot. Fixes #5433 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Ensure that exec sessions inherit supplemental groupsMatthew Heon2020-02-28
| | | | | | | | This corrects a regression from Podman 1.4.x where container exec sessions inherited supplemental groups from the container, iff the exec session did not specify a user. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add network options to podman pod createMatthew Heon2020-02-19
| | | | | | | | | | | | | | | | | Enables most of the network-related functionality from `podman run` in `podman pod create`. Custom CNI networks can be specified, host networking is supported, DNS options can be configured. Also enables host networking in `podman play kube`. Fixes #2808 Fixes #3837 Fixes #4432 Fixes #4718 Fixes #4770 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* make lint: enable gocriticValentin Rothberg2020-01-13
| | | | | | | `gocritic` is a powerful linter that helps in preventing certain kinds of errors as well as enforcing a coding style. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Correctly export the root file-system changesAdrian Reber2019-12-09
| | | | | | | | | | | | | | | | When doing a checkpoint with --export the root file-system diff was not working as expected. Instead of getting the changes from the running container to the highest storage layer it got the changes from the highest layer to that parent's layer. For a one layer container this could mean that the complete root file-system is part of the checkpoint. With this commit this changes to use the same functionality as 'podman diff'. This actually enables to correctly diff the root file-system including tracking deleted files. This also removes the non-working helper functions from libpod/diff.go. Signed-off-by: Adrian Reber <areber@redhat.com>
* Allow chained network namespace containersMatthew Heon2019-12-03
| | | | | | | | | | | | | The code currently assumes that the container we delegate network namespace to will never further delegate to another container, so when looking up things like /etc/hosts and /etc/resolv.conf we won't pull the correct files from the chained dependency. The changes to resolve this are relatively simple - just need to keep looking until we find a container without NetNsCtr set. Fixes #4626 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Merge pull request #4493 from mheon/add_removing_stateOpenShift Merge Robot2019-12-02
|\ | | | | Add ContainerStateRemoving
| * Add ContainerStateRemovingMatthew Heon2019-11-19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When Libpod removes a container, there is the possibility that removal will not fully succeed. The most notable problems are storage issues, where the container cannot be removed from c/storage. When this occurs, we were faced with a choice. We can keep the container in the state, appearing in `podman ps` and available for other API operations, but likely unable to do any of them as it's been partially removed. Or we can remove it very early and clean up after it's already gone. We have, until now, used the second approach. The problem that arises is intermittent problems removing storage. We end up removing a container, failing to remove its storage, and ending up with a container permanently stuck in c/storage that we can't remove with the normal Podman CLI, can't use the name of, and generally can't interact with. A notable cause is when Podman is hit by a SIGKILL midway through removal, which can consistently cause `podman rm` to fail to remove storage. We now add a new state for containers that are in the process of being removed, ContainerStateRemoving. We set this at the beginning of the removal process. It notifies Podman that the container cannot be used anymore, but preserves it in the DB until it is fully removed. This will allow Remove to be run on these containers again, which should successfully remove storage if it fails. Fixes #3906 Signed-off-by: Matthew Heon <mheon@redhat.com>
* | Disable checkpointing of containers started with --rmAdrian Reber2019-11-28
| | | | | | | | | | | | | | | | | | | | | | | | | | Trying to checkpoint a container started with --rm works, but it makes no sense as the container, including the checkpoint, will be deleted after writing the checkpoint. This commit inhibits checkpointing containers started with '--rm' unless '--export' is used. If the checkpoint is exported it can easily be restored from the exported checkpoint, even if '--rm' is used. To restore a container from a checkpoint it is even necessary to manually run 'podman rm' if the container is not started with '--rm'. Signed-off-by: Adrian Reber <areber@redhat.com>
* | container-restore: Fix restore with user namespaceRadostin Stoyanov2019-11-17
|/ | | | | | | | | | | | | | | | | | | | | | When restoring a container with user namespace, the user namespace is created by the OCI runtime, and the network namespace is created after the user namespace to ensure correct ownership. In this case PostConfigureNetNS will be set and the value of c.state.NetNS would be nil. Hence, the following error occurs: $ sudo podman run --name cr \ --uidmap 0:1000:500 \ -d docker.io/library/alpine \ /bin/sh -c 'i=0; while true; do echo $i; i=$(expr $i + 1); sleep 1; done' $ sudo podman container checkpoint cr $ sudo podman container restore cr ... panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x30 pc=0x13a5e3c] Signed-off-by: Radostin Stoyanov <rstoyanov1@gmail.com>
* podman: add support for specifying MACJakub Filak2019-11-06
| | | | | | | | I basically copied and adapted the statements for setting IP. Closes #1136 Signed-off-by: Jakub Filak <jakub.filak@sap.com>
* Vendor in latest containers/buildahUrvashi Mohnani2019-11-01
| | | | | | | | Pull in changes to pkg/secrets/secrets.go that adds the logic to disable fips mode if a pod/container has a label set. Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
* add libpod/configValentin Rothberg2019-10-31
| | | | | | | | | | | | Refactor the `RuntimeConfig` along with related code from libpod into libpod/config. Note that this is a first step of consolidating code into more coherent packages to make the code more maintainable and less prone to regressions on the long runs. Some libpod definitions were moved to `libpod/define` to resolve circular dependencies. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* systemd: mask /sys/fs/cgroup/systemd/release_agentGiuseppe Scrivano2019-10-25
| | | | | | | | when running in systemd mode on cgroups v1, make sure the /sys/fs/cgroup/systemd/release_agent is masked otherwise the container is able to modify it and execute scripts on the host. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* When restoring containers, reset cgroup pathMatthew Heon2019-10-10
| | | | | | | | | | | | | | | | | | | | Previously, `podman checkport restore` with exported containers, when told to create a new container based on the exported checkpoint, would create a new container, with a new container ID, but not reset CGroup path - which contained the ID of the original container. If this was done multiple times, the result was two containers with the same cgroup paths. Operations on these containers would this have a chance of crossing over to affect the other one; the most notable was `podman rm` once it was changed to use the --all flag when stopping the container; all processes in the cgroup, including the ones in the other container, would be stopped. Reset cgroups on restore to ensure that the path matches the ID of the container actually being run. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Move OCI runtime implementation behind an interfaceMatthew Heon2019-10-10
| | | | | | | | | | | | For future work, we need multiple implementations of the OCI runtime, not just a Conmon-wrapped runtime matching the runc CLI. As part of this, do some refactoring on the interface for exec (move to a struct, not a massive list of arguments). Also, add 'all' support to Kill and Stop (supported by runc and used a bit internally for removing containers). Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Merge pull request #4086 from mheon/cni_del_on_refreshOpenShift Merge Robot2019-09-25
|\ | | | | Force a CNI Delete on refreshing containers
| * Force a CNI Delete on refreshing containersMatthew Heon2019-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | CNI expects that a DELETE be run before re-creating container networks. If a reboot occurs quickly enough that containers can't stop and clean up, that DELETE never happens, and Podman currently wipes the old network info and thinks the state has been entirely cleared. Unfortunately, that may not be the case on the CNI side. Some things - like IP address reservations - may not have been cleared. To solve this, manually re-run CNI Delete on refresh. If the container has already been deleted this seems harmless. If not, it should clear lingering state. Fixes: #3759 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* | rootless: Rearrange setup of rootless containersGabi Beyer2019-09-24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to run Podman with VM-based runtimes unprivileged, the network must be set up prior to the container creation. Therefore this commit modifies Podman to run rootless containers by: 1. create a network namespace 2. pass the netns persistent mount path to the slirp4netns to create the tap inferface 3. pass the netns path to the OCI spec, so the runtime can enter the netns Closes #2897 Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
* | execuser: look at the source for /etc/{passwd,group} overridesGiuseppe Scrivano2019-09-21
|/ | | | | | | | | look if there are bind mounts that can shadow the /etc/passwd and /etc/group files. In that case, look at the bind mount source. Closes: https://github.com/containers/libpod/pull/4068#issuecomment-533782941 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* container: make sure $HOME is always setGiuseppe Scrivano2019-09-20
| | | | | | | | | | | | | If the HOME environment variable is not set, make sure it is set to the configuration found in the container /etc/passwd file. It was previously depending on a runc behavior that always set HOME when it is not set. The OCI runtime specifications do not require HOME to be set so move the logic to libpod. Closes: https://github.com/debarshiray/toolbox/issues/266 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* linux: fix systemd with --cgroupns=privateGiuseppe Scrivano2019-09-12
| | | | | | | | | When --cgroupns=private is used we need to mount a new cgroup file system so that it points to the correct namespace. Needs: https://github.com/containers/crun/pull/88 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Merge pull request #3927 from openSUSE/manager-annotationsOpenShift Merge Robot2019-09-11
|\ | | | | Add `ContainerManager` annotation to created containers
| * Add `ContainerManager` annotation to created containersSascha Grunert2019-09-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This change adds the following annotation to every container created by podman: ```json "Annotations": { "io.containers.manager": "libpod" } ``` Target of this annotaions is to indicate which project in the containers ecosystem is the major manager of a container when applications share the same storage paths. This way projects can decide if they want to manipulate the container or not. For example, since CRI-O and podman are not using the same container library (libpod), CRI-O can skip podman containers and provide the end user more useful information. A corresponding end-to-end test has been adapted as well. Relates to: https://github.com/cri-o/cri-o/pull/2761 Signed-off-by: Sascha Grunert <sgrunert@suse.com>
* | Merge pull request #3581 from mheon/no_cgroupsOpenShift Merge Robot2019-09-11
|\ \ | | | | | | Support running containers without CGroups
| * | Add support for launching containers without CGroupsMatthew Heon2019-09-10
| |/ | | | | | | | | | | | | This is mostly used with Systemd, which really wants to manage CGroups itself when managing containers via unit file. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* / When first mounting any named volume, copy upMatthew Heon2019-09-09
|/ | | | | | | | | | | Previously, we only did this for volumes created at the same time as the container. However, this is not correct behavior - Docker does so for all named volumes, even those made with 'podman volume create' and mounted into a container later. Fixes #3945 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Ignore ENOENT on umount of SHMMatthew Heon2019-09-06
| | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Correctly report errors on unmounting SHMMatthew Heon2019-09-05
| | | | | | | | | | | When we fail to remove a container's SHM, that's an error, and we need to report it as such. This may be part of our lingering storage woes. Also, remove MNT_DETACH. It may be another cause of the storage removal failures. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add ability for volumes with options to mount/umountMatthew Heon2019-09-05
| | | | | | | | | | | | | When volume options and the local volume driver are specified, the volume is intended to be mounted using the 'mount' command. Supported options will be used to volume the volume before the first container using it starts, and unmount the volume after the last container using it dies. This should work for any local filesystem, though at present I've only tested with tmpfs and btrfs. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* handle dns response from cnibaude2019-09-03
| | | | | | | | | | | | | when cni returns a list of dns servers, we should add them under the right conditions. the defined conditions are as follows: - if the user provides dns, it and only it are added. - if not above and you get a cni name server, it is added and a forwarding dns instance is created for what was in resolv.conf. - if not either above, the entries from the host's resolv.conf are used. Signed-off-by: baude <bbaude@redhat.com> Signed-off-by: baude <bbaude@redhat.com>
* cgroup: fix regression when running systemdGiuseppe Scrivano2019-08-06
| | | | | | | | | | | | | | | commit 223fe64dc0a592fd44e0c9fde9f9e0ca087d566f introduced the regression. When running on cgroups v1, bind mount only /sys/fs/cgroup/systemd as rw, as the code did earlier. Also, simplify the rootless code as it doesn't require any special handling when using --systemd. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1737554 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Revert "rootless: Rearrange setup of rootless containers"baude2019-08-06
| | | | | | This reverts commit 80dcd4bebcdc8e280f6b43228561d09c194c328b. Signed-off-by: baude <bbaude@redhat.com>
* Merge pull request #3690 from adrianreber/ignore-static-ipOpenShift Merge Robot2019-08-05
|\ | | | | restore: added --ignore-static-ip option
| * restore: added --ignore-static-ip optionAdrian Reber2019-08-02
| | | | | | | | | | | | | | | | | | | | If a container is restored multiple times from an exported checkpoint with the help of '--import --name', the restore will fail if during 'podman run' a static container IP was set with '--ip'. The user can tell the restore process to ignore the static IP with '--ignore-static-ip'. Signed-off-by: Adrian Reber <areber@redhat.com>
* | Merge pull request #3310 from gabibeyer/rootlessKataOpenShift Merge Robot2019-08-05
|\ \ | |/ |/| rootless: Rearrange setup of rootless containers ***CIRRUS: TEST IMAGES***
| * rootless: Rearrange setup of rootless containersGabi Beyer2019-07-30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In order to run Podman with VM-based runtimes unprivileged, the network must be set up prior to the container creation. Therefore this commit modifies Podman to run rootless containers by: 1. create a network namespace 2. pass the netns persistent mount path to the slirp4netns to create the tap inferface 3. pass the netns path to the OCI spec, so the runtime can enter the netns Closes #2897 Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
* | Merge pull request #3677 from giuseppe/systemd-cgroupsv2OpenShift Merge Robot2019-08-01
|\ \ | | | | | | systemd, cgroupsv2: not bind mount /sys/fs/cgroup/systemd
| * | systemd, cgroupsv2: not bind mount /sys/fs/cgroup/systemdGiuseppe Scrivano2019-08-01
| |/ | | | | | | | | | | | | | | when running on a cgroups v2 system, do not bind mount the named hierarchy /sys/fs/cgroup/systemd as it doesn't exist anymore. Instead bind mount the entire /sys/fs/cgroup. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* / Vendor in buildah 1.9.2Daniel J Walsh2019-07-30
|/ | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* podman: support --userns=ns|containerGiuseppe Scrivano2019-07-25
| | | | | | | | allow to join the user namespace of another container. Closes: https://github.com/containers/libpod/issues/3629 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>