summaryrefslogtreecommitdiff
path: root/libpod/container.go
Commit message (Collapse)AuthorAge
* libpod: handle single user mapped as rootGiuseppe Scrivano2020-12-24
| | | | | | | | | | | if a single user is mapped in the user namespace, handle it as root. It is needed for running unprivileged containers with a single user available without being forced to run with euid and egid set to 0. Needs: https://github.com/containers/storage/pull/794 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* container cgroup pathValentin Rothberg2020-12-07
| | | | | | | | Before querying for a container's cgroup path, make sure that the container is synced. Also make sure to error out if the container isn't running. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* rewrite podman-cpValentin Rothberg2020-12-04
| | | | | | | | | | | | | | | | | | | | | | | | * Add a new `pkg/copy` to centralize all container-copy related code. * The new code is based on Buildah's `copier` package. * The compat `/archive` endpoints use the new `copy` package. * Update docs and an several new tests. * Includes many fixes, most notably, the look-up of volumes and mounts. Breaking changes: * Podman is now expecting that container-destination paths exist. Before, Podman created the paths if needed. Docker does not do that and I believe Podman should not either as it's a recipe for masking errors. These errors may be user induced (e.g., a path typo), or internal typos (e.g., when the destination may be a mistakenly unmounted volume). Let's keep the magic low for such a security sensitive feature. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Make c.networks() list include the default networkMatthew Heon2020-11-20
| | | | | | | | | | | | | | This makes things a lot more clear - if we are actually joining a CNI network, we are guaranteed to get a non-zero length list of networks. We do, however, need to know if the network we are joining is the default network for inspecting containers as it determines how we populate the response struct. To handle this, add a bool to indicate that the network listed was the default network, and only the default network. Signed-off-by: Matthew Heon <mheon@redhat.com>
* fix container cgroup lookupValentin Rothberg2020-11-20
| | | | | | | | | | | | | When running on cgroups v1, `/proc/{PID}/cgroup` has multiple entries, each pointing potentially to a different cgroup. Some may be empty, some may point to parents. The one we really need is the libpod-specific one, which always is the longest path. So instead of looking at the first entry, look at all and select the longest one. Fixes: #8397 Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* add network connect|disconnect compat endpointsbaude2020-11-19
| | | | | | | | | | | this enables the ability to connect and disconnect a container from a given network. it is only for the compatibility layer. some code had to be refactored to avoid circular imports. additionally, tests are being deferred temporarily due to some incompatibility/bug in either docker-py or our stack. Signed-off-by: baude <bbaude@redhat.com>
* add network connect|disconnect compat endpointsbaude2020-11-17
| | | | | | | | | | | this enables the ability to connect and disconnect a container from a given network. it is only for the compatibility layer. some code had to be refactored to avoid circular imports. additionally, tests are being deferred temporarily due to some incompatibility/bug in either docker-py or our stack. Signed-off-by: baude <bbaude@redhat.com>
* use container cgroups pathValentin Rothberg2020-11-17
| | | | | | | | | When looking up a container's cgroup path, parse /proc/[PID]/cgroup. This will work across all cgroup managers and configurations and is supported on cgroups v1 and v2. Fixes: #8265 Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Add support for network connect / disconnect to DBMatthew Heon2020-11-11
| | | | | | | | | | | | | | | | | | | | | | | | Convert the existing network aliases set/remove code to network connect and disconnect. We can no longer modify aliases for an existing network, but we can add and remove entire networks. As part of this, we need to add a new function to retrieve current aliases the container is connected to (we had a table for this as of the first aliases PR, but it was not externally exposed). At the same time, remove all deconflicting logic for aliases. Docker does absolutely no checks of this nature, and allows two containers to have the same aliases, aliases that conflict with container names, etc - it's just left to DNS to return all the IP addresses, and presumably we round-robin from there? Most tests for the existing code had to be removed because of this. Convert all uses of the old container config.Networks field, which previously included all networks in the container, to use the new DB table. This ensures we actually get an up-to-date list of in-use networks. Also, add network aliases to the output of `podman inspect`. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* new "image" mount typeValentin Rothberg2020-10-29
| | | | | | | | | | | | | | Add a new "image" mount type to `--mount`. The source of the mount is the name or ID of an image. The destination is the path inside the container. Image mounts further support an optional `rw,readwrite` parameter which if set to "true" will yield the mount writable inside the container. Note that no changes are propagated to the image mount on the host (which in any case is read only). Mounts are overlay mounts. To support read-only overlay mounts, vendor a non-release version of Buildah. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Store cgroup manager on a per-container basisMatthew Heon2020-10-08
| | | | | | | | | | | | | | | | | | | | | When we create a container, we assign a cgroup parent based on the current cgroup manager in use. This parent is only usable with the cgroup manager the container is created with, so if the default cgroup manager is later changed or overridden, the container will not be able to start. To solve this, store the cgroup manager that created the container in container configuration, so we can guarantee a container with a systemd cgroup parent will always be started with systemd cgroups. Unfortunately, this is very difficult to test in CI, due to the fact that we hard-code cgroup manager on all invocations of Podman in CI. Fixes #7830 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Switch all references to github.com/containers/libpod -> podmanDaniel J Walsh2020-07-28
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Refactor container configlouis2020-07-23
| | | | | | | This commit handle the TODO task of breaking the Container config into smaller sub-configs Signed-off-by: ldelossa <ldelossa@redhat.com>
* Add --umask flag for create, runAshley Cui2020-07-21
| | | | | | | | --umask sets the umask inside the container Defaults to 0022 Co-authored-by: Daniel J Walsh <dwalsh@redhat.com> Signed-off-by: Ashley Cui <acui@redhat.com>
* Add support for overlay volume mounts in podman.Qi Wang2020-07-20
| | | | | | | | Add support -v for overlay volume mounts in podman. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com> Signed-off-by: Qi Wang <qiwan@redhat.com>
* libpod: pass down network optionsGiuseppe Scrivano2020-07-16
| | | | | | do not pass network specific options through the network namespace. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Add username to /etc/passwd inside of container if --userns keep-idDaniel J Walsh2020-07-07
| | | | | | | | | | If I enter a continer with --userns keep-id, my UID will be present inside of the container, but most likely my user will not be defined. This patch will take information about the user and stick it into the container. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Implement --sdnotify cmdline option to control sd-notify behaviorJoseph Gooch2020-07-06
| | | | | | | | | | | | | | | | | | | --sdnotify container|conmon|ignore With "conmon", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI runtime doesn't pass it into the container. We also advertise "ready" when the OCI runtime finishes to advertise the service as ready. With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI runtime passes it into the container for initialization, and let the container advertise further metadata. This is the default, which is closest to the behavior podman has done in the past. The "ignore" option removes NOTIFY_SOCKET from the environment, so neither podman nor any child processes will talk to systemd. This removes the need for hardcoded CID and PID files in the command line, and the PIDFile directive, as the pid is advertised directly through sd-notify. Signed-off-by: Joseph Gooch <mrwizard@dok.org>
* Merge pull request #6836 from ashley-cui/tzlibpodOpenShift Merge Robot2020-07-06
|\ | | | | Add --tz flag to create, run
| * Add --tz flag to create, runAshley Cui2020-07-02
| | | | | | | | | | | | | | --tz flag sets timezone inside container Can be set to IANA timezone as well as `local` to match host machine Signed-off-by: Ashley Cui <acui@redhat.com>
* | move go module to v2Valentin Rothberg2020-07-06
|/ | | | | | | | | | | | | | | With the advent of Podman 2.0.0 we crossed the magical barrier of go modules. While we were able to continue importing all packages inside of the project, the project could not be vendored anymore from the outside. Move the go module to new major version and change all imports to `github.com/containers/libpod/v2`. The renaming of the imports was done via `gomove` [1]. [1] https://github.com/KSubedi/gomove Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* podman: add new cgroup mode splitGiuseppe Scrivano2020-06-25
| | | | | | | | | | | | | | | | | | | When running under systemd there is no need to create yet another cgroup for the container. With conmon-delegated the current cgroup will be split in two sub cgroups: - supervisor - container The supervisor cgroup will hold conmon and the podman process, while the container cgroup is used by the OCI runtime (using the cgroupfs backend). Closes: https://github.com/containers/libpod/issues/6400 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Add --preservefds to podman runQi Wang2020-06-19
| | | | | | Add --preservefds to podman run. close https://github.com/containers/libpod/issues/6458 Signed-off-by: Qi Wang <qiwan@redhat.com>
* Add support for the unless-stopped restart policyMatthew Heon2020-06-17
| | | | | | | | | | | | | | | | | | We initially believed that implementing this required support for restarting containers after reboot, but this is not the case. The unless-stopped restart policy acts identically to the always restart policy except in cases related to reboot (which we do not support yet), but it does not require that support for us to implement it. Changes themselves are quite simple, we need a new restart policy constant, we need to remove existing checks that block creation of containers when unless-stopped was used, and we need to update the manpages. Fixes #6508 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* add {generate,play} kubeValentin Rothberg2020-05-06
| | | | | | | | | | | | | | | | | | | Add the `podman generate kube` and `podman play kube` command. The code has largely been copied from Podman v1 but restructured to not leak the K8s core API into the (remote) client. Both commands are added in the same commit to allow for enabling the tests at the same time. Move some exports from `cmd/podman/common` to the appropriate places in the backend to avoid circular dependencies. Move definitions of label annotations to `libpod/define` and set the security-opt labels in the frontend to make kube tests pass. Implement rest endpoints, bindings and the tunnel interface. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* podman v2 remove bloat v2Brent Baude2020-04-16
| | | | | | rid ourseleves of libpod references in v2 client Signed-off-by: Brent Baude <bbaude@redhat.com>
* Add support for selecting kvm and systemd labelsDaniel J Walsh2020-04-15
| | | | | | | | | | | | In order to better support kata containers and systemd containers container-selinux has added new types. Podman should execute the container with an SELinux process label to match the container type. Traditional Container process : container_t KVM Container Process: containre_kvm_t PID 1 Init process: container_init_t Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Add support for containers.confDaniel J Walsh2020-03-27
| | | | | | | vendor in c/common config pkg for containers.conf Signed-off-by: Qi Wang qiwan@redhat.com Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Merge pull request #5088 from mheon/begin_exec_reworkOpenShift Merge Robot2020-03-19
|\ | | | | Begin exec rework
| * Add structure for new exec session tracking to DBMatthew Heon2020-03-18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As part of the rework of exec sessions, we need to address them independently of containers. In the new API, we need to be able to fetch them by their ID, regardless of what container they are associated with. Unfortunately, our existing exec sessions are tied to individual containers; there's no way to tell what container a session belongs to and retrieve it without getting every exec session for every container. This adds a pointer to the container an exec session is associated with to the database. The sessions themselves are still stored in the container. Exec-related APIs have been restructured to work with the new database representation. The originally monolithic API has been split into a number of smaller calls to allow more fine-grained control of lifecycle. Support for legacy exec sessions has been retained, but in a deprecated fashion; we should remove this in a few releases. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
| * Populate ExecSession with all required fieldsMatthew Heon2020-03-18
| | | | | | | | | | | | | | | | | | As part of the rework of exec sessions, we want to split Create and Start - and, as a result, we need to keep everything needed to start exec sessions in the struct, not just the bare minimum for tracking running ones. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* | auto updatesValentin Rothberg2020-03-17
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add support to auto-update containers running in systemd units as generated with `podman generate systemd --new`. `podman auto-update` looks up containers with a specified "io.containers.autoupdate" label (i.e., the auto-update policy). If the label is present and set to "image", Podman reaches out to the corresponding registry to check if the image has been updated. We consider an image to be updated if the digest in the local storage is different than the one of the remote image. If an image must be updated, Podman pulls it down and restarts the container. Note that the restarting sequence relies on systemd. At container-creation time, Podman looks up the "PODMAN_SYSTEMD_UNIT" environment variables and stores it verbatim in the container's label. This variable is now set by all systemd units generated by `podman-generate-systemd` and is set to `%n` (i.e., the name of systemd unit starting the container). This data is then being used in the auto-update sequence to instruct systemd (via DBUS) to restart the unit and hence to restart the container. Note that this implementation of auto-updates relies on systemd and requires a fully-qualified image reference to be used to create the container. This enforcement is necessary to know which image to actually check and pull. If we used an image ID, we would not know which image to check/pull anymore. Fixes: #3575 Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Remove ImageVolumes from databaseMatthew Heon2020-02-21
| | | | | | | | | | | | | | Before Libpod supported named volumes, we approximated image volumes by bind-mounting in per-container temporary directories. This was handled by Libpod, and had a corresponding database entry to enable/disable it. However, when we enabled named volumes, we completely rewrote the old implementation; none of the old bind mount implementation still exists, save one flag in the database. With nothing remaining to use it, it has no further purpose. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Initial implementation of a spec generator packageMatthew Heon2020-02-04
| | | | | | | | | | | | | | | | | | | | | | | The current Libpod pkg/spec has become a victim of the better part of three years of development that tied it extremely closely to the current Podman CLI. Defaults are spread across multiple places, there is no easy way to produce a CreateConfig that will actually produce a valid container, and the logic for generating configs has sprawled across at least three packages. This is an initial pass at a package that generates OCI specs that will supersede large parts of the current pkg/spec. The CreateConfig will still exist, but will effectively turn into a parsed CLI. This will be compiled down into the new SpecGenerator struct, which will generate the OCI spec and Libpod create options. The preferred integration point for plugging into Podman's Go API to create containers will be the new CreateConfig, as it's less tied to Podman's command line. CRI-O, for example, will likely tie in here. Signed-off-by: Matthew Heon <mheon@redhat.com>
* podman: add new option --cgroups=no-conmonGiuseppe Scrivano2020-01-16
| | | | | | | | it allows to disable cgroups creation only for the conmon process. A new cgroup is created for the container payload. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* make lint: enable gocriticValentin Rothberg2020-01-13
| | | | | | | `gocritic` is a powerful linter that helps in preventing certain kinds of errors as well as enforcing a coding style. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* log: support --log-opt tag=Giuseppe Scrivano2020-01-10
| | | | | | | | | | support a custom tag to add to each log for the container. It is currently supported only by the journald backend. Closes: https://github.com/containers/libpod/issues/3653 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: use RootlessKit port forwarderAkihiro Suda2020-01-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | RootlessKit port forwarder has a lot of advantages over the slirp4netns port forwarder: * Very high throughput. Benchmark result on Travis: socat: 5.2 Gbps, slirp4netns: 8.3 Gbps, RootlessKit: 27.3 Gbps (https://travis-ci.org/rootless-containers/rootlesskit/builds/597056377) * Connections from the host are treated as 127.0.0.1 rather than 10.0.2.2 in the namespace. No UDP issue (#4586) * No tcp_rmem issue (#4537) * Probably works with IPv6. Even if not, it is trivial to support IPv6. (#4311) * Easily extensible for future support of SCTP * Easily extensible for future support of `lxc-user-nic` SUID network RootlessKit port forwarder has been already adopted as the default port forwarder by Rootless Docker/Moby, and no issue has been reported AFAIK. As the port forwarder is imported as a Go package, no `rootlesskit` binary is required for Podman. Fix #4586 May-fix #4559 Fix #4537 May-fix #4311 See https://github.com/rootless-containers/rootlesskit/blob/v0.7.0/pkg/port/builtin/builtin.go Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
* container config: add CreateCommandValentin Rothberg2019-12-13
| | | | | | | | | | | | | | | | | | Store the full command plus arguments of the process the container has been created with. Expose this data as a `Config.CreateCommand` field in the container-inspect data as well. This information can be useful for debugging, as we can find out which command has created the container, and, if being created via the Podman CLI, we know exactly with which flags the container has been created with. The immediate motivation for this change is to use this information for `podman-generate-systemd` to generate systemd-service files that allow for creating new containers (in contrast to only starting existing ones). Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* libpod: fix stats for rootless podsGiuseppe Scrivano2019-12-04
| | | | | | | | honor the systemd parent directory when specified. Closes: https://github.com/containers/libpod/issues/4634 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Allow chained network namespace containersMatthew Heon2019-12-03
| | | | | | | | | | | | | The code currently assumes that the container we delegate network namespace to will never further delegate to another container, so when looking up things like /etc/hosts and /etc/resolv.conf we won't pull the correct files from the chained dependency. The changes to resolve this are relatively simple - just need to keep looking until we find a container without NetNsCtr set. Fixes #4626 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* podman: add support for specifying MACJakub Filak2019-11-06
| | | | | | | | I basically copied and adapted the statements for setting IP. Closes #1136 Signed-off-by: Jakub Filak <jakub.filak@sap.com>
* add libpod/configValentin Rothberg2019-10-31
| | | | | | | | | | | | Refactor the `RuntimeConfig` along with related code from libpod into libpod/config. Note that this is a first step of consolidating code into more coherent packages to make the code more maintainable and less prone to regressions on the long runs. Some libpod definitions were moved to `libpod/define` to resolve circular dependencies. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* bump containers/image to v5.0.0, buildah to v1.11.4Nalin Dahyabhai2019-10-29
| | | | | | | | | Move to containers/image v5 and containers/buildah to v1.11.4. Replace an equality check with a type assertion when checking for a docker.ErrUnauthorizedForCredentials in `podman login`. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
* Add ensureState helper for checking container stateMatthew Heon2019-10-28
| | | | | | | | | | | | | | | | We have a lot of checks for container state scattered throughout libpod. Many of these need to ensure the container is in one of a given set of states so an operation may safely proceed. Previously there was no set way of doing this, so we'd use unique boolean logic for each one. Introduce a helper to standardize state checks. Note that this is only intended to replace checks for multiple states. A simple check for one state (ContainerStateRunning, for example) should remain a straight equality, and not use this new helper. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Move OCI runtime implementation behind an interfaceMatthew Heon2019-10-10
| | | | | | | | | | | | For future work, we need multiple implementations of the OCI runtime, not just a Conmon-wrapped runtime matching the runc CLI. As part of this, do some refactoring on the interface for exec (move to a struct, not a massive list of arguments). Also, add 'all' support to Kill and Stop (supported by runc and used a bit internally for removing containers). Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Update c/image to v4.0.1 and buildah to 1.11.3Miloslav Trmač2019-10-04
| | | | | | | | | | | | | | This requires updating all import paths throughout, and a matching buildah update to interoperate. I can't figure out the reason for go.mod tracking github.com/containers/image v3.0.2+incompatible // indirect ((go mod graph) lists it as a direct dependency of libpod, but (go list -json -m all) lists it as an indirect dependency), but at least looking at the vendor subdirectory, it doesn't seem to be actually used in the built binaries. Signed-off-by: Miloslav Trmač <mitr@redhat.com>
* Add support for launching containers without CGroupsMatthew Heon2019-09-10
| | | | | | | This is mostly used with Systemd, which really wants to manage CGroups itself when managing containers via unit file. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add comment to describe postConfigureNetNSGabi Beyer2019-07-30
| | | | | | | Provide information stating what the postConfigureNetNS option is used for. Signed-off-by: Gabi Beyer <gabrielle.n.beyer@intel.com>
* golangci-lint round #3baude2019-07-21
| | | | | | | this is the third round of preparing to use the golangci-lint on our code base. Signed-off-by: baude <bbaude@redhat.com>