summaryrefslogtreecommitdiff
path: root/libpod
Commit message (Collapse)AuthorAge
* libpod: fix wait and exit-code logicValentin Rothberg2022-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit addresses three intertwined bugs to fix an issue when using Gitlab runner on Podman. The three bug fixes are not split into separate commits as tests won't pass otherwise; avoidable noise when bisecting future issues. 1) Podman conflated states: even when asking to wait for the `exited` state, Podman returned as soon as a container transitioned to `stopped`. The issues surfaced in Gitlab tests to fail [1] as `conmon`'s buffers have not (yet) been emptied when attaching to a container right after a wait. The race window was extremely narrow, and I only managed to reproduce with the Gitlab runner [1] unit tests. 2) The clearer separation between `exited` and `stopped` revealed a race condition predating the changes. If a container is configured for autoremoval (e.g., via `run --rm`), the "run" process competes with the "cleanup" process running in the background. The window of the race condition was sufficiently large that the "cleanup" process has already removed the container and storage before the "run" process could read the exit code and hence waited indefinitely. Address the exit-code race condition by recording exit codes in the main libpod database. Exit codes can now be read from a database. When waiting for a container to exit, Podman first waits for the container to transition to `exited` and will then query the database for its exit code. Outdated exit codes are pruned during cleanup (i.e., non-performance critical) and when refreshing the database after a reboot. An exit code is considered outdated when it is older than 5 minutes. While the race condition predates this change, the waiting process has apparently always been fast enough in catching the exit code due to issue 1): `exited` and `stopped` were conflated. The waiting process hence caught the exit code after the container transitioned to `stopped` but before it `exited` and got removed. 3) With 1) and 2), Podman is now waiting for a container to properly transition to the `exited` state. Some tests did not pass after 1) and 2) which revealed the third bug: `conmon` was executed with its working directory pointing to the OCI runtime bundle of the container. The changed working directory broke resolving relative paths in the "cleanup" process. The "cleanup" process error'ed before actually cleaning up the container and waiting "main" process ran indefinitely - or until hitting a timeout. Fix the issue by executing `conmon` with the same working directory as Podman. Note that fixing 3) *may* address a number of issues we have seen in the past where for *some* reason cleanup processes did not fire. [1] https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27119#note_970712864 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com> [MH: Minor reword of commit message] Signed-off-by: Matthew Heon <mheon@redhat.com>
* conmon: silence json-file errorValentin Rothberg2022-06-23
| | | | | | | We should just silently fall through. The log was flooding the system-service logs when running Gitlab runner. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
* Fix spelling "setup" -> "set up" and similarErik Sjölund2022-06-22
| | | | | | | | | | * Replace "setup", "lookup", "cleanup", "backup" with "set up", "look up", "clean up", "back up" when used as verbs. Replace also variations of those. * Improve language in a few places. Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
* Merge pull request #14659 from eriksjolund/setup_to_set_up_in_codeopenshift-ci[bot]2022-06-21
|\ | | | | [CI:DOCS] "setup" -> "set up" in source code comments
| * [CI:DOCS] "setup" -> "set up" in source code commentsErik Sjölund2022-06-19
| | | | | | | | | | | | | | * Replace "setup", "lookup" with "set up", "look up" when used as verbs. Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
* | podman pod create --shm-sizecdoern2022-06-20
|/ | | | | | | | | expose the --shm-size flag to podman pod create and add proper handling and inheritance for the option. resolves #14609 Signed-off-by: Charlie Doern <cdoern@redhat.com>
* Merge pull request #14299 from cdoern/podCloneopenshift-ci[bot]2022-06-16
|\ | | | | implement podman pod clone
| * podman pod clonecdoern2022-06-10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | implement podman pod clone, a command to create an exact copy of a pod while changing certain config elements current supported flags are: --name change the pod name --destroy remove the original pod --start run the new pod on creation and all infra-container related flags from podman pod create (namespaces etc) resolves #12843 Signed-off-by: cdoern <cdoern@redhat.com>
* | Merge pull request #14596 from ↵openshift-ci[bot]2022-06-15
|\ \ | | | | | | | | | | | | giuseppe/move-conmon-different-cgroup-system-service libpod: improve check to create conmon cgroup
| * | libpod: improve check to create conmon cgroupGiuseppe Scrivano2022-06-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | commit 1951ff168a63157fa2f4711fde283edfc4981ed3 introduced a check so that conmon is not moved to a new cgroup when podman is running inside of a systemd service. This is helpful to integrate podman in systemd so that the spawned conmon lives in the same cgroup as the service that created it. Unfortunately this breaks when podman daemon is running in a systemd service since the same check is in place thus all the conmon processes end up in the same cgroup as the podman daemon. When the podman daemon systemd service stops the conmon processes are also terminated as well as the containers they monitor. Improve the check to exclude podman running as a daemon. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2052697 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | | fix CI: golangci-lint is broken on mainPaul Holzinger2022-06-15
| | | | | | | | | | | | | | | | | | | | | The merge of both 528739cef3d2 and 1b62e4543845 at the same time created a lint error on main. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | Merge pull request #14584 from Luap99/lint-systemdOpenShift Merge Robot2022-06-15
|\ \ \ | | | | | | | | golangci-lint: add systemd build tag
| * | | golangci-lint: add systemd build tagPaul Holzinger2022-06-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Lint the systemd code and fix the reported problems. The remoteclient tag is no longer used so I just removed it. [NO NEW TESTS NEEDED] Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | | Merge pull request #14585 from Luap99/nolintopenshift-ci[bot]2022-06-14
|\ \ \ \ | | | | | | | | | | golangci-lint: enable nolintlint
| * | | | golangci-lint: enable nolintlintPaul Holzinger2022-06-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The nolintlint linter does not deny the use of `//nolint` Instead it allows us to enforce a common nolint style: - force that a linter name must be specified - do not add a space between `//` and `nolint` - make sure nolint is only used when there is actually a problem Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | | | Merge pull request #14582 from giuseppe/no-create-containerenv-if-run-volumeopenshift-ci[bot]2022-06-14
|\ \ \ \ \ | |/ / / / |/| | | | container: do not create .containerenv with -v SRC:/run
| * | | | container: do not create .containerenv with -v SRC:/runGiuseppe Scrivano2022-06-14
| | |/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | if /run is on a volume do not create the file /run/.containerenv as it would leak outside of the container. Closes: https://github.com/containers/podman/issues/14577 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | | | Merge pull request #14580 from jakecorrenti/stats-on-non-running-containeropenshift-ci[bot]2022-06-14
|\ \ \ \ | |_|/ / |/| | | Non-running containers now report statistics via the `podman stats`
| * | | Non-running containers now report statistics via the `podman stats`Jake Correnti2022-06-13
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | command Previously, if a container was not running, and the user ran the `podman stats` command, an error would be reported: `Error: container state improper`. Podman now reports stats as the fields' default values for their respective type if the container is not running: ``` $ podman stats --no-stream demo ID NAME CPU % MEM USAGE / LIMIT MEM % NET IO BLOCK IO PIDS CPU TIME AVG CPU % 4b4bf8ce84ed demo 0.00% 0B / 0B 0.00% 0B / 0B 0B / 0B 0 0s 0.00% ``` Closes: #14498 Signed-off-by: Jake Correnti <jcorrenti13@gmail.com>
* | | | podman cp: do not overwrite non-dirs with dirs and vice versaValentin Rothberg2022-06-10
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | Add a new `--overwrite` flag to `podman cp` to allow for overwriting in case existing users depend on the behavior; they will have a workaround. By default, the flag is turned off to be compatible with Docker and to have a more sane behavior. Fixes: #14420 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
* | | Merge pull request #14480 from cdoern/infraOpenShift Merge Robot2022-06-09
|\ \ \ | | | | | | | | patch for pod host networking & other host namespace handling
| * | | patch for pod host networking & other host namespace handlingcdoern2022-06-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | this patch included additonal host namespace checks when creating a ctr as well as fixing of the tests to check /proc/self/ns/net see #14461 Signed-off-by: cdoern <cdoern@redhat.com>
* | | | Do not error on signalling a just-stopped containerMatthew Heon2022-06-09
| |_|/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previous PR #12394 tried to address this, but made a mistake: containers that have just exited do not move to the Exited state but rather the Stopped state - as such, the code would never have run (there is no way we start `podman kill`, and the container transitions to Exited while we are doing it - that requires holding the container lock, which Kill already does). Fix the code to check Stopped as well (we omit Exited entirely but it's a cheap check and our state logic could change in the future). Also, return an error, instead of exiting cleanly - the Kill failed, after all. ErrCtrStateInvalid is already handled by the sig-proxy logic so there won't be issues. [NO NEW TESTS NEEDED] This fixes a race that I cannot reproduce myself, and I have no idea how we'd repro in CI. Signed-off-by: Matthew Heon <mheon@redhat.com>
* | | Merge pull request #14220 from Luap99/resolvconfOpenShift Merge Robot2022-06-07
|\ \ \ | | | | | | | | use resolvconf package from c/common/libnetwork
| * | | use resolvconf package from c/common/libnetworkPaul Holzinger2022-06-07
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Podman and Buildah should use the same code the generate the resolv.conf file. This mostly moved the podman code into c/common and created a better API for it so buildah can use it as well. [NO NEW TESTS NEEDED] All existing tests should continue to pass. Fixes #13599 (There is no way to test this in CI without breaking the hosts resolv.conf) Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | Merge pull request #14483 from ↵OpenShift Merge Robot2022-06-07
|\ \ \ | | | | | | | | | | | | | | | | jakecorrenti/restart-privelaged-containers-after-host-device-change Privileged containers can now restart if the host devices change
| * | | Privileged containers can now restart if the host devices changeJake Correnti2022-06-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a privileged container is running, stops, and the devices on the host change, such as a USB device is unplugged, then a container would no longer start. Previously, the devices from the host were only being added to the container once: when the container was created. Now, this happens every time the container starts. I did this by adding a boolean to the container config that indicates whether to mount all of the devices or not, which can be set via an option. During spec generation, if the `MountAllDevices` option is set in the container config, all host devices are added to the container. Additionally, a couple of functions from `pkg/specgen/generate/config_linux.go` were moved into `pkg/util/utils_linux.go` as they were needed in multiple packages. Closes #13899 Signed-off-by: Jake Correnti <jcorrenti13@gmail.com>
* | | | libpod: store network status when userns is usedPaul Holzinger2022-06-07
| |/ / |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a container with a userns is created the network setup is special. Normally the netns is setup before the oci runtime container is created, however with a userns the container is created first and then the network is setup. In the second case we never saved the container state afterwards. Because of it, podman inspect would not show the network info and network teardown will not happen. This worked with local podman because there was a save() call later in the code path which then also saved the network status. But in the podman API code path this save never happened thus all containers started via API had this problem. Fixes #14465 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | Merge pull request #14499 from giuseppe/make-error-clearerOpenShift Merge Robot2022-06-07
|\ \ \ | | | | | | | | runtime: make error clearer
| * | | runtime: make error clearerGiuseppe Scrivano2022-06-06
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | make the error clearer and state that images created by other tools might not be visible to Podman when it overrides the graph driver. Closes: https://github.com/containers/podman/issues/13970 [NO NEW TESTS NEEDED] Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | | | overlay-volumes: add support for non-volatile upperdir,workdir for anonymous ↵Aditya R2022-06-06
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | volumes Similar feature was added for named overlay volumes here: https://github.com/containers/podman/pull/12712 Following PR just mimics similar feature for anonymous volumes. Often users want their anonymous overlayed volumes to be `non-volatile` in nature that means that same `upper` dir can be re-used by one or more containers but overall of nature of volumes still have to be overlay so work done is still on a overlay not on the actual volume. Following PR adds support for more advanced options i.e custom `workdir` and `upperdir` for overlayed volumes. So that users can re-use `workdir` and `upperdir` across new containers as well. Usage ```console podman run -it -v /some/path:/data:O,upperdir=/path/persistant/upper,workdir=/path/persistant/work alpine sh ``` Signed-off-by: Aditya R <arajan@redhat.com>
* | | Merge pull request #14477 from Luap99/partial-logsOpenShift Merge Robot2022-06-03
|\ \ \ | | | | | | | | podman logs k8s-file: do not reassemble partial log lines
| * | | podman logs k8s-file: do not reassemble partial log linesPaul Holzinger2022-06-03
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The backend should not convert partial lines to full log lines. While this works for most cases it cannot work when the last line is partial since it will just be lost. The frontend logic can already display partial lines correctly. The journald driver also works correctly since it does not such conversion. Fixes #14458 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | Merge pull request #14466 from mheon/fix_9075OpenShift Merge Robot2022-06-03
|\ \ \ | |/ / |/| | Improve robustness of `podman system reset`
| * | Improve robustness of `podman system reset`Matthew Heon2022-06-03
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Firstly, reset is now managed by the runtime itself as a part of initialization. This ensures that it can be used even with runtimes that would otherwise fail to be created - most notably, when the user has changed a core path (runroot/root/tmpdir/staticdir). Secondly, we now attempt a best-effort removal even if the store completely fails to be configured. Third, we now hold the alive lock for the entire reset operation. This ensures that no other Podman process can start while we are running a system reset, and removes any possibility of a race where a user tries to create containers or pull images while we are trying to perform a reset. [NO NEW TESTS NEEDED] we do not test reset last I checked. Fixes #9075 Signed-off-by: Matthew Heon <mheon@redhat.com>
* | Merge pull request #14384 from mheon/move_attachOpenShift Merge Robot2022-06-02
|\ \ | | | | | | Move Attach under the OCI Runtime interface
| * | Move Attach under the OCI Runtime interfaceMatthew Heon2022-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | With conmon-rs on the horizon, we need to disentangle Libpod from legacy Conmon to the greatest extent possible. There are definitely opportunities for codesharing between the two, but we have to assume the implementations will be largely disjoint given the different architectures. Fortunately, most of the work has already been done in the past. The conmon-managed OCI runtime mostly sits behind an interface, with a few exceptions - the most notable of those being attach. This PR thus moves Attach behind the interface, to ensure that we can have attach implementations that don't use our existing unix socket streaming if necessary. Still to-do is conmon cleanup. There's a lot of code that removes Conmon-specific files, or kills the Conmon PID, and all of it will need to be refactored behind the interface. [NO NEW TESTS NEEDED] Just moving some things around. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* | | Merge pull request #14395 from vrothberg/healthcheck-fixOpenShift Merge Robot2022-06-02
|\ \ \ | | | | | | | | healthcheck: wait for systemd operations
| * | | healthcheck: wait for systemd operationsValentin Rothberg2022-05-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make sure to wait for the systemd operations to finish when starting/stopping healtcheck timers and services. Also make sure to stop the timer before the service to avoid a race with the timer. [NO NEW TESTS NEEDED] since it is a non-functional change and existing tests are expected to pass. Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
* | | | Merge pull request #14421 from Luap99/statsOpenShift Merge Robot2022-06-02
|\ \ \ \ | |_|_|/ |/| | | podman stats: work with network connect/disconnect
| * | | podman stats: work with network connect/disconnectPaul Holzinger2022-05-31
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Hardcoding the interface name is a bad idea. We have no control over the actual interface name since the user can change it. The correct thing is to read them from the network status. Since the contianer can have more than one interface we have to add the RX/TX values. The other values are currently not used. For podman 5.0 we should change it so that the API can return the statistics per interface and the client should sum the TX/RX for the command output. This is what docker is doing. Fixes #13824 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | | fix podman container restore without CreateNetNSPaul Holzinger2022-05-31
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a container does not use the default podman netns, for example --network none or --network ns:/path a restore would fail because the specgen check validates that c.config.StaticMAC is nil but the unmarshaller sets it to an empty slice. While we could make the check use len() > 0 I feel like it is more common to check with != nil for ip and mac addresses. Adding omitempty tag makes the json marshal/unmarshal work correctly. This should not cause any issues. Fixes #14389 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | | Merge pull request #14383 from jwhonce/wip/info_todoOpenShift Merge Robot2022-05-27
|\ \ \ | | | | | | | | Add Authorization field to Plugins for Info
| * | | Refactor populating uptimeJhon Honce2022-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Refactor populating uptime field to use standard library parsing and math for populating the hour, minute, seconds fields. Note: the go-humanize package does not cover time.Duration just time.time. ```release-note NONE ``` [NO NEW TESTS NEEDED] Signed-off-by: Jhon Honce <jhonce@redhat.com>
| * | | Add Authorixation field to Plugins for InfoJhon Honce2022-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The Authorization field lists the plugins for granting access to the Docker daemon. This field will always be nil for Podman as there is no daemon. The field is included for compatibility. ```release-note NONE ``` [NO NEW TESTS NEEDED] Signed-off-by: Jhon Honce <jhonce@redhat.com>
* | | | Add API support for NoOverwriteDirNonDirJhon Honce2022-05-26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Update method signatures and structs to pass option to buildah code ```release-note NONE ``` [NO NEW TESTS NEEDED] Signed-off-by: Jhon Honce <jhonce@redhat.com>
* | | | Fix swagger model of `InspectPodResponse`Jakob Ahrer2022-05-26
|/ / / | | | | | | | | | | | | | | | | | | | | | | | | `net.IP` gets marshalled as `string` and not `[]uint8` [NO TESTS NEEDED] [NO NEW TESTS NEEDED] Signed-off-by: Jakob Ahrer <jakob@ahrer.dev>
* | | Merge pull request #14369 from mheon/fixmes_2OpenShift Merge Robot2022-05-26
|\ \ \ | | | | | | | | Remove more FIXMEs
| * | | Remove more FIXMEsMatthew Heon2022-05-25
| | |/ | |/| | | | | | | | | | | | | | | | | | | | | | Mostly, just removing the comments. These either have been done, or are no longer a good idea. No code changes. [NO NEW TESTS NEEDED] as such. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* / | First batch of resolutions to FIXMEsMatthew Heon2022-05-25
|/ / | | | | | | | | | | | | | | | | Most of these are no longer relevant, just drop the comments. Most notable change: allow `podman kill` on paused containers. Works just fine when I test it. Signed-off-by: Matthew Heon <mheon@redhat.com>