summaryrefslogtreecommitdiff
path: root/libpod/define/errors.go
Commit message (Collapse)AuthorAge
* libpod: fix wait and exit-code logicValentin Rothberg2022-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit addresses three intertwined bugs to fix an issue when using Gitlab runner on Podman. The three bug fixes are not split into separate commits as tests won't pass otherwise; avoidable noise when bisecting future issues. 1) Podman conflated states: even when asking to wait for the `exited` state, Podman returned as soon as a container transitioned to `stopped`. The issues surfaced in Gitlab tests to fail [1] as `conmon`'s buffers have not (yet) been emptied when attaching to a container right after a wait. The race window was extremely narrow, and I only managed to reproduce with the Gitlab runner [1] unit tests. 2) The clearer separation between `exited` and `stopped` revealed a race condition predating the changes. If a container is configured for autoremoval (e.g., via `run --rm`), the "run" process competes with the "cleanup" process running in the background. The window of the race condition was sufficiently large that the "cleanup" process has already removed the container and storage before the "run" process could read the exit code and hence waited indefinitely. Address the exit-code race condition by recording exit codes in the main libpod database. Exit codes can now be read from a database. When waiting for a container to exit, Podman first waits for the container to transition to `exited` and will then query the database for its exit code. Outdated exit codes are pruned during cleanup (i.e., non-performance critical) and when refreshing the database after a reboot. An exit code is considered outdated when it is older than 5 minutes. While the race condition predates this change, the waiting process has apparently always been fast enough in catching the exit code due to issue 1): `exited` and `stopped` were conflated. The waiting process hence caught the exit code after the container transitioned to `stopped` but before it `exited` and got removed. 3) With 1) and 2), Podman is now waiting for a container to properly transition to the `exited` state. Some tests did not pass after 1) and 2) which revealed the third bug: `conmon` was executed with its working directory pointing to the OCI runtime bundle of the container. The changed working directory broke resolving relative paths in the "cleanup" process. The "cleanup" process error'ed before actually cleaning up the container and waiting "main" process ran indefinitely - or until hitting a timeout. Fix the issue by executing `conmon` with the same working directory as Podman. Note that fixing 3) *may* address a number of issues we have seen in the past where for *some* reason cleanup processes did not fire. [1] https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27119#note_970712864 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com> [MH: Minor reword of commit message] Signed-off-by: Matthew Heon <mheon@redhat.com>
* Standardize on capatalized CgroupsDaniel J Walsh2022-01-14
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* use libnetwork from c/commonPaul Holzinger2022-01-12
| | | | | | | | The libpod/network packages were moved to c/common so that buildah can use it as well. To prevent duplication use it in podman as well and remove it from here. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Improve OCI Runtime errorDaniel J Walsh2021-05-22
| | | | | | | | | | | | | ErrOCIRuntimeNotFound error is misleading. Try to make it more understandable to the user that the OCI Runtime IE crun or runc is not missing, but the command they attempted to run within the container is missing. [NO TESTS NEEDED] Regular tests should handle this. Fixes: https://github.com/containers/podman/issues/10432 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* podman network reload add rootless supportPaul Holzinger2021-05-17
| | | | | | | | | | Allow podman network reload to be run as rootless user. While it is unlikely that the iptable rules are flushed inside the rootless cni namespace, it could still happen. Also fix podman network reload --all to ignore errors when a container does not have the bridge network mode, e.g. slirp4netns. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* migrate Podman to containers/common/libimageValentin Rothberg2021-05-05
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Migrate the Podman code base over to `common/libimage` which replaces `libpod/image` and a lot of glue code entirely. Note that I tried to leave bread crumbs for changed tests. Miscellaneous changes: * Some errors yield different messages which required to alter some tests. * I fixed some pre-existing issues in the code. Others were marked as `//TODO`s to prevent the PR from exploding. * The `NamesHistory` of an image is returned as is from the storage. Previously, we did some filtering which I think is undesirable. Instead we should return the data as stored in the storage. * Touched handlers use the ABI interfaces where possible. * Local image resolution: previously Podman would match "foo" on "myfoo". This behaviour has been changed and Podman will now only match on repository boundaries such that "foo" would match "my/foo" but not "myfoo". I consider the old behaviour to be a bug, at the very least an exotic corner case. * Futhermore, "foo:none" does *not* resolve to a local image "foo" without tag anymore. It's a hill I am (almost) willing to die on. * `image prune` prints the IDs of pruned images. Previously, in some cases, the names were printed instead. The API clearly states ID, so we should stick to it. * Compat endpoint image removal with _force_ deletes the entire not only the specified tag. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* runtime: return findConmon to libpodPeter Hunt2021-04-16
| | | | | | | | I believe moving the conmon probing code to c/common wasn't the best strategy. Different container engines have different requrements of which conmon version is required (based on what flags they use). Signed-off-by: Peter Hunt <pehunt@redhat.com>
* Add --requires flag to podman run/createMatthew Heon2021-04-06
| | | | | | | | | | | | | | | | | | | | Podman has, for a long time, had an internal concept of dependency management, used mainly to ensure that pod infra containers are started before any other container in the pod. We also have the ability to recursively start these dependencies, which we use to ensure that `podman start` on a container in a pod will not fail because the infra container is stopped. We have not, however, exposed these via the command line until now. Add a `--requires` flag to `podman run` and `podman create` to allow users to manually specify dependency containers. These containers must be running before the container will start. Also, make recursive starting with `podman start` default so we can start these containers and their dependencies easily. Fixes #9250 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Fix per review requestMatej Vasek2021-02-04
| | | | Signed-off-by: Matej Vasek <mvasek@redhat.com>
* Initial implementation of volume pluginsMatthew Heon2021-01-14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This implements support for mounting and unmounting volumes backed by volume plugins. Support for actually retrieving plugins requires a pull request to land in containers.conf and then that to be vendored, and as such is not yet ready. Given this, this code is only compile tested. However, the code for everything past retrieving the plugin has been written - there is support for creating, removing, mounting, and unmounting volumes, which should allow full functionality once the c/common PR is merged. A major change is the signature of the MountPoint function for volumes, which now, by necessity, returns an error. Named volumes managed by a plugin do not have a mountpoint we control; instead, it is managed entirely by the plugin. As such, we need to cache the path in the DB, and calls to retrieve it now need to access the DB (and may fail as such). Notably absent is support for SELinux relabelling and chowning these volumes. Given that we don't manage the mountpoint for these volumes, I am extremely reluctant to try and modify it - we could easily break the plugin trying to chown or relabel it. Also, we had no less than *5* separate implementations of inspecting a volume floating around in pkg/infra/abi and pkg/api/handlers/libpod. And none of them used volume.Inspect(), the only correct way of inspecting volumes. Remove them all and consolidate to using the correct way. Compat API is likely still doing things the wrong way, but that is an issue for another day. Fixes #4304 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Expose security attribute errors with their own messagesJuan Antonio Osorio Robles2021-01-12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This creates error objects for runtime errors that might come from the runtime. Thus, indicating to users that the place to debug should be in the security attributes of the container. When creating a container with a SELinux label that doesn't exist, we get a fairly cryptic error message: ``` $ podman run --security-opt label=type:my_container.process -it fedora bash Error: OCI runtime error: write file `/proc/thread-self/attr/exec`: Invalid argument ``` This instead handles any errors coming from LSM's `/proc` API and enhances the error message with a relevant indicator that it's related to the container's security attributes. A sample run looks as follows: ``` $ bin/podman run --security-opt label=type:my_container.process -it fedora bash Error: `/proc/thread-self/attr/exec`: OCI runtime error: unable to assign security attribute ``` With `debug` log level enabled it would be: ``` Error: write file `/proc/thread-self/attr/exec`: Invalid argument: OCI runtime error: unable to assign security attribute ``` Note that these errors wrap ErrOCIRuntime, so it's still possible to to compare these errors with `errors.Is/errors.As`. One advantage of this approach is that we could start handling these errors in a more efficient manner in the future. e.g. If a SELinux label doesn't exist (yet), we could retry until it becomes available. Signed-off-by: Juan Antonio Osorio Robles <jaosorior@redhat.com>
* add network connect|disconnect compat endpointsbaude2020-11-19
| | | | | | | | | | | this enables the ability to connect and disconnect a container from a given network. it is only for the compatibility layer. some code had to be refactored to avoid circular imports. additionally, tests are being deferred temporarily due to some incompatibility/bug in either docker-py or our stack. Signed-off-by: baude <bbaude@redhat.com>
* Add support for network connect / disconnect to DBMatthew Heon2020-11-11
| | | | | | | | | | | | | | | | | | | | | | | | Convert the existing network aliases set/remove code to network connect and disconnect. We can no longer modify aliases for an existing network, but we can add and remove entire networks. As part of this, we need to add a new function to retrieve current aliases the container is connected to (we had a table for this as of the first aliases PR, but it was not externally exposed). At the same time, remove all deconflicting logic for aliases. Docker does absolutely no checks of this nature, and allows two containers to have the same aliases, aliases that conflict with container names, etc - it's just left to DNS to return all the IP addresses, and presumably we round-robin from there? Most tests for the existing code had to be removed because of this. Convert all uses of the old container config.Networks field, which previously included all networks in the container, to use the new DB table. This ensures we actually get an up-to-date list of in-use networks. Also, add network aliases to the output of `podman inspect`. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Merge pull request #8156 from mheon/add_net_aliases_dbOpenShift Merge Robot2020-11-04
|\ | | | | Add network aliases for containers to DB
| * Add network aliases for containers to DBMatthew Heon2020-10-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds the database backend for network aliases. Aliases are additional names for a container that are used with the CNI dnsname plugin - the container will be accessible by these names in addition to its name. Aliases are allowed to change over time as the container connects to and disconnects from networks. Aliases are implemented as another bucket in the database to register all aliases, plus two buckets for each container (one to hold connected CNI networks, a second to hold its aliases). The aliases are only unique per-network, to the global and per-container aliases buckets have a sub-bucket for each CNI network that has aliases, and the aliases are stored within that sub-bucket. Aliases are formatted as alias (key) to container ID (value) in both cases. Three DB functions are defined for aliases: retrieving current aliases for a given network, setting aliases for a given network, and removing all aliases for a given network. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* | Merge pull request #8174 from rhatdan/errorsOpenShift Merge Robot2020-10-29
|\ \ | | | | | | Podman often reports OCI Runtime does not exist, even if it does
| * | Podman often reports OCI Runtime does not exist, even if it doesDaniel J Walsh2020-10-29
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When the OCI Runtime tries to set certain settings in cgroups it can get the error "no such file or directory", the wrapper ends up reporting a bogus error like: ``` Request Failed(Internal Server Error): open io.max: No such file or directory: OCI runtime command not found error {"cause":"OCI runtime command not found error","message":"open io.max: No such file or directory: OCI runtime command not found error","response":500} ``` On first reading of this, you would think the OCI Runtime (crun or runc) were not found. But the error is actually reporting message":"open io.max: No such file or directory Which is what we want the user to concentrate on. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* / NewFromLocal can return multiple imagesDaniel J Walsh2020-10-28
|/ | | | | | | | | | If you use additional stores and pull the same image into writable stores, you can end up with the situation where you have the same image twice. This causes image exists to return the wrong error. It should return true in this situation rather then an error. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Fix handling of remove of bogus volumes, networks and PodsDaniel J Walsh2020-09-29
| | | | | | | | | | | | In podman containers rm and podman images rm, the commands exit with error code 1 if the object does not exists. This PR implements similar functionality to volumes, networks, and Pods. Similarly if volumes or Networks are in use by other containers, and return exit code 2. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Fix up errors found by codespellDaniel J Walsh2020-09-11
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Show c/storage (Buildah/CRI-O) containers in psDaniel J Walsh2020-09-09
| | | | | | | | | | | | | | | | | | | | | | | | | | | | The `podman ps --all` command will now show containers that are under the control of other c/storage container systems and the new `ps --storage` option will show only containers that are in c/storage but are not controlled by libpod. In the below examples, the '*working-container' entries were created by Buildah. ``` podman ps -a CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES 9257ef8c786c docker.io/library/busybox:latest ls /etc 8 hours ago Exited (0) 8 hours ago gifted_jang d302c81856da docker.io/library/busybox:latest buildah 30 hours ago storage busybox-working-container 7a5a7b099d33 localhost/tom:latest ls -alF 30 hours ago Exited (0) 30 hours ago hopeful_hellman 01d601fca090 localhost/tom:latest ls -alf 30 hours ago Exited (1) 30 hours ago determined_panini ee58f429ff26 localhost/tom:latest buildah 33 hours ago storage alpine-working-container podman ps --external CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES d302c81856da docker.io/library/busybox:latest buildah 30 hours ago external busybox-working-container ee58f429ff26 localhost/tom:latest buildah 33 hours ago external alpine-working-container ``` Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com> Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* error when adding container to pod with network informationBrent Baude2020-08-21
| | | | | | | | | | | | because a pod's network information is dictated by the infra container at creation, a container cannot be created with network attributes. this has been difficult for users to understand. we now return an error when a container is being created inside a pod and passes any of the following attributes: * static IP (v4 and v6) * static mac * ports -p (i.e. -p 8080:80) * exposed ports (i.e. 222-225) * publish ports from image -P Signed-off-by: Brent Baude <bbaude@redhat.com>
* API returns 500 in case network is not found instead of 404zhangguanzhang2020-08-02
| | | | Signed-off-by: zhangguanzhang <zhangguanzhang@qq.com>
* Ensure libpod/define does not include libpod/imageMatthew Heon2020-07-31
| | | | | | | | | | | | | The define package under Libpod is intended to be an extremely minimal package, including constants and very little else. However, as a result of some legacy code, it was dragging in all of libpod/image (and, less significantly, the util package). Fortunately, this was just to ensure that error constants were not duplicating, and there's nothing preventing us from importing in the other direction and keeping libpod/define free of dependencies. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Switch all references to github.com/containers/libpod -> podmanDaniel J Walsh2020-07-28
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* move go module to v2Valentin Rothberg2020-07-06
| | | | | | | | | | | | | | | With the advent of Podman 2.0.0 we crossed the magical barrier of go modules. While we were able to continue importing all packages inside of the project, the project could not be vendored anymore from the outside. Move the go module to new major version and change all imports to `github.com/containers/libpod/v2`. The renaming of the imports was done via `gomove` [1]. [1] https://github.com/KSubedi/gomove Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Print errors from individual containers in podsMatthew Heon2020-07-02
| | | | | | | | | | | | | | | | | | | | | | | | | | | The infra/abi code for pods was written in a flawed way, assuming that the map[string]error containing individual container errors was only set when the global error for the pod function was nil; that is not accurate, and we are actually *guaranteed* to set the global error when any individual container errors. Thus, we'd never actually include individual container errors, because the infra code assumed that err being set meant everything failed and no container operations were attempted. We were originally setting the cause of the error to something nonsensical ("container already exists"), so I made a new error indicating that some containers in the pod failed. We can then ignore that error when building the report on the pod operation and actually return errors from individual containers. Unfortunately, this exposed another weakness of the infra code, which was discarding the container IDs. Errors from individual containers are not guaranteed to identify which container they came from, hence the use of map[string]error in the Pod API functions. Rather than restructuring the structs we return from pkg/infra, I just wrapped the returned errors with a message including the ID of the container. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* podman untag: error if tag doesn't existValentin Rothberg2020-06-24
| | | | | | | | | | | Throw an error if a specified tag does not exist. Also make sure that the user input is normalized as we already do for `podman tag`. To prevent regressions, add a set of end-to-end and systemd tests. Last but not least, update the docs and add bash completions. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Do not share container log driver for execMatthew Heon2020-06-17
| | | | | | | | | | | | | | | | | | | | | When the container uses journald logging, we don't want to automatically use the same driver for its exec sessions. If we do we will pollute the journal (particularly in the case of healthchecks) with large amounts of undesired logs. Instead, force exec sessions logs to file for now; we can add a log-driver flag later (we'll probably want to add a `podman logs` command that reads exec session logs at the same time). As part of this, add support for the new 'none' logs driver in Conmon. It will be the default log driver for exec sessions, and can be optionally selected for containers. Great thanks to Joe Gooch (mrwizard@dok.org) for adding support to Conmon for a null log driver, and wiring it in here. Fixes #6555 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Ensure Conmon is alive before waiting for exit fileMatthew Heon2020-06-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This came out of a conversation with Valentin about systemd-managed Podman. He discovered that unit files did not properly handle cases where Conmon was dead - the ExecStopPost `podman rm --force` line was not actually removing the container, but interestingly, adding a `podman cleanup --rm` line would remove it. Both of these commands do the same thing (minus the `podman cleanup --rm` command not force-removing running containers). Without a running Conmon instance, the container process is still running (assuming you killed Conmon with SIGKILL and it had no chance to kill the container it managed), but you can still kill the container itself with `podman stop` - Conmon is not involved, only the OCI Runtime. (`podman rm --force` and `podman stop` use the same code to kill the container). The problem comes when we want to get the container's exit code - we expect Conmon to make us an exit file, which it's obviously not going to do, being dead. The first `podman rm` would fail because of this, but importantly, it would (after failing to retrieve the exit code correctly) set container status to Exited, so that the second `podman cleanup` process would succeed. To make sure the first `podman rm --force` succeeds, we need to catch the case where Conmon is already dead, and instead of waiting for an exit file that will never come, immediately set the Stopped state and remove an error that can be caught and handled. Signed-off-by: Matthew Heon <mheon@redhat.com>
* V2 Restore rmi testsJhon Honce2020-04-22
| | | | | | | * Introduced define.ErrImageInUse to assist in determining the exit code without resorting string searches. Signed-off-by: Jhon Honce <jhonce@redhat.com>
* Add structure for new exec session tracking to DBMatthew Heon2020-03-18
| | | | | | | | | | | | | | | | | | | | | | | As part of the rework of exec sessions, we need to address them independently of containers. In the new API, we need to be able to fetch them by their ID, regardless of what container they are associated with. Unfortunately, our existing exec sessions are tied to individual containers; there's no way to tell what container a session belongs to and retrieve it without getting every exec session for every container. This adds a pointer to the container an exec session is associated with to the database. The sessions themselves are still stored in the container. Exec-related APIs have been restructured to work with the new database representation. The originally monolithic API has been split into a number of smaller calls to allow more fine-grained control of lifecycle. Support for legacy exec sessions has been retained, but in a deprecated fashion; we should remove this in a few releases. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add basic deadlock detection for container start/removeMatthew Heon2020-02-24
| | | | | | | | | | | | | | | We can easily tell if we're going to deadlock by comparing lock IDs before actually taking the lock. Add a few checks for this in common places where deadlocks might occur. This does not yet cover pod operations, where detection is more difficult (and costly) due to the number of locks being involved being higher than 2. Also, add some error wrapping on the Podman side, so we can tell people to use `system renumber` when it occurs. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Ensure volumes can be removed when they fail to unmountMatthew Heon2019-10-14
| | | | | | | | | | | | | | | | | | | | Also, ensure that we don't try to mount them without root - it appears that it can somehow not error and report that mount was successful when it clearly did not succeed, which can induce this case. We reuse the `--force` flag to indicate that a volume should be removed even after unmount errors. It seems fairly natural to expect that --force will remove a volume that is otherwise presenting problems. Finally, ignore EINVAL on unmount - if the mount point no longer exists our job is done. Fixes: #4247 Fixes: #4248 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* rm: add containers eviction with `rm --force`Marco Vedovati2019-09-25
| | | | | | | | | Add ability to evict a container when it becomes unusable. This may happen when the host setup changes after a container creation, making it impossible for that container to be used or removed. Evicting a container is done using the `rm --force` command. Signed-off-by: Marco Vedovati <mvedovati@suse.com>
* Add support for launching containers without CGroupsMatthew Heon2019-09-10
| | | | | | | This is mostly used with Systemd, which really wants to manage CGroups itself when managing containers via unit file. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add conmon probe to runtime constructionPeter Hunt2019-08-08
| | | | | | Now, when a user's conmon is out of date, podman will tell them Signed-off-by: Peter Hunt <pehunt@redhat.com>
* Implement conmon execPeter Hunt2019-07-22
| | | | | | | | | | | | | | | | | | | | | | This includes: Implement exec -i and fix some typos in description of -i docs pass failed runtime status to caller Add resize handling for a terminal connection Customize exec systemd-cgroup slice fix healthcheck fix top add --detach-keys Implement podman-remote exec (jhonce) * Cleanup some orphaned code (jhonce) adapt remote exec for conmon exec (pehunt) Fix healthcheck and exec to match docs Introduce two new OCIRuntime errors to more comprehensively describe situations in which the runtime can error Use these different errors in branching for exit code in healthcheck and exec Set conmon to use new api version Signed-off-by: Jhon Honce <jhonce@redhat.com> Signed-off-by: Peter Hunt <pehunt@redhat.com>
* golangci-lint round #3baude2019-07-21
| | | | | | | this is the third round of preparing to use the golangci-lint on our code base. Signed-off-by: baude <bbaude@redhat.com>
* remove libpod from mainbaude2019-06-25
the compilation demands of having libpod in main is a burden for the remote client compilations. to combat this, we should move the use of libpod structs, vars, constants, and functions into the adapter code where it will only be compiled by the local client. this should result in cleaner code organization and smaller binaries. it should also help if we ever need to compile the remote client on non-Linux operating systems natively (not cross-compiled). Signed-off-by: baude <bbaude@redhat.com>