aboutsummaryrefslogtreecommitdiff
path: root/libpod/state.go
Commit message (Collapse)AuthorAge
* libpod: fix wait and exit-code logicValentin Rothberg2022-06-23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This commit addresses three intertwined bugs to fix an issue when using Gitlab runner on Podman. The three bug fixes are not split into separate commits as tests won't pass otherwise; avoidable noise when bisecting future issues. 1) Podman conflated states: even when asking to wait for the `exited` state, Podman returned as soon as a container transitioned to `stopped`. The issues surfaced in Gitlab tests to fail [1] as `conmon`'s buffers have not (yet) been emptied when attaching to a container right after a wait. The race window was extremely narrow, and I only managed to reproduce with the Gitlab runner [1] unit tests. 2) The clearer separation between `exited` and `stopped` revealed a race condition predating the changes. If a container is configured for autoremoval (e.g., via `run --rm`), the "run" process competes with the "cleanup" process running in the background. The window of the race condition was sufficiently large that the "cleanup" process has already removed the container and storage before the "run" process could read the exit code and hence waited indefinitely. Address the exit-code race condition by recording exit codes in the main libpod database. Exit codes can now be read from a database. When waiting for a container to exit, Podman first waits for the container to transition to `exited` and will then query the database for its exit code. Outdated exit codes are pruned during cleanup (i.e., non-performance critical) and when refreshing the database after a reboot. An exit code is considered outdated when it is older than 5 minutes. While the race condition predates this change, the waiting process has apparently always been fast enough in catching the exit code due to issue 1): `exited` and `stopped` were conflated. The waiting process hence caught the exit code after the container transitioned to `stopped` but before it `exited` and got removed. 3) With 1) and 2), Podman is now waiting for a container to properly transition to the `exited` state. Some tests did not pass after 1) and 2) which revealed the third bug: `conmon` was executed with its working directory pointing to the OCI runtime bundle of the container. The changed working directory broke resolving relative paths in the "cleanup" process. The "cleanup" process error'ed before actually cleaning up the container and waiting "main" process ran indefinitely - or until hitting a timeout. Fix the issue by executing `conmon` with the same working directory as Podman. Note that fixing 3) *may* address a number of issues we have seen in the past where for *some* reason cleanup processes did not fire. [1] https://gitlab.com/gitlab-org/gitlab-runner/-/issues/27119#note_970712864 Signed-off-by: Valentin Rothberg <vrothberg@redhat.com> [MH: Minor reword of commit message] Signed-off-by: Matthew Heon <mheon@redhat.com>
* use libnetwork from c/commonPaul Holzinger2022-01-12
| | | | | | | | The libpod/network packages were moved to c/common so that buildah can use it as well. To prevent duplication use it in podman as well and remove it from here. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* network db: add new strucutre to container createPaul Holzinger2021-12-14
| | | | | | | | | | Make sure we create new containers in the db with the correct structure. Also remove some unneeded code for alias handling. We no longer need this functions. The specgen format has not been changed for now. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* network db rewrite: migrate existing settingsPaul Holzinger2021-12-14
| | | | | | | | | | | | | | The new network db structure stores everything in the networks bucket. Previously some network settings were not written the the network bucket and only stored in the container config. Instead of the old format which used the container ID as value in the networks buckets we now use the PerNetworkoptions struct there. To migrate existing users we use the state.GetNetworks() function. If it fails to read the new format it will automatically migrate the old config format to the new one. This is allows a flawless migration path. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Rewrite Rename backend in a more atomic fashionMatthew Heon2021-03-02
| | | | | | | | | | | | | Move the core of renaming logic into the DB. This guarantees a lot more atomicity than we have right now (our current solution, removing the container from the DB and re-creating it, is *VERY* not atomic and prone to leaving a corrupted state behind if things go wrong. Moving things into the DB allows us to remove most, but not all, of this - there's still a potential scenario where the c/storage rename fails but the Podman rename succeeds, and we end up with a mismatched state. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add support for network connect / disconnect to DBMatthew Heon2020-11-11
| | | | | | | | | | | | | | | | | | | | | | | | Convert the existing network aliases set/remove code to network connect and disconnect. We can no longer modify aliases for an existing network, but we can add and remove entire networks. As part of this, we need to add a new function to retrieve current aliases the container is connected to (we had a table for this as of the first aliases PR, but it was not externally exposed). At the same time, remove all deconflicting logic for aliases. Docker does absolutely no checks of this nature, and allows two containers to have the same aliases, aliases that conflict with container names, etc - it's just left to DNS to return all the IP addresses, and presumably we round-robin from there? Most tests for the existing code had to be removed because of this. Convert all uses of the old container config.Networks field, which previously included all networks in the container, to use the new DB table. This ensures we actually get an up-to-date list of in-use networks. Also, add network aliases to the output of `podman inspect`. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add tests for network aliasesMatthew Heon2020-11-03
| | | | | | | | | | | | As part of this, we need two new functions, for retrieving all aliases for a network and removing all aliases for a network, both required to test. Also, rework handling for some things the tests discovered were broken (notably conflicts between container name and existing aliases). Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add a way to retrieve all network aliases for a ctrMatthew Heon2020-10-27
| | | | | | | | | The original interface only allowed retrieving aliases for a specific network, not for all networks. This will allow aliases to be retrieved for every network the container is present in, in a single DB operation. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add network aliases for containers to DBMatthew Heon2020-10-27
| | | | | | | | | | | | | | | | | | | | | | | This adds the database backend for network aliases. Aliases are additional names for a container that are used with the CNI dnsname plugin - the container will be accessible by these names in addition to its name. Aliases are allowed to change over time as the container connects to and disconnects from networks. Aliases are implemented as another bucket in the database to register all aliases, plus two buckets for each container (one to hold connected CNI networks, a second to hold its aliases). The aliases are only unique per-network, to the global and per-container aliases buckets have a sub-bucket for each CNI network that has aliases, and the aliases are stored within that sub-bucket. Aliases are formatted as alias (key) to container ID (value) in both cases. Three DB functions are defined for aliases: retrieving current aliases for a given network, setting aliases for a given network, and removing all aliases for a given network. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Unconditionally retrieve pod names via APIMatthew Heon2020-08-10
| | | | | | | | | | | | | | | | | | The ListContainers API previously had a Pod parameter, which determined if pod name was returned (but, notably, not Pod ID, which was returned unconditionally). This was fairly confusing, so we decided to deprecate/remove the parameter and return it unconditionally. To do this without serious performance implications, we need to avoid expensive JSON decodes of pod configuration in the DB. The way our Bolt tables are structured, retrieving name given ID is actually quite cheap, but we did not expose this via the Libpod API. Add a new GetName API to do this. Fixes #7214 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add support for containers.confDaniel J Walsh2020-03-27
| | | | | | | vendor in c/common config pkg for containers.conf Signed-off-by: Qi Wang qiwan@redhat.com Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Add structure for new exec session tracking to DBMatthew Heon2020-03-18
| | | | | | | | | | | | | | | | | | | | | | | As part of the rework of exec sessions, we need to address them independently of containers. In the new API, we need to be able to fetch them by their ID, regardless of what container they are associated with. Unfortunately, our existing exec sessions are tied to individual containers; there's no way to tell what container a session belongs to and retrieve it without getting every exec session for every container. This adds a pointer to the container an exec session is associated with to the database. The sessions themselves are still stored in the container. Exec-related APIs have been restructured to work with the new database representation. The originally monolithic API has been split into a number of smaller calls to allow more fine-grained control of lifecycle. Support for legacy exec sessions has been retained, but in a deprecated fashion; we should remove this in a few releases. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* add libpod/configValentin Rothberg2019-10-31
| | | | | | | | | | | | Refactor the `RuntimeConfig` along with related code from libpod into libpod/config. Note that this is a first step of consolidating code into more coherent packages to make the code more maintainable and less prone to regressions on the long runs. Some libpod definitions were moved to `libpod/define` to resolve circular dependencies. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* rm: add containers eviction with `rm --force`Marco Vedovati2019-09-25
| | | | | | | | | Add ability to evict a container when it becomes unusable. This may happen when the host setup changes after a container creation, making it impossible for that container to be used or removed. Evicting a container is done using the `rm --force` command. Signed-off-by: Marco Vedovati <mvedovati@suse.com>
* Add function for looking up volumes by partial nameMatthew Heon2019-09-09
| | | | | | | | | | This isn't included in Docker, but seems handy enough. Use the new API for 'volume rm' and 'volume inspect'. Fixes #3891 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add volume stateMatthew Heon2019-09-05
| | | | | | | | | | | | We need to be able to track the number of times a volume has been mounted for tmpfs/nfs/etc volumes. As such, we need a mutable state for volumes. Add one, with the expected update/save methods in both states. There is backwards compat here, in that older volumes without a state will still be accepted. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Re-add locks to volumes.Matthew Heon2019-08-28
| | | | | | | | | | This will require a 'podman system renumber' after being applied to get lock numbers for existing volumes. Add the DB backend code for rewriting volume configs and use it for updating lock numbers as part of 'system renumber'. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Switch Libpod over to new explicit named volumesMatthew Heon2019-04-04
| | | | | | | | | | | | | This swaps the previous handling (parse all volume mounts on the container and look for ones that might refer to named volumes) for the new, explicit named volume lists stored per-container. It also deprecates force-removing volumes that are in use. I don't know how we want to handle this yet, but leaving containers that depend on a volume that no longer exists is definitely not correct. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Validate VolumePath against DB configurationMatthew Heon2019-02-26
| | | | | | | | | If this doesn't match, we end up not being able to access named volumes mounted into containers, which is bad. Use the same validation that we use for other critical paths to ensure this one also matches. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add ability to rewrite pod configs in the databaseMatthew Heon2019-02-21
| | | | | | Necessary for rewriting lock IDs as part of renumber. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add a function for overwriting container configMatthew Heon2019-02-21
| | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add "podman volume" commandumohnani82018-12-06
| | | | | | | | | | | | | | | Add support for podman volume and its subcommands. The commands supported are: podman volume create podman volume inspect podman volume ls podman volume rm podman volume prune This is a tool to manage volumes used by podman. For now it only handle named volumes, but eventually it will handle all volumes used by podman. Signed-off-by: umohnani8 <umohnani@redhat.com>
* Fix typoMatthew Heon2018-12-03
| | | | Signed-off-by: Matthew Heon <mheon@redhat.com>
* Fix gofmt and lintMatthew Heon2018-12-02
| | | | Signed-off-by: Matthew Heon <mheon@redhat.com>
* Make DB config validation an explicit stepMatthew Heon2018-12-02
| | | | | | | | | | Previously, we implicitly validated runtime configuration against what was stored in the database as part of database init. Make this an explicit step, so we can call it after the database has been initialized. This will allow us to retrieve paths from the database and use them to overwrite our defaults if they differ. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Add ability to retrieve runtime configuration from DBMatthew Heon2018-12-02
| | | | | | | | | | When we create a Libpod database, we store a number of runtime configuration fields in it. If we can retrieve those, we can use them to configure the runtime to match the DB instead of inbuilt defaults, helping to ensure that we don't error in cases where our compiled-in defaults changed. Signed-off-by: Matthew Heon <mheon@redhat.com>
* Do not fetch pod and ctr State on retrieval in BoltMatthew Heon2018-07-31
| | | | | | | | | | | | | | | | | It's not necessary to fill in state immediately, as we'll be overwriting it on any API call accessing it thanks to syncContainer(). It is also causing races when we fetch it without holding the container lock (which syncContainer() does). As such, just don't retrieve the state on initial pull from the database with Bolt. Also, refactor some Linux-specific netns handling functions out of container_internal_linux.go into boltdb_linux.go. Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #1186 Approved by: rhatdan
* Update documentation for the State interfaceMatthew Heon2018-07-24
| | | | | | | Include details on how namespaces interact with the state. Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
* Add constraint that dependencies must be in the same nsMatthew Heon2018-07-24
| | | | | | | Dependency containers must be in the same namespace, to ensure there are never problems resolving a dependency. Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
* Spell check strings and commentsJhon Honce2018-05-25
| | | | | | | Signed-off-by: Jhon Honce <jhonce@redhat.com> Closes: #831 Approved by: rhatdan
* Add pod stateMatthew Heon2018-05-17
| | | | | | | | | | Add a mutable state to pods, and database backend sutable for modifying and updating said state. Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #784 Approved by: rhatdan
* Containers in a pod can only join namespaces in that podMatthew Heon2018-02-12
| | | | | | | | | | | | | | This solves some dependency problems in the state, and makes sense from a design standpoint. Containers not in a pod can still depend on the namespaces of containers joined to a pod, which we might also want to change in the future. Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #184 Approved by: baude
* Add pod removal codeMatthew Heon2018-02-09
| | | | | | | Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #268 Approved by: rhatdan
* Tear out pod containers map. Instead rely on stateMatthew Heon2018-02-09
| | | | | | | | | | This ensures that there is only one canonical place where containers in a pod are stored, in the state itself. Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #268 Approved by: rhatdan
* Change handling for pods in libpod stateMatthew Heon2018-01-17
| | | | | | | | | | Add new functions to update pods and add/remove containers from them Use these new functions in place of manually modifying pods Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #229 Approved by: rhatdan
* Add ability to retrieve a pod's container from the stateMatthew Heon2018-01-17
| | | | | | | Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #229 Approved by: rhatdan
* Add ability for states to track container dependenciesMatthew Heon2018-01-16
| | | | | | | | | | | Also prevent containers with dependencies from being removed from in memory states. SQLite already enforced this via FOREIGN KEY constraints. Signed-off-by: Matthew Heon <matthew.heon@gmail.com> Closes: #220 Approved by: rhatdan
* Add ability to refresh state in DBMatthew Heon2017-12-07
| | | | | | | Also, ensure we always recreate runtime spec so our net namespace paths will be correct Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
* Wire SQL backed state into rest of libpodMatthew Heon2017-11-18
| | | | Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
* Compile-tested implementation of SQL-backed stateMatthew Heon2017-11-18
| | | | Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
* Initial checkin from CRI-O repoMatthew Heon2017-11-01
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>