| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
| |
improve the heuristic to detect the scope that was created for the container.
This is necessary with systemd running as PID 1, since it moves itself
to a different sub-cgroup, thus stats would not account for other
processes in the same container.
Closes: https://github.com/containers/podman/issues/12400
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Honor custom `target` if specified while running or creating containers
with secret `type=mount`.
Example:
`podman run -it --secret token,type=mount,target=TOKEN ubi8/ubi:latest
bash`
Signed-off-by: Aditya Rajan <arajan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OCICNI port format has one big problem: It does not support ranges.
So if a users forwards a range of 1k ports with podman run -p 1001-2000
we have to store each of the thousand ports individually as array element.
This bloats the db and makes the JSON encoding and decoding much slower.
In many places we already use a better port struct type which supports
ranges, e.g. `pkg/specgen` or the new network interface.
Because of this we have to do many runtime conversions between the two
port formats. If everything uses the new format we can skip the runtime
conversions.
This commit adds logic to replace all occurrences of the old format
with the new one. The database will automatically migrate the ports
to new format when the container config is read for the first time
after the update.
The `ParsePortMapping` function is `pkg/specgen/generate` has been
reworked to better work with the new format. The new logic is able
to deduplicate the given ports. This is necessary the ensure we
store them efficiently in the DB. The new code should also be more
performant than the old one.
To prove that the code is fast enough I added go benchmarks. Parsing
1 million ports took less than 0.5 seconds on my laptop.
Benchmark normalize PortMappings in specgen:
Please note that the 1 million ports are actually 20x 50k ranges
because we cannot have bigger ranges than 65535 ports.
```
$ go test -bench=. -benchmem ./pkg/specgen/generate/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/pkg/specgen/generate
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkParsePortMappingNoPorts-12 480821532 2.230 ns/op 0 B/op 0 allocs/op
BenchmarkParsePortMapping1-12 38972 30183 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMapping100-12 18752 60688 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMapping1k-12 3104 331719 ns/op 223840 B/op 3018 allocs/op
BenchmarkParsePortMapping10k-12 376 3122930 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMapping1m-12 3 390869926 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingReverse100-12 18940 63414 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMappingReverse1k-12 3015 362500 ns/op 223841 B/op 3018 allocs/op
BenchmarkParsePortMappingReverse10k-12 343 3318135 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMappingReverse1m-12 3 403392469 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingRange1-12 37635 28756 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange100-12 39604 28935 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1k-12 38384 29921 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange10k-12 29479 40381 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1m-12 927 1279369 ns/op 143022 B/op 164 allocs/op
PASS
ok github.com/containers/podman/v3/pkg/specgen/generate 25.492s
```
Benchmark convert old port format to new one:
```
go test -bench=. -benchmem ./libpod/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/libpod
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
Benchmark_ocicniPortsToNetTypesPortsNoPorts-12 663526126 1.663 ns/op 0 B/op 0 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1-12 7858082 141.9 ns/op 72 B/op 2 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10-12 2065347 571.0 ns/op 536 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts100-12 138478 8641 ns/op 4216 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1k-12 9414 120964 ns/op 41080 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10k-12 781 1490526 ns/op 401528 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1m-12 4 250579010 ns/op 40001656 B/op 4 allocs/op
PASS
ok github.com/containers/podman/v3/libpod 11.727s
```
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
| |
To avoid creating an expensive deep copy, create an internal function to
access the exec session.
[NO TESTS NEEDED]
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Access the container's spec field directly inside of libpod instead of
calling Spec() which in turn creates expensive JSON deep copies.
Accessing the field directly drops memory consumption of a simple
podman run --rm busybox true from ~700kB to ~600kB.
[NO TESTS NEEDED]
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Update CNI so we can match wrapped errors. This should silence ENOENT
warnings when trying to read the cni conflist files.
Fixes #10926
Because CNI v1.0.0 contains breaking changes we have to change some
import paths. Also we cannot update the CNI version used for the
conflist files created by `podman network create` because this would
require at least containernetwork-plugins v1.0.1 and a updated dnsname
plugin. Because this will take a while until it lands in most distros
we should not use this version. So keep using v0.4.0 for now.
The update from checkpoint-restore/checkpointctl is also required to
make sure it no longer uses CNI to read the network status.
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
added support for pod devices. The device gets added to the infra container and
recreated in all containers that join the pod.
This required a new container config item to keep track of the original device passed in by the user before
the path was parsed into the container device.
Signed-off-by: cdoern <cdoern@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
We do not use the ocicni code anymore so let's get rid of it. Only the
port struct is used but we can copy this into libpod network types so
we can debloat the binary.
The next step is to remove the OCICNI port mapping form the container
config and use the better PortMapping struct everywhere.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make use of the new network interface in libpod.
This commit contains several breaking changes:
- podman network create only outputs the new network name and not file
path.
- podman network ls shows the network driver instead of the cni version
and plugins.
- podman network inspect outputs the new network struct and not the cni
conflist.
- The bindings and libpod api endpoints have been changed to use the new
network structure.
The container network status is stored in a new field in the state. The
status should be received with the new `c.getNetworkStatus`. This will
migrate the old status to the new format. Therefore old containers should
contine to work correctly in all cases even when network connect/
disconnect is used.
New features:
- podman network reload keeps the ip and mac for more than one network.
- podman container restore keeps the ip and mac for more than one
network.
- The network create compat endpoint can now use more than one ipam
config.
The man pages and the swagger doc are updated to reflect the latest
changes.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Kubernetes has a concept of init containers that run and exit before
the regular containers in a pod are started. We added init containers
to podman pods as well. This patch adds support for generating init
containers in the kube yaml when a pod we are converting had init
containers. When playing a kube yaml, it detects an init container
and creates such a container in podman accordingly.
Note, only init containers created with the init type set to "always"
will be generated as the "once" option deletes the init container after
it has run and exited. Play kube will always creates init containers
with the "always" init container type.
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
|
|
|
|
|
|
|
|
| |
When inspecting a container, we now report whether the container
was stopped by a `podman checkpoint` operation via a new bool in
the State portion of inspected, `Checkpointed`.
Signed-off-by: Matthew Heon <mheon@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This leverages conmon's ability to proxy the SD-NOTIFY socket.
This prevents locking caused by OCI runtime blocking, waiting for
SD-NOTIFY messages, and instead passes the messages directly up
to the host.
NOTE: Also re-enable the auto-update tests which has been disabled due
to flakiness. With this change, Podman properly integrates into
systemd.
Fixes: #7316
Signed-off-by: Joseph Gooch <mrwizard@dok.org>
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
| |
[NO TESTS NEEDED] Just fixing spelling.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add the --userns flag to podman pod create and keep
track of the userns setting that pod was created with
so that all containers created within the pod will inherit
that userns setting.
Specifically we need to be able to launch a pod with
--userns=keep-id
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Urvashi Mohnani <umohnani@redhat.com>
|
|
|
|
| |
Signed-off-by: flouthoc <flouthoc.git@gmail.com>
|
|
|
|
|
|
|
|
|
|
| |
when looking up the container cgroup, ignore named hierarchies since
containers running systemd as payload will create a sub-cgroup and
move themselves there.
Closes: https://github.com/containers/podman/issues/10602
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the containers.conf field "NetNS" is set to "Bridge" and the
"RootlessNetworking" field is set to "cni", Podman will now
handle rootless in the same way it does root - all containers
will be joined to a default CNI network, instead of exclusively
using slirp4netns.
If no CNI default network config is present for the user, one
will be auto-generated (this also works for root, but it won't be
nearly as common there since the package should already ship a
config).
I eventually hope to remove the "NetNS=Bridge" bit from
containers.conf, but let's get something in for Brent to work
with.
Signed-off-by: Matthew Heon <mheon@redhat.com>
|
|\
| |
| | |
Support uid,gid,mode options for secrets
|
| |
| |
| |
| |
| |
| |
| | |
Support UID, GID, Mode options for mount type secrets. Also, change
default secret permissions to 444 so all users can read secret.
Signed-off-by: Ashley Cui <acui@redhat.com>
|
|/
|
|
|
|
|
|
|
|
|
| |
This change adds the entry `host.containers.internal` to the `/etc/hosts`
file within a new containers filesystem. The ip address is determined by
the containers networking configuration and points to the gateway address
for the containers networking namespace.
Closes #5651
Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some packages used by the remote client imported the libpod package.
This is not wanted because it adds unnecessary bloat to the client and
also causes problems with platform specific code(linux only), see #9710.
The solution is to move the used functions/variables into extra packages
which do not import libpod.
This change shrinks the remote client size more than 6MB compared to the
current master.
[NO TESTS NEEDED]
I have no idea how to test this properly but with #9710 the cross
compile should fail.
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Traditionally, the path resolution for containers has been resolved on
the *host*; relative to the container's mount point or relative to
specified bind mounts or volumes.
While this works nicely for non-running containers, it poses a problem
for running ones. In that case, certain kinds of mounts (e.g., tmpfs)
will not resolve correctly. A tmpfs is held in memory and hence cannot
be resolved relatively to the container's mount point. A copy operation
will succeed but the data will not show up inside the container.
To support these kinds of mounts, we need to join the *running*
container's mount namespace (and PID namespace) when copying.
Note that this change implies moving the copy and stat logic into
`libpod` since we need to keep the container locked to avoid race
conditions. The immediate benefit is that all logic is now inside
`libpod`; the code isn't scattered anymore.
Further note that Docker does not support copying to tmpfs mounts.
Tests have been extended to cover *both* path resolutions for running
and created containers. New tests have been added to exercise the
tmpfs-mount case.
For the record: Some tests could be improved by using `start -a` instead
of a start-exec sequence. Unfortunately, `start -a` is flaky in the CI
which forced me to use the more expensive start-exec option.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
| |
Signed-off-by: Eduardo Vega <edvegavalerio@gmail.com>
|
|
|
|
|
|
|
|
|
| |
We missed bumping the go module, so let's do it now :)
* Automated go code with github.com/sirkon/go-imports-rename
* Manually via `vgrep podman/v2` the rest
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
| |
Use the whitespace linter and fix the reported problems.
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
|
|
|
|
|
|
|
|
|
|
|
| |
Implement podman secret create, inspect, ls, rm
Implement podman run/create --secret
Secrets are blobs of data that are sensitive.
Currently, the only secret driver supported is filedriver, which means creating a secret stores it in base64 unencrypted in a file.
After creating a secret, a user can use the --secret flag to expose the secret inside the container at /run/secrets/[secretname]
This secret will not be commited to an image on a podman commit
Signed-off-by: Ashley Cui <acui@redhat.com>
|
|
|
|
| |
Signed-off-by: Milivoje Legenovic <m.legenovic@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
if a single user is mapped in the user namespace, handle it as root.
It is needed for running unprivileged containers with a single user
available without being forced to run with euid and egid set to 0.
Needs: https://github.com/containers/storage/pull/794
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
| |
Before querying for a container's cgroup path, make sure that the
container is synced. Also make sure to error out if the container
isn't running.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
* Add a new `pkg/copy` to centralize all container-copy related code.
* The new code is based on Buildah's `copier` package.
* The compat `/archive` endpoints use the new `copy` package.
* Update docs and an several new tests.
* Includes many fixes, most notably, the look-up of volumes and mounts.
Breaking changes:
* Podman is now expecting that container-destination paths exist.
Before, Podman created the paths if needed. Docker does not do
that and I believe Podman should not either as it's a recipe for
masking errors. These errors may be user induced (e.g., a path
typo), or internal typos (e.g., when the destination may be a
mistakenly unmounted volume). Let's keep the magic low for such
a security sensitive feature.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes things a lot more clear - if we are actually joining a
CNI network, we are guaranteed to get a non-zero length list of
networks.
We do, however, need to know if the network we are joining is the
default network for inspecting containers as it determines how we
populate the response struct. To handle this, add a bool to
indicate that the network listed was the default network, and
only the default network.
Signed-off-by: Matthew Heon <mheon@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running on cgroups v1, `/proc/{PID}/cgroup` has multiple entries,
each pointing potentially to a different cgroup. Some may be empty,
some may point to parents.
The one we really need is the libpod-specific one, which always is the
longest path. So instead of looking at the first entry, look at all and
select the longest one.
Fixes: #8397
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
this enables the ability to connect and disconnect a container from a
given network. it is only for the compatibility layer. some code had to
be refactored to avoid circular imports.
additionally, tests are being deferred temporarily due to some
incompatibility/bug in either docker-py or our stack.
Signed-off-by: baude <bbaude@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
this enables the ability to connect and disconnect a container from a
given network. it is only for the compatibility layer. some code had to
be refactored to avoid circular imports.
additionally, tests are being deferred temporarily due to some
incompatibility/bug in either docker-py or our stack.
Signed-off-by: baude <bbaude@redhat.com>
|
|
|
|
|
|
|
|
|
| |
When looking up a container's cgroup path, parse /proc/[PID]/cgroup.
This will work across all cgroup managers and configurations and is
supported on cgroups v1 and v2.
Fixes: #8265
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert the existing network aliases set/remove code to network
connect and disconnect. We can no longer modify aliases for an
existing network, but we can add and remove entire networks. As
part of this, we need to add a new function to retrieve current
aliases the container is connected to (we had a table for this
as of the first aliases PR, but it was not externally exposed).
At the same time, remove all deconflicting logic for aliases.
Docker does absolutely no checks of this nature, and allows two
containers to have the same aliases, aliases that conflict with
container names, etc - it's just left to DNS to return all the
IP addresses, and presumably we round-robin from there? Most
tests for the existing code had to be removed because of this.
Convert all uses of the old container config.Networks field,
which previously included all networks in the container, to use
the new DB table. This ensures we actually get an up-to-date list
of in-use networks. Also, add network aliases to the output of
`podman inspect`.
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new "image" mount type to `--mount`. The source of the mount is
the name or ID of an image. The destination is the path inside the
container. Image mounts further support an optional `rw,readwrite`
parameter which if set to "true" will yield the mount writable inside
the container. Note that no changes are propagated to the image mount
on the host (which in any case is read only).
Mounts are overlay mounts. To support read-only overlay mounts, vendor
a non-release version of Buildah.
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we create a container, we assign a cgroup parent based on
the current cgroup manager in use. This parent is only usable
with the cgroup manager the container is created with, so if the
default cgroup manager is later changed or overridden, the
container will not be able to start.
To solve this, store the cgroup manager that created the
container in container configuration, so we can guarantee a
container with a systemd cgroup parent will always be started
with systemd cgroups.
Unfortunately, this is very difficult to test in CI, due to the
fact that we hard-code cgroup manager on all invocations of
Podman in CI.
Fixes #7830
Signed-off-by: Matthew Heon <matthew.heon@pm.me>
|
|
|
|
| |
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
| |
This commit handle the TODO task of breaking the Container
config into smaller sub-configs
Signed-off-by: ldelossa <ldelossa@redhat.com>
|
|
|
|
|
|
|
|
| |
--umask sets the umask inside the container
Defaults to 0022
Co-authored-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Ashley Cui <acui@redhat.com>
|
|
|
|
|
|
|
|
| |
Add support -v for overlay volume mounts in podman.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Signed-off-by: Qi Wang <qiwan@redhat.com>
|
|
|
|
|
|
| |
do not pass network specific options through the network namespace.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
|
| |
If I enter a continer with --userns keep-id, my UID will be present
inside of the container, but most likely my user will not be defined.
This patch will take information about the user and stick it into the
container.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
--sdnotify container|conmon|ignore
With "conmon", we send the MAINPID, and clear the NOTIFY_SOCKET so the OCI
runtime doesn't pass it into the container. We also advertise "ready" when the
OCI runtime finishes to advertise the service as ready.
With "container", we send the MAINPID, and leave the NOTIFY_SOCKET so the OCI
runtime passes it into the container for initialization, and let the container advertise further metadata.
This is the default, which is closest to the behavior podman has done in the past.
The "ignore" option removes NOTIFY_SOCKET from the environment, so neither podman nor
any child processes will talk to systemd.
This removes the need for hardcoded CID and PID files in the command line, and
the PIDFile directive, as the pid is advertised directly through sd-notify.
Signed-off-by: Joseph Gooch <mrwizard@dok.org>
|
|\
| |
| | |
Add --tz flag to create, run
|
| |
| |
| |
| |
| |
| |
| | |
--tz flag sets timezone inside container
Can be set to IANA timezone as well as `local` to match host machine
Signed-off-by: Ashley Cui <acui@redhat.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the advent of Podman 2.0.0 we crossed the magical barrier of go
modules. While we were able to continue importing all packages inside
of the project, the project could not be vendored anymore from the
outside.
Move the go module to new major version and change all imports to
`github.com/containers/libpod/v2`. The renaming of the imports
was done via `gomove` [1].
[1] https://github.com/KSubedi/gomove
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When running under systemd there is no need to create yet another
cgroup for the container.
With conmon-delegated the current cgroup will be split in two sub
cgroups:
- supervisor
- container
The supervisor cgroup will hold conmon and the podman process, while
the container cgroup is used by the OCI runtime (using the cgroupfs
backend).
Closes: https://github.com/containers/libpod/issues/6400
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
| |
Add --preservefds to podman run. close https://github.com/containers/libpod/issues/6458
Signed-off-by: Qi Wang <qiwan@redhat.com>
|