| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
| |
As rootless we have to reload the port mappings. If it fails we should
return an error instead of the warning.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
| |
When run as rootless the podman network reload command tries to reload
the rootlessport ports because the childIP could have changed.
However if the containers has no ports we should skip this instead of
printing a warning.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
| |
A rootless container created with a custom userns and forwarded ports
did not work. I refactored the network setup to make the setup logic
more clear.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OCICNI port format has one big problem: It does not support ranges.
So if a users forwards a range of 1k ports with podman run -p 1001-2000
we have to store each of the thousand ports individually as array element.
This bloats the db and makes the JSON encoding and decoding much slower.
In many places we already use a better port struct type which supports
ranges, e.g. `pkg/specgen` or the new network interface.
Because of this we have to do many runtime conversions between the two
port formats. If everything uses the new format we can skip the runtime
conversions.
This commit adds logic to replace all occurrences of the old format
with the new one. The database will automatically migrate the ports
to new format when the container config is read for the first time
after the update.
The `ParsePortMapping` function is `pkg/specgen/generate` has been
reworked to better work with the new format. The new logic is able
to deduplicate the given ports. This is necessary the ensure we
store them efficiently in the DB. The new code should also be more
performant than the old one.
To prove that the code is fast enough I added go benchmarks. Parsing
1 million ports took less than 0.5 seconds on my laptop.
Benchmark normalize PortMappings in specgen:
Please note that the 1 million ports are actually 20x 50k ranges
because we cannot have bigger ranges than 65535 ports.
```
$ go test -bench=. -benchmem ./pkg/specgen/generate/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/pkg/specgen/generate
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkParsePortMappingNoPorts-12 480821532 2.230 ns/op 0 B/op 0 allocs/op
BenchmarkParsePortMapping1-12 38972 30183 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMapping100-12 18752 60688 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMapping1k-12 3104 331719 ns/op 223840 B/op 3018 allocs/op
BenchmarkParsePortMapping10k-12 376 3122930 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMapping1m-12 3 390869926 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingReverse100-12 18940 63414 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMappingReverse1k-12 3015 362500 ns/op 223841 B/op 3018 allocs/op
BenchmarkParsePortMappingReverse10k-12 343 3318135 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMappingReverse1m-12 3 403392469 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingRange1-12 37635 28756 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange100-12 39604 28935 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1k-12 38384 29921 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange10k-12 29479 40381 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1m-12 927 1279369 ns/op 143022 B/op 164 allocs/op
PASS
ok github.com/containers/podman/v3/pkg/specgen/generate 25.492s
```
Benchmark convert old port format to new one:
```
go test -bench=. -benchmem ./libpod/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/libpod
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
Benchmark_ocicniPortsToNetTypesPortsNoPorts-12 663526126 1.663 ns/op 0 B/op 0 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1-12 7858082 141.9 ns/op 72 B/op 2 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10-12 2065347 571.0 ns/op 536 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts100-12 138478 8641 ns/op 4216 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1k-12 9414 120964 ns/op 41080 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10k-12 781 1490526 ns/op 401528 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1m-12 4 250579010 ns/op 40001656 B/op 4 allocs/op
PASS
ok github.com/containers/podman/v3/libpod 11.727s
```
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Duplicate Address Detection slows the ipv6 setup down for 1-2 seconds.
Since slirp4netns is run it is own namespace and not directly routed
we can skip this to make the ipv6 address immediately available.
We change the default to make sure the slirp tap interface gets the
correct value assigned so DAD is disabled for it.
Also make sure to change this value back to the original after slirp4netns
is ready in case users rely on this sysctl.
Fixes #11062
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't use reexec for the rootlessport process, instead make it a
separate binary to reduce the memory usage. The problem with reexec is
that it will import all packages that podman uses and therefore loads a
lot of stuff into the heap. The rootlessport process however only needs
the rootlesskit library.
The memory usage is a concern since the rootlessport process will spawn
two process per container which has ports forwarded. The processes stay
until the container dies. On my laptop the current reexec version uses
47800 KB RSS. The new separate binary only uses 4540 KB RSS. This is
more than a 90% improvement.
The Makefile has been updated to compile the new binary and install it
to the libexec directory.
Fixes #10790
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Access the container's config field directly inside of libpod instead of
calling `Config()` which in turn creates expensive JSON deep copies.
Accessing the field directly drops memory consumption of a simple
`podman run --rm busybox true` from 1245kB to 410kB.
[NO TESTS NEEDED]
Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
|
|
|
|
|
|
|
|
| |
Remove ERROR: Error stutter from logrus messages also.
[ NO TESTS NEEDED] This is just code cleanup.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Make use of the new network interface in libpod.
This commit contains several breaking changes:
- podman network create only outputs the new network name and not file
path.
- podman network ls shows the network driver instead of the cni version
and plugins.
- podman network inspect outputs the new network struct and not the cni
conflist.
- The bindings and libpod api endpoints have been changed to use the new
network structure.
The container network status is stored in a new field in the state. The
status should be received with the new `c.getNetworkStatus`. This will
migrate the old status to the new format. Therefore old containers should
contine to work correctly in all cases even when network connect/
disconnect is used.
New features:
- podman network reload keeps the ip and mac for more than one network.
- podman container restore keeps the ip and mac for more than one
network.
- The network create compat endpoint can now use more than one ipam
config.
The man pages and the swagger doc are updated to reflect the latest
changes.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Creating the rootlessport socket can fail with `bind: invalid argument`
when the socket path is longer than 108 chars. This is the case for
users with a long runtime directory.
Since the kernel does not allow to use socket paths with more then 108
chars use a workaround to open the socket path.
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rootlessport forwarder requires a child IP to be set. This must be a
valid ip in the container network namespace. The problem is that after a
network disconnect and connect the eth0 ip changed. Therefore the
packages are dropped since the source ip does no longer exists in the
netns.
One solution is to set the child IP to 127.0.0.1, however this is a
security problem. [1]
To fix this we have to recreate the ports after network connect and
disconnect. To make this work the rootlessport process exposes a socket
where podman network connect/disconnect connect to and send to new child
IP to rootlessport. The rootlessport process will remove all ports and
recreate them with the new correct child IP.
Also bump rootlesskit to v0.14.3 to fix a race with RemovePort().
Fixes #10052
[1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a new service reaper package. Podman currently does not reap all
child processes. The slirp4netns and rootlesskit processes are not
reaped. The is not a problem for local podman since the podman process
dies before the other processes and then init will reap them for us.
However with podman system service it is possible that the podman
process is still alive after slirp died. In this case podman has to reap
it or the slirp process will be a zombie until the service is stopped.
The service reaper will listen in an extra goroutine on SIGCHLD. Once it
receives this signal it will try to reap all pids that were added with
`AddPID()`. While I would like to just reap all children this is not
possible because many parts of the code use `os/exec` with `cmd.Wait()`.
If we reap before `cmd.Wait()` things can break, so reaping everything
is not an option.
[NO TESTS NEEDED]
Fixes #9777
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This change adds the entry `host.containers.internal` to the `/etc/hosts`
file within a new containers filesystem. The ip address is determined by
the containers networking configuration and points to the gateway address
for the containers networking namespace.
Closes #5651
Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The source ip for the rootlesskit port forwarder was hardcoded to the
standard slirp4netns ip. This is incorrect since users can change the
subnet used by slirp4netns with `--network slirp4netns:cidr=10.5.0.0/24`.
The container interface ip is always the .100 in the subnet. Only when
the rootlesskit port forwarder child ip matches the container interface
ip the port forwarding will work.
Fixes #9828
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
|
|
This should make maintenance easier.
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
|