| Commit message (Collapse) | Author | Age |
|
|
|
| |
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
|
|
|
|
|
|
|
|
| |
The libpod/network packages were moved to c/common so that buildah can
use it as well. To prevent duplication use it in podman as well and
remove it from here.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OCICNI port format has one big problem: It does not support ranges.
So if a users forwards a range of 1k ports with podman run -p 1001-2000
we have to store each of the thousand ports individually as array element.
This bloats the db and makes the JSON encoding and decoding much slower.
In many places we already use a better port struct type which supports
ranges, e.g. `pkg/specgen` or the new network interface.
Because of this we have to do many runtime conversions between the two
port formats. If everything uses the new format we can skip the runtime
conversions.
This commit adds logic to replace all occurrences of the old format
with the new one. The database will automatically migrate the ports
to new format when the container config is read for the first time
after the update.
The `ParsePortMapping` function is `pkg/specgen/generate` has been
reworked to better work with the new format. The new logic is able
to deduplicate the given ports. This is necessary the ensure we
store them efficiently in the DB. The new code should also be more
performant than the old one.
To prove that the code is fast enough I added go benchmarks. Parsing
1 million ports took less than 0.5 seconds on my laptop.
Benchmark normalize PortMappings in specgen:
Please note that the 1 million ports are actually 20x 50k ranges
because we cannot have bigger ranges than 65535 ports.
```
$ go test -bench=. -benchmem ./pkg/specgen/generate/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/pkg/specgen/generate
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
BenchmarkParsePortMappingNoPorts-12 480821532 2.230 ns/op 0 B/op 0 allocs/op
BenchmarkParsePortMapping1-12 38972 30183 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMapping100-12 18752 60688 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMapping1k-12 3104 331719 ns/op 223840 B/op 3018 allocs/op
BenchmarkParsePortMapping10k-12 376 3122930 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMapping1m-12 3 390869926 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingReverse100-12 18940 63414 ns/op 141088 B/op 315 allocs/op
BenchmarkParsePortMappingReverse1k-12 3015 362500 ns/op 223841 B/op 3018 allocs/op
BenchmarkParsePortMappingReverse10k-12 343 3318135 ns/op 1223650 B/op 30027 allocs/op
BenchmarkParsePortMappingReverse1m-12 3 403392469 ns/op 124593840 B/op 4000624 allocs/op
BenchmarkParsePortMappingRange1-12 37635 28756 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange100-12 39604 28935 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1k-12 38384 29921 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange10k-12 29479 40381 ns/op 131584 B/op 9 allocs/op
BenchmarkParsePortMappingRange1m-12 927 1279369 ns/op 143022 B/op 164 allocs/op
PASS
ok github.com/containers/podman/v3/pkg/specgen/generate 25.492s
```
Benchmark convert old port format to new one:
```
go test -bench=. -benchmem ./libpod/
goos: linux
goarch: amd64
pkg: github.com/containers/podman/v3/libpod
cpu: Intel(R) Core(TM) i7-10850H CPU @ 2.70GHz
Benchmark_ocicniPortsToNetTypesPortsNoPorts-12 663526126 1.663 ns/op 0 B/op 0 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1-12 7858082 141.9 ns/op 72 B/op 2 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10-12 2065347 571.0 ns/op 536 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts100-12 138478 8641 ns/op 4216 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1k-12 9414 120964 ns/op 41080 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts10k-12 781 1490526 ns/op 401528 B/op 4 allocs/op
Benchmark_ocicniPortsToNetTypesPorts1m-12 4 250579010 ns/op 40001656 B/op 4 allocs/op
PASS
ok github.com/containers/podman/v3/libpod 11.727s
```
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Don't use reexec for the rootlessport process, instead make it a
separate binary to reduce the memory usage. The problem with reexec is
that it will import all packages that podman uses and therefore loads a
lot of stuff into the heap. The rootlessport process however only needs
the rootlesskit library.
The memory usage is a concern since the rootlessport process will spawn
two process per container which has ports forwarded. The processes stay
until the container dies. On my laptop the current reexec version uses
47800 KB RSS. The new separate binary only uses 4540 KB RSS. This is
more than a 90% improvement.
The Makefile has been updated to compile the new binary and install it
to the libexec directory.
Fixes #10790
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
| |
Remove ERROR: Error stutter from logrus messages also.
[ NO TESTS NEEDED] This is just code cleanup.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
We do not use the ocicni code anymore so let's get rid of it. Only the
port struct is used but we can copy this into libpod network types so
we can debloat the binary.
The next step is to remove the OCICNI port mapping form the container
config and use the better PortMapping struct everywhere.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
| |
When we restart a container via podman restart or restart policy the
rootlessport process fails with `address already in use` because the
socketfile still exists.
This is a regression and was introduced in commit abdedc31a25e.
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Creating the rootlessport socket can fail with `bind: invalid argument`
when the socket path is longer than 108 chars. This is the case for
users with a long runtime directory.
Since the kernel does not allow to use socket paths with more then 108
chars use a workaround to open the socket path.
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the rootlessport process is started the stdout/stderr are attached
to the podman process. However once everything is setup podman exits and
when the rootlessport process tries to write to stdout it will fail with
SIGPIPE. The code handles this signal and puts /dev/null to stdout and
stderr but this is not robust. I do not understand the exact cause but
sometimes the process is still killed by SIGPIPE. Either go lost the
signal or the process got already killed before the goroutine could
handle it.
Instead of handling SIGPIPE just set /dev/null to stdout and stderr
before podman exits. With this there should be no race and no way to
run into SIGPIPE errors.
[NO TESTS NEEDED]
Fixes #11248
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The rootlessport forwarder requires a child IP to be set. This must be a
valid ip in the container network namespace. The problem is that after a
network disconnect and connect the eth0 ip changed. Therefore the
packages are dropped since the source ip does no longer exists in the
netns.
One solution is to set the child IP to 127.0.0.1, however this is a
security problem. [1]
To fix this we have to recreate the ports after network connect and
disconnect. To make this work the rootlessport process exposes a socket
where podman network connect/disconnect connect to and send to new child
IP to rootlessport. The rootlessport process will remove all ports and
recreate them with the new correct child IP.
Also bump rootlesskit to v0.14.3 to fix a race with RemovePort().
Fixes #10052
[1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199
Signed-off-by: Paul Holzinger <pholzing@redhat.com>
|
|
|
|
|
|
|
|
| |
Use the whitespace linter and fix the reported problems.
[NO TESTS NEEDED]
Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
|
|
|
|
|
|
|
|
|
| |
set the source IP to the slirp4netns address instead of 127.0.0.1 when
using rootlesskit.
Closes: https://github.com/containers/podman/issues/5138
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
The same channel is written to by two different goroutines.
Use a different channel for each of them so to avoid writing to a
closed channel.
Closes: https://github.com/containers/libpod/issues/6018
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
| |
Dup2 is not defined on arm64 in the syscall package.
Closes: https://github.com/containers/libpod/issues/5587
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
| |
when a sigpipe is received the stdout/stderr pipe was closed, so
reopen them with /dev/null.
Closes: https://github.com/containers/libpod/issues/5541
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
| |
otherwise the rootless parent process might wait indefinitely when the
rootless-child process exits early.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
there is a race condition where the child process is immediately
killed:
[pid 2576752] arch_prctl(0x3001 /* ARCH_??? */, 0x7ffdf612f170) = -1 EINVAL (Invalid argument)
[pid 2576752] access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
[pid 2576752] --- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=2576742, si_uid=0} ---
[pid 2576752] +++ killed by SIGTERM +++
this happens because the parent process here really means the "parent
thread".
Since there is no way of running it on the main thread,
let's skip this functionality altogether and use kill(2).
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
write to the error pipe only in case of an error. Otherwise we may
end up in a race condition in the select statement below as the read
from errChan happens before initComplete and the function returns
immediately nil.
Closes: https://github.com/containers/libpod/issues/5182
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
| |
Previously, rootlessport was using /var/tmp as the tmp dir.
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
|
|
|
|
| |
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
|
|
RootlessKit port forwarder has a lot of advantages over the slirp4netns port forwarder:
* Very high throughput.
Benchmark result on Travis: socat: 5.2 Gbps, slirp4netns: 8.3 Gbps, RootlessKit: 27.3 Gbps
(https://travis-ci.org/rootless-containers/rootlesskit/builds/597056377)
* Connections from the host are treated as 127.0.0.1 rather than 10.0.2.2 in the namespace.
No UDP issue (#4586)
* No tcp_rmem issue (#4537)
* Probably works with IPv6. Even if not, it is trivial to support IPv6. (#4311)
* Easily extensible for future support of SCTP
* Easily extensible for future support of `lxc-user-nic` SUID network
RootlessKit port forwarder has been already adopted as the default port forwarder by Rootless Docker/Moby,
and no issue has been reported AFAIK.
As the port forwarder is imported as a Go package, no `rootlesskit` binary is required for Podman.
Fix #4586
May-fix #4559
Fix #4537
May-fix #4311
See https://github.com/rootless-containers/rootlesskit/blob/v0.7.0/pkg/port/builtin/builtin.go
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
|