summaryrefslogtreecommitdiff
path: root/libpod/networking_linux.go
Commit message (Collapse)AuthorAge
* Fix race conditions in rootless cni setupPaul Holzinger2021-07-15
| | | | | | | | | | | | | | | | | | | | There was an race condition when calling `GetRootlessCNINetNs()`. It created the rootless cni directory before it got locked. Therefore another process could have called cleanup and removed this directory before it was used resulting in errors. The lockfile got moved into the XDG_RUNTIME_DIR directory to prevent a panic when the parent dir was removed by cleanup. Fixes #10930 Fixes #10922 To make this even more robust `GetRootlessCNINetNs()` will now return locked. This guarantees that we can run `Do()` after `GetRootlessCNINetNs()` before another process could have called `Cleanup()` in between. [NO TESTS NEEDED] CI is flaking, hopefully this will fix it. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* CNI-in-slirp4netns: fix bind-mount for /run/systemd/resolve/stub-resolv.confAkihiro Suda2021-07-15
| | | | | | | | | | | | | | | Fix issue 10929 : `[Regression in 3.2.0] CNI-in-slirp4netns DNS gets broken when running a rootful container after running a rootless container` When /etc/resolv.conf on the host is a symlink to /run/systemd/resolve/stub-resolv.conf, we have to mount an empty filesystem on /run/systemd/resolve in the child namespace, so as to isolate the directory from the host mount namespace. Otherwise our bind-mount for /run/systemd/resolve/stub-resolv.conf is unmounted when systemd-resolved unlinks and recreates /run/systemd/resolve/stub-resolv.conf on the host. [NO TESTS NEEDED] Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
* Make rootless-cni setup more robustPaul Holzinger2021-07-15
| | | | | | | | | | | | | | | | | | | The rootless cni namespace needs a valid /etc/resolv.conf file. On some distros is a symlink to somewhere under /run. Because the kernel will follow the symlink before mounting, it is not possible to mount a file at exactly /etc/resolv.conf. We have to ensure that the link target will be available in the rootless cni mount ns. Fixes #10855 Also fixed a bug in the /var/lib/cni directory lookup logic. It used `filepath.Base` instead of `filepath.Dir` and thus looping infinitely. Fixes #10857 [NO TESTS NEEDED] Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Do not use inotify for OCICNIPaul Holzinger2021-06-24
| | | | | | | | | | | | | | | | Podman does not need to watch the cni config directory. If a network is not found in the cache, OCICNI will reload the networks anyway and thus even podman system service should work as expected. Also include a change to not mount a "new" /var by default in the rootless cni ns, instead try to use /var/lib/cni first and then the parent dir. This allows users to store cni configs under /var/... which is the case for the CI compose test. [NO TESTS NEEDED] Fixes #10686 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* getContainerNetworkInfo: lock netNsCtr before syncPaul Holzinger2021-06-24
| | | | | | | | | `syncContainer()` requires the container to be locked, otherwise we can end up with undefined behavior. [NO TESTS NEEDED] Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Fix network connect race with docker-composePaul Holzinger2021-06-14
| | | | | | | | | | | Network connect/disconnect has to call the cni plugins when the network namespace is already configured. This is the case for `ContainerStateRunning` and `ContainerStateCreated`. This is important otherwise the network is not attached to this network namespace and libpod will throw errors like `network inspection mismatch...` This problem happened when using `docker-compose up` in attached mode. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Merge pull request #9972 from bblenard/issue-5651-hostname-for-container-gatewayOpenShift Merge Robot2021-05-17
|\ | | | | Add host.containers.internal entry into container's etc/hosts
| * Add host.containers.internal entry into container's etc/hostsBaron Lenardson2021-05-17
| | | | | | | | | | | | | | | | | | | | | | This change adds the entry `host.containers.internal` to the `/etc/hosts` file within a new containers filesystem. The ip address is determined by the containers networking configuration and points to the gateway address for the containers networking namespace. Closes #5651 Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
* | podman network reload add rootless supportPaul Holzinger2021-05-17
|/ | | | | | | | | | Allow podman network reload to be run as rootless user. While it is unlikely that the iptable rules are flushed inside the rootless cni namespace, it could still happen. Also fix podman network reload --all to ignore errors when a container does not have the bridge network mode, e.g. slirp4netns. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix rootlesskit port forwarder with custom slirp cidrPaul Holzinger2021-04-23
| | | | | | | | | | | | | The source ip for the rootlesskit port forwarder was hardcoded to the standard slirp4netns ip. This is incorrect since users can change the subnet used by slirp4netns with `--network slirp4netns:cidr=10.5.0.0/24`. The container interface ip is always the .100 in the subnet. Only when the rootlesskit port forwarder child ip matches the container interface ip the port forwarding will work. Fixes #9828 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* podman unshare: add --rootless-cni to join the nsPaul Holzinger2021-04-07
| | | | | | | | | | Add a new --rootless-cni option to podman unshare to also join the rootless-cni network namespace. This is useful if you want to connect to a rootless container via IP address. This is only possible from the rootless-cni namespace and not from the host namespace. This option also helps to debug problems in the rootless-cni namespace. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* rootless cni add /usr/sbin to PATH if not presentPaul Holzinger2021-04-06
| | | | | | | | | The CNI plugins need access to iptables in $PATH. On debian /usr/sbin is not added to $PATH for rootless users. This will break rootless cni completely. To prevent breaking existing users add /usr/sbin to $PATH in podman if needed. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Use the slrip4netns dns in the rootless cni nsPaul Holzinger2021-04-01
| | | | | | | | If a user only has a local dns server in the resolv.conf file the dns resolution will fail. Instead we create a new resolv.conf which will use the slirp4netns dns. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Cleanup the rootless cni namespacePaul Holzinger2021-04-01
| | | | | | | Delte the network namespace and kill the slirp4netns process when it is no longer needed. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Remove unused rootless-cni-infra container filesPaul Holzinger2021-04-01
| | | | Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Only use rootless RLK when the container has portsPaul Holzinger2021-04-01
| | | | | | | Do not invoke the rootlesskit port forwarder when the container has no ports. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Enable rootless network connect/disconnectPaul Holzinger2021-04-01
| | | | | | | With the new rootless cni supporting network connect/disconnect is easy. Combine common setps into extra functions to prevent code duplication. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Move slirp4netns functions into an extra filePaul Holzinger2021-04-01
| | | | | | This should make maintenance easier. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Add rootless support for cni and --uidmapPaul Holzinger2021-04-01
| | | | | | This is supported with the new rootless cni logic. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* rootless cni without infra containerPaul Holzinger2021-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of creating an extra container create a network and mount namespace inside the podman user namespace. This ns is used to for rootless cni operations. This helps to align the rootless and rootful network code path. If we run as rootless we just have to set up a extra net ns and initialize slirp4netns in it. The ocicni lib will be called in that net ns. This design allows allows easier maintenance, no extra container with pause processes, support for rootless cni with --uidmap and possibly more. The biggest problem is backwards compatibility. I don't think live migration can be possible. If the user reboots or restart all cni containers everything should work as expected again. The user is left with the rootless-cni-infa container and image but this can safely be removed. To make the existing cni configs work we need execute the cni plugins in a extra mount namespace. This ensures that we can safely mount over /run and /var which have to be writeable for the cni plugins without removing access to these files by the main podman process. One caveat is that we need to keep the netns files at `XDG_RUNTIME_DIR/netns` accessible. `XDG_RUNTIME_DIR/rootless-cni/{run,var}` will be mounted to `/{run,var}`. To ensure that we keep the netns directory we bind mount this relative to the new root location, e.g. XDG_RUNTIME_DIR/rootless-cni/run/user/1000/netns before we mount the run directory. The run directory is mounted recursive, this makes the netns directory at the same path accessible as before. This also allows iptables-legacy to work because /run/xtables.lock is now writeable. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Silence podman network reload errors with iptables-nftPaul Holzinger2021-03-30
| | | | | | | | | | | | Make sure we do not display the expected error when using podman network reload. This is already done for iptables-legacy however iptables-nft creates a slightly different error message so check for this as well. The error is logged at info level. [NO TESTS NEEDED] The test VMs do not use iptables-nft so there is no way to test this. It is already tested for iptables-legacy. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix cni teardown errorsPaul Holzinger2021-03-04
| | | | | | | | | | | Make sure to pass the cni interface descriptions to cni teardowns. Otherwise cni cannot find the correct cache files because the interface name might not match the networks. This can only happen when network disconnect was used. Fixes #9602 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Network connect error if net mode is not bridgePaul Holzinger2021-02-23
| | | | | | | | | | Only the the network mode bridge supports cni networks. Other network modes cannot use network connect/disconnect so we should throw a error. Fixes #9496 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix podman network IDs handlingPaul Holzinger2021-02-22
| | | | | | | | | | | | | The libpod network logic knows about networks IDs but OCICNI does not. We cannot pass the network ID to OCICNI. Instead we need to make sure we only use network names internally. This is also important for libpod since we also only store the network names in the state. If we would add a ID there the same networks could accidentally be added twice. Fixes #9451 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* bump go module to v3Valentin Rothberg2021-02-22
| | | | | | | | | We missed bumping the go module, so let's do it now :) * Automated go code with github.com/sirkon/go-imports-rename * Manually via `vgrep podman/v2` the rest Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Enable golint linterPaul Holzinger2021-02-11
| | | | | | | | Use the golint linter and fix the reported problems. [NO TESTS NEEDED] Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix podman network disconnect wrong NetworkStatus numberPaul Holzinger2021-02-04
| | | | | | | | | | | | | The allocated `tmpNetworkStatus` must be allocated with the length 0. Otherwise append would add new elements to the end of the slice and not at the beginning of the allocated memory. This caused inspect to fail since the number of networks did not matched the number of network statuses. Fixes #9234 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Make slirp MTU configurable (network_cmd_options)bitstrings2021-02-02
| | | | | | | | The mtu default value is currently forced to 65520. This let the user control it using the config key network_cmd_options, i.e.: network_cmd_options=["mtu=9000"] Signed-off-by: bitstrings <pino.silvaggio@gmail.com>
* Add default net info in container inspectbaude2021-01-26
| | | | | | | | | | | | | | when inspecting a container that is only connected to the default network, we should populate the default network in the container inspect information. Fixes: #6618 Signed-off-by: baude <bbaude@redhat.com> MH: Small fixes, added another test Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* networking: lookup child IP in networksGiuseppe Scrivano2021-01-23
| | | | | | | | | | | | if a CNI network is added to the container, use the IP address in that network instead of hard-coding the slirp4netns default. commit 5e65f0ba30f3fca73f8c207825632afef08378c1 introduced this regression. Closes: https://github.com/containers/podman/issues/9065 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* libpod: move slirp magic IPs to constsGiuseppe Scrivano2021-01-22
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootlessport: set source IP to slirp4netns deviceGiuseppe Scrivano2021-01-22
| | | | | | | | | set the source IP to the slirp4netns address instead of 127.0.0.1 when using rootlesskit. Closes: https://github.com/containers/podman/issues/5138 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Switch references of /var/run -> /runDaniel J Walsh2021-01-07
| | | | | | | | | | Systemd is now complaining or mentioning /var/run as a legacy directory. It has been many years where /var/run is a symlink to /run on all most distributions, make the change to the default. Partial fix for https://github.com/containers/podman/issues/8369 Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* The slirp4netns sandbox requires pivot_rootAnders F Björklund2020-12-29
| | | | | | Disable the sandbox, when running on rootfs Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
* SpellingJosh Soref2020-12-22
| | | | Signed-off-by: Josh Soref <jsoref@users.noreply.github.com>
* Make `podman stats` slirp check more robustMatthew Heon2020-12-08
| | | | | | | | | | | Just checking for `rootless.IsRootless()` does not catch all the cases where slirp4netns is in use - we actually allow it to be used as root as well. Fortify the conditional here so we don't fail in the root + slirp case. Fixes #7883 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Merge pull request #8571 from Luap99/podman-network-reloadOpenShift Merge Robot2020-12-08
|\ | | | | Implement pod-network-reload
| * Implement pod-network-reloadMatthew Heon2020-12-07
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This adds a new command, 'podman network reload', to reload the networks of existing containers, forcing recreation of firewall rules after e.g. `firewall-cmd --reload` wipes them out. Under the hood, this works by calling CNI to tear down the existing network, then recreate it using identical settings. We request that CNI preserve the old IP and MAC address in most cases (where the container only had 1 IP/MAC), but there will be some downtime inherent to the teardown/bring-up approach. The architecture of CNI doesn't really make doing this without downtime easy (or maybe even possible...). At present, this only works for root Podman, and only locally. I don't think there is much of a point to adding remote support (this is very much a local debugging command), but I think adding rootless support (to kill/recreate slirp4netns) could be valuable. Signed-off-by: Matthew Heon <matthew.heon@pm.me> Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* | Add ability to set system wide options for slirp4netnsAshley Cui2020-12-04
|/ | | | | | Wire in containers.conf options for slirp Signed-off-by: Ashley Cui <acui@redhat.com>
* network connect disconnect on non-running containersbaude2020-11-30
| | | | | | | a container can connect and disconnet to networks even when not in a running state. Signed-off-by: baude <bbaude@redhat.com>
* update container status with new resultsbaude2020-11-23
| | | | | | | | | a bug was being caused by the fact that the container network results were not being updated properly. given that jhon is on PTO, this PR will replace #8362 Signed-off-by: baude <bbaude@redhat.com>
* Refactor compat container create endpointJhon Honce2020-11-23
| | | | | | | | | | | | * Make endpoint compatibile with docker-py network expectations * Update specgen helper when called from compat endpoint * Update godoc on types * Add test for network/container create using docker-py method * Add syslog logging when DEBUG=1 for tests Fixes #8361 Signed-off-by: Jhon Honce <jhonce@redhat.com>
* Make c.networks() list include the default networkMatthew Heon2020-11-20
| | | | | | | | | | | | | | This makes things a lot more clear - if we are actually joining a CNI network, we are guaranteed to get a non-zero length list of networks. We do, however, need to know if the network we are joining is the default network for inspecting containers as it determines how we populate the response struct. To handle this, add a bool to indicate that the network listed was the default network, and only the default network. Signed-off-by: Matthew Heon <mheon@redhat.com>
* add network connect|disconnect compat endpointsbaude2020-11-19
| | | | | | | | | | | this enables the ability to connect and disconnect a container from a given network. it is only for the compatibility layer. some code had to be refactored to avoid circular imports. additionally, tests are being deferred temporarily due to some incompatibility/bug in either docker-py or our stack. Signed-off-by: baude <bbaude@redhat.com>
* add network connect|disconnect compat endpointsbaude2020-11-17
| | | | | | | | | | | this enables the ability to connect and disconnect a container from a given network. it is only for the compatibility layer. some code had to be refactored to avoid circular imports. additionally, tests are being deferred temporarily due to some incompatibility/bug in either docker-py or our stack. Signed-off-by: baude <bbaude@redhat.com>
* Merge pull request #8304 from rhatdan/errorOpenShift Merge Robot2020-11-12
|\ | | | | Cleanup error reporting
| * Cleanup error reportingDaniel J Walsh2020-11-11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The error message reported is overlay complicated and the added test does not really help the user. Currently the error looks like: podman run -p 80:80 fedora echo hello Error: failed to expose ports via rootlessport: "cannot expose privileged port 80, you might need to add "net.ipv4.ip_unprivileged_port_start=0" (currently 1024) to /etc/sysctl.conf, or choose a larger port number (>= 1024): listen tcp 0.0.0.0:80: bind: permission denied\n" After this change ./bin/podman run -p 80:80 fedora echo hello Error: cannot expose privileged port 80, you might need to add "net.ipv4.ip_unprivileged_port_start=0" (currently 1024) to /etc/sysctl.conf, or choose a larger port number (>= 1024): listen tcp 0.0.0.0:80: bind: permission denied Control chars have been eliminated. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* | Add support for network connect / disconnect to DBMatthew Heon2020-11-11
|/ | | | | | | | | | | | | | | | | | | | | | | | Convert the existing network aliases set/remove code to network connect and disconnect. We can no longer modify aliases for an existing network, but we can add and remove entire networks. As part of this, we need to add a new function to retrieve current aliases the container is connected to (we had a table for this as of the first aliases PR, but it was not externally exposed). At the same time, remove all deconflicting logic for aliases. Docker does absolutely no checks of this nature, and allows two containers to have the same aliases, aliases that conflict with container names, etc - it's just left to DNS to return all the IP addresses, and presumably we round-robin from there? Most tests for the existing code had to be removed because of this. Convert all uses of the old container config.Networks field, which previously included all networks in the container, to use the new DB table. This ensures we actually get an up-to-date list of in-use networks. Also, add network aliases to the output of `podman inspect`. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* network aliases for container creationbaude2020-11-09
| | | | | | | | podman can now support adding network aliases when running containers (--network-alias). It requires an updated dnsname plugin as well as an updated ocicni to work properly. Signed-off-by: baude <bbaude@redhat.com>
* Fix dnsname when joining a different network namespace in a podPaul Holzinger2020-10-30
| | | | | | | | | | When creating a container in a pod the podname was always set as the dns entry. This is incorrect when the container is not part of the pods network namespace. This happend both rootful and rootless. To fix this check if we are part of the pods network namespace and if not use the container name as dns entry. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>