summaryrefslogtreecommitdiff
path: root/libpod/networking_linux.go
Commit message (Collapse)AuthorAge
* rename rootless cni ns to rootless netnsPaul Holzinger2021-11-05
| | | | | | | | | | | | Since we want to use the rootless cni ns also for netavark we should pick a more generic name. The name is now "rootless network namespace" or short "rootless netns". The rename might cause some issues after the update but when the all containers are restarted or the host is rebooted it should work correctly. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* mount full XDG_RUNTIME_DIR in rootless cni nsPaul Holzinger2021-11-05
| | | | | | | We should mount the full runtime directory into the namespace instead of just the netns dir. This allows more use cases. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Fix rootless cni netns cleanup logicPaul Holzinger2021-11-05
| | | | | | | | | | | | | | The check if cleanup is needed reads all container and checks if there are running containers with bridge networking. If we do not find any we have to cleanup the ns. However there was a problem with this because the state is empty by default so the running check never worked. Fortunately the was a second check which relies on the CNI files so we still did cleanup anyway. With netavark I noticed that this check is broken because the CNI files were not present. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* move network alias validation to container createPaul Holzinger2021-09-28
| | | | | | | | | | Podman 4.0 currently errors when you use network aliases for a network which has dns disabled. Because the error happens on network setup this can cause regression for old working containers. The network backend should not validate this. Instead podman should check this at container create time and also for network connect. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* always add short container id as net aliasPaul Holzinger2021-09-28
| | | | | | | | | | | | | | | | This matches what docker does. Also make sure the net aliases are also shown when the container is stopped. docker-compose uses this special alias entry to check if it is already correctly connected to the network. [1] Because we do not support static ips on network connect at the moment calling disconnect && connect will loose the static ip. Fixes #11748 [1] https://github.com/docker/compose/blob/0bea52b18dda3de8c28fcfb0c80cc08b8950645e/compose/service.py#L663-L667 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* standardize logrus messages to upper caseDaniel J Walsh2021-09-22
| | | | | | | | Remove ERROR: Error stutter from logrus messages also. [ NO TESTS NEEDED] This is just code cleanup. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Drop OCICNI dependencyPaul Holzinger2021-09-15
| | | | | | | | | | | We do not use the ocicni code anymore so let's get rid of it. Only the port struct is used but we can copy this into libpod network types so we can debloat the binary. The next step is to remove the OCICNI port mapping form the container config and use the better PortMapping struct everywhere. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Wire network interface into libpodPaul Holzinger2021-09-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Make use of the new network interface in libpod. This commit contains several breaking changes: - podman network create only outputs the new network name and not file path. - podman network ls shows the network driver instead of the cni version and plugins. - podman network inspect outputs the new network struct and not the cni conflist. - The bindings and libpod api endpoints have been changed to use the new network structure. The container network status is stored in a new field in the state. The status should be received with the new `c.getNetworkStatus`. This will migrate the old status to the new format. Therefore old containers should contine to work correctly in all cases even when network connect/ disconnect is used. New features: - podman network reload keeps the ip and mac for more than one network. - podman container restore keeps the ip and mac for more than one network. - The network create compat endpoint can now use more than one ipam config. The man pages and the swagger doc are updated to reflect the latest changes. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* fix restart always with rootlessportPaul Holzinger2021-09-13
| | | | | | | | When a container is automatically restarted due its restart policy and the container uses rootless cni networking with ports forwarded we have to start a new rootlessport process since it exits with conmon. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* rootless cni: resolve absolute symlinks correctlyPaul Holzinger2021-08-30
| | | | | | | | | | | When /etc/resolv.conf is a symlink to an absolute path use it and not join it the the previous path. [NO TESTS NEEDED] This depends on the host layout. Fixes #11358 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* InfraContainer Reworkcdoern2021-08-26
| | | | | | | | | | InfraContainer should go through the same creation process as regular containers. This change was from the cmd level down, involving new container CLI opts and specgen creating functions. What now happens is that both container and pod cli options are populated in cmd and used to create a podSpecgen and a containerSpecgen. The process then goes as follows FillOutSpecGen (infra) -> MapSpec (podOpts -> infraOpts) -> PodCreate -> MakePod -> createPodOptions -> NewPod -> CompleteSpec (infra) -> MakeContainer -> NewContainer -> newContainer -> AddInfra (to pod state) Signed-off-by: cdoern <cdoern@redhat.com>
* podman inspect show exposed portsPaul Holzinger2021-08-24
| | | | | | | | | | | | | | Podman inspect has to show exposed ports to match docker. This requires storing the exposed ports in the container config. A exposed port is shown as `"80/tcp": null` while a forwarded port is shown as `"80/tcp": [{"HostIp": "", "HostPort": "8080" }]`. Also make sure to add the exposed ports to the new image when the container is commited. Fixes #10777 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Fix rootless cni dns without systemd stub resolverPaul Holzinger2021-08-16
| | | | | | | | | | | | | | | | | When a host uses systemd-resolved but not the resolved stub resolver the following symlinks are created: `/etc/resolv.conf` -> `/run/systemd/resolve/stub-resolv.conf` -> `/run/systemd/resolve/resolv.conf`. Because the code uses filepath.EvalSymlinks we put the new resolv.conf to `/run/systemd/resolve/resolv.conf` but the `/run/systemd/resolve/stub-resolv.conf` link does not exists in the mount ns. To fix this we will walk the symlinks manually until we reach the first one under `/run` and use this for the resolv.conf file destination. This fixes a regression which was introduced in e73d4829900c. Fixes #11222 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Run codespell to fix spellingDaniel J Walsh2021-08-11
| | | | | | [NO TESTS NEEDED] Just fixing spelling. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* fix rootless port forwarding with network dis-/connectPaul Holzinger2021-08-03
| | | | | | | | | | | | | | | | | | | | | | | | The rootlessport forwarder requires a child IP to be set. This must be a valid ip in the container network namespace. The problem is that after a network disconnect and connect the eth0 ip changed. Therefore the packages are dropped since the source ip does no longer exists in the netns. One solution is to set the child IP to 127.0.0.1, however this is a security problem. [1] To fix this we have to recreate the ports after network connect and disconnect. To make this work the rootlessport process exposes a socket where podman network connect/disconnect connect to and send to new child IP to rootlessport. The rootlessport process will remove all ports and recreate them with the new correct child IP. Also bump rootlesskit to v0.14.3 to fix a race with RemovePort(). Fixes #10052 [1] https://nvd.nist.gov/vuln/detail/CVE-2021-20199 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Merge pull request #10939 from Luap99/rootless-cniOpenShift Merge Robot2021-07-15
|\ | | | | Fix race conditions in rootless cni setup
| * Fix race conditions in rootless cni setupPaul Holzinger2021-07-15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There was an race condition when calling `GetRootlessCNINetNs()`. It created the rootless cni directory before it got locked. Therefore another process could have called cleanup and removed this directory before it was used resulting in errors. The lockfile got moved into the XDG_RUNTIME_DIR directory to prevent a panic when the parent dir was removed by cleanup. Fixes #10930 Fixes #10922 To make this even more robust `GetRootlessCNINetNs()` will now return locked. This guarantees that we can run `Do()` after `GetRootlessCNINetNs()` before another process could have called `Cleanup()` in between. [NO TESTS NEEDED] CI is flaking, hopefully this will fix it. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | CNI-in-slirp4netns: fix bind-mount for /run/systemd/resolve/stub-resolv.confAkihiro Suda2021-07-15
|/ | | | | | | | | | | | | | | Fix issue 10929 : `[Regression in 3.2.0] CNI-in-slirp4netns DNS gets broken when running a rootful container after running a rootless container` When /etc/resolv.conf on the host is a symlink to /run/systemd/resolve/stub-resolv.conf, we have to mount an empty filesystem on /run/systemd/resolve in the child namespace, so as to isolate the directory from the host mount namespace. Otherwise our bind-mount for /run/systemd/resolve/stub-resolv.conf is unmounted when systemd-resolved unlinks and recreates /run/systemd/resolve/stub-resolv.conf on the host. [NO TESTS NEEDED] Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
* Make rootless-cni setup more robustPaul Holzinger2021-07-06
| | | | | | | | | | | | | | | | | | | The rootless cni namespace needs a valid /etc/resolv.conf file. On some distros is a symlink to somewhere under /run. Because the kernel will follow the symlink before mounting, it is not possible to mount a file at exactly /etc/resolv.conf. We have to ensure that the link target will be available in the rootless cni mount ns. Fixes #10855 Also fixed a bug in the /var/lib/cni directory lookup logic. It used `filepath.Base` instead of `filepath.Dir` and thus looping infinitely. Fixes #10857 [NO TESTS NEEDED] Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Merge pull request #10754 from Luap99/sync-lockOpenShift Merge Robot2021-06-23
|\ | | | | getContainerNetworkInfo: lock netNsCtr before sync
| * getContainerNetworkInfo: lock netNsCtr before syncPaul Holzinger2021-06-22
| | | | | | | | | | | | | | | | | | `syncContainer()` requires the container to be locked, otherwise we can end up with undefined behavior. [NO TESTS NEEDED] Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* | Do not use inotify for OCICNIPaul Holzinger2021-06-22
|/ | | | | | | | | | | | | | | | Podman does not need to watch the cni config directory. If a network is not found in the cache, OCICNI will reload the networks anyway and thus even podman system service should work as expected. Also include a change to not mount a "new" /var by default in the rootless cni ns, instead try to use /var/lib/cni first and then the parent dir. This allows users to store cni configs under /var/... which is the case for the CI compose test. [NO TESTS NEEDED] Fixes #10686 Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Fix network connect race with docker-composePaul Holzinger2021-06-11
| | | | | | | | | | | Network connect/disconnect has to call the cni plugins when the network namespace is already configured. This is the case for `ContainerStateRunning` and `ContainerStateCreated`. This is important otherwise the network is not attached to this network namespace and libpod will throw errors like `network inspection mismatch...` This problem happened when using `docker-compose up` in attached mode. Signed-off-by: Paul Holzinger <pholzing@redhat.com>
* Enable port forwarding on hostBrent Baude2021-06-01
| | | | | | | | | | | Using the gvproxy application on the host, we can now port forward from the machine vm on the host. It requires that 'gvproxy' be installed in an executable location. gvproxy can be found in the containers/gvisor-tap-vsock github repo. [NO TESTS NEEDED] Signed-off-by: Brent Baude <bbaude@redhat.com>
* Merge pull request #9972 from bblenard/issue-5651-hostname-for-container-gatewayOpenShift Merge Robot2021-05-17
|\ | | | | Add host.containers.internal entry into container's etc/hosts
| * Add host.containers.internal entry into container's etc/hostsBaron Lenardson2021-05-17
| | | | | | | | | | | | | | | | | | | | | | This change adds the entry `host.containers.internal` to the `/etc/hosts` file within a new containers filesystem. The ip address is determined by the containers networking configuration and points to the gateway address for the containers networking namespace. Closes #5651 Signed-off-by: Baron Lenardson <lenardson.baron@gmail.com>
* | podman network reload add rootless supportPaul Holzinger2021-05-17
|/ | | | | | | | | | Allow podman network reload to be run as rootless user. While it is unlikely that the iptable rules are flushed inside the rootless cni namespace, it could still happen. Also fix podman network reload --all to ignore errors when a container does not have the bridge network mode, e.g. slirp4netns. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix rootlesskit port forwarder with custom slirp cidrPaul Holzinger2021-04-23
| | | | | | | | | | | | | The source ip for the rootlesskit port forwarder was hardcoded to the standard slirp4netns ip. This is incorrect since users can change the subnet used by slirp4netns with `--network slirp4netns:cidr=10.5.0.0/24`. The container interface ip is always the .100 in the subnet. Only when the rootlesskit port forwarder child ip matches the container interface ip the port forwarding will work. Fixes #9828 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* podman unshare: add --rootless-cni to join the nsPaul Holzinger2021-04-07
| | | | | | | | | | Add a new --rootless-cni option to podman unshare to also join the rootless-cni network namespace. This is useful if you want to connect to a rootless container via IP address. This is only possible from the rootless-cni namespace and not from the host namespace. This option also helps to debug problems in the rootless-cni namespace. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* rootless cni add /usr/sbin to PATH if not presentPaul Holzinger2021-04-06
| | | | | | | | | The CNI plugins need access to iptables in $PATH. On debian /usr/sbin is not added to $PATH for rootless users. This will break rootless cni completely. To prevent breaking existing users add /usr/sbin to $PATH in podman if needed. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Use the slrip4netns dns in the rootless cni nsPaul Holzinger2021-04-01
| | | | | | | | If a user only has a local dns server in the resolv.conf file the dns resolution will fail. Instead we create a new resolv.conf which will use the slirp4netns dns. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Cleanup the rootless cni namespacePaul Holzinger2021-04-01
| | | | | | | Delte the network namespace and kill the slirp4netns process when it is no longer needed. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Remove unused rootless-cni-infra container filesPaul Holzinger2021-04-01
| | | | Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Only use rootless RLK when the container has portsPaul Holzinger2021-04-01
| | | | | | | Do not invoke the rootlesskit port forwarder when the container has no ports. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Enable rootless network connect/disconnectPaul Holzinger2021-04-01
| | | | | | | With the new rootless cni supporting network connect/disconnect is easy. Combine common setps into extra functions to prevent code duplication. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Move slirp4netns functions into an extra filePaul Holzinger2021-04-01
| | | | | | This should make maintenance easier. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Add rootless support for cni and --uidmapPaul Holzinger2021-04-01
| | | | | | This is supported with the new rootless cni logic. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* rootless cni without infra containerPaul Holzinger2021-04-01
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead of creating an extra container create a network and mount namespace inside the podman user namespace. This ns is used to for rootless cni operations. This helps to align the rootless and rootful network code path. If we run as rootless we just have to set up a extra net ns and initialize slirp4netns in it. The ocicni lib will be called in that net ns. This design allows allows easier maintenance, no extra container with pause processes, support for rootless cni with --uidmap and possibly more. The biggest problem is backwards compatibility. I don't think live migration can be possible. If the user reboots or restart all cni containers everything should work as expected again. The user is left with the rootless-cni-infa container and image but this can safely be removed. To make the existing cni configs work we need execute the cni plugins in a extra mount namespace. This ensures that we can safely mount over /run and /var which have to be writeable for the cni plugins without removing access to these files by the main podman process. One caveat is that we need to keep the netns files at `XDG_RUNTIME_DIR/netns` accessible. `XDG_RUNTIME_DIR/rootless-cni/{run,var}` will be mounted to `/{run,var}`. To ensure that we keep the netns directory we bind mount this relative to the new root location, e.g. XDG_RUNTIME_DIR/rootless-cni/run/user/1000/netns before we mount the run directory. The run directory is mounted recursive, this makes the netns directory at the same path accessible as before. This also allows iptables-legacy to work because /run/xtables.lock is now writeable. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Silence podman network reload errors with iptables-nftPaul Holzinger2021-03-30
| | | | | | | | | | | | Make sure we do not display the expected error when using podman network reload. This is already done for iptables-legacy however iptables-nft creates a slightly different error message so check for this as well. The error is logged at info level. [NO TESTS NEEDED] The test VMs do not use iptables-nft so there is no way to test this. It is already tested for iptables-legacy. Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix cni teardown errorsPaul Holzinger2021-03-04
| | | | | | | | | | | Make sure to pass the cni interface descriptions to cni teardowns. Otherwise cni cannot find the correct cache files because the interface name might not match the networks. This can only happen when network disconnect was used. Fixes #9602 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Network connect error if net mode is not bridgePaul Holzinger2021-02-23
| | | | | | | | | | Only the the network mode bridge supports cni networks. Other network modes cannot use network connect/disconnect so we should throw a error. Fixes #9496 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix podman network IDs handlingPaul Holzinger2021-02-22
| | | | | | | | | | | | | The libpod network logic knows about networks IDs but OCICNI does not. We cannot pass the network ID to OCICNI. Instead we need to make sure we only use network names internally. This is also important for libpod since we also only store the network names in the state. If we would add a ID there the same networks could accidentally be added twice. Fixes #9451 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* bump go module to v3Valentin Rothberg2021-02-22
| | | | | | | | | We missed bumping the go module, so let's do it now :) * Automated go code with github.com/sirkon/go-imports-rename * Manually via `vgrep podman/v2` the rest Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Enable golint linterPaul Holzinger2021-02-11
| | | | | | | | Use the golint linter and fix the reported problems. [NO TESTS NEEDED] Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Fix podman network disconnect wrong NetworkStatus numberPaul Holzinger2021-02-04
| | | | | | | | | | | | | The allocated `tmpNetworkStatus` must be allocated with the length 0. Otherwise append would add new elements to the end of the slice and not at the beginning of the allocated memory. This caused inspect to fail since the number of networks did not matched the number of network statuses. Fixes #9234 Signed-off-by: Paul Holzinger <paul.holzinger@web.de>
* Make slirp MTU configurable (network_cmd_options)bitstrings2021-02-02
| | | | | | | | The mtu default value is currently forced to 65520. This let the user control it using the config key network_cmd_options, i.e.: network_cmd_options=["mtu=9000"] Signed-off-by: bitstrings <pino.silvaggio@gmail.com>
* Add default net info in container inspectbaude2021-01-26
| | | | | | | | | | | | | | when inspecting a container that is only connected to the default network, we should populate the default network in the container inspect information. Fixes: #6618 Signed-off-by: baude <bbaude@redhat.com> MH: Small fixes, added another test Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* networking: lookup child IP in networksGiuseppe Scrivano2021-01-23
| | | | | | | | | | | | if a CNI network is added to the container, use the IP address in that network instead of hard-coding the slirp4netns default. commit 5e65f0ba30f3fca73f8c207825632afef08378c1 introduced this regression. Closes: https://github.com/containers/podman/issues/9065 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* libpod: move slirp magic IPs to constsGiuseppe Scrivano2021-01-22
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootlessport: set source IP to slirp4netns deviceGiuseppe Scrivano2021-01-22
| | | | | | | | | set the source IP to the slirp4netns address instead of 127.0.0.1 when using rootlesskit. Closes: https://github.com/containers/podman/issues/5138 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>