summaryrefslogtreecommitdiff
path: root/pkg/rootless/rootless_linux.c
Commit message (Collapse)AuthorAge
* validate fds --preserve-fdsQi Wang2020-08-04
| | | | | | validate file descriptors passed from podman run and podman exec --preserve-fds. Signed-off-by: Qi Wang <qiwan@redhat.com>
* rootless: system service joins immediately the namespacesGiuseppe Scrivano2020-08-03
| | | | | | | | | | when there is a pause process running, let the "system service" podman instance join immediately the existing namespaces. Closes: https://github.com/containers/podman/issues/7180 Closes: https://github.com/containers/podman/issues/6660 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: child exits immediately on userns errorsGiuseppe Scrivano2020-07-30
| | | | | | | if the parent process failed to create the user namespace, let the child exit immediately. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Add podman image mountDaniel J Walsh2020-07-28
| | | | | | | | | | | | | There are many use cases where you want to just mount an image without creating a container on it. For example you might want to just examine the content in an image after you pull it for security analysys. Or you might want to just use the executables on the image without running it in a container. The image is mounted readonly since we do not want people changing images. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Cleanup handling of podman mount/unmountDaniel J Walsh2020-07-27
| | | | | | | | | | | We should default to the user name unmount rather then the internal name of umount. Also User namespace was not being handled correctly. We want to inform the user that if they do a mount when in rootless mode that they have to be first in the podman unshare state. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* rootless: move ns open before forkGiuseppe Scrivano2020-04-29
| | | | | | | | | | commit 788fdc685b00dee5ccb594bef845204250c4c123 introduced a race where the target process dies before the child process opens the namespace files. Move the open before the fork so if it fails the parent process can attempt to join a different container instead of failing. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: move join namespace inside child processGiuseppe Scrivano2020-04-20
| | | | | | | | open the namespace file descriptors inside of the child process. Closes: https://github.com/containers/libpod/issues/5873 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: use snprintfGiuseppe Scrivano2020-04-13
| | | | | | use directly snprintf instead of strlen+strcpy. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: fix usage with hidepid=1Giuseppe Scrivano2020-03-19
| | | | | | | | | | | when /proc is mounted with hidepid=1 a process doesn't see processes from the outer user namespace. This causes an issue reading the cmdline from the parent process. To address it, always read the command line from /proc/self instead of using /proc/PARENT_PID. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: fix segfault when open fd >= FD_SETSIZEGiuseppe Scrivano2020-02-25
| | | | | | | | | if there are more than FD_SETSIZE open fds passed down to the Podman process, the initialization code could crash as it attempts to store them into a fd_set. Use an array of fd_set structs, each of them holding only FD_SETSIZE file descriptors. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: enable shortcut only for podmanGiuseppe Scrivano2020-01-29
| | | | | | | disable joining automatically the user namespace if the process is not podman. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: set C variables also on shortcutGiuseppe Scrivano2020-01-20
| | | | | | | | | make sure the rootless env variables are set also when we are joining directly the user+mount namespace without creating a new process. It is required by pkg/unshare in containers/common. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: add fallback for renameat2 at runtimeGiuseppe Scrivano2019-12-04
| | | | | | | | | | | | the renameat2 syscall might be defined in the C library but lacking support in the kernel. In such case, let it fallback to open(O_CREAT)+rename as it does on systems lacking the definition for renameat2. Closes: https://github.com/containers/libpod/issues/4570 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: provide workaround for missing renameat2Giuseppe Scrivano2019-11-06
| | | | | | | | | on RHEL 7.7 renameat2 is not implemented for s390x, provide a workaround. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1768519 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: use SYS_renameat2 instead of __NR_renameat2Giuseppe Scrivano2019-11-06
| | | | | | use the correct definition for the syscall number. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Merge pull request #3782 from eriksjolund/fix_realloc_in_rootless_linux.cOpenShift Merge Robot2019-08-11
|\ | | | | Fix incorrect use of realloc()
| * Fix incorrect use of realloc()Erik Sjölund2019-08-11
| | | | | | | | Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
* | Adjust read count so that a newline can be added afterwardsErik Sjölund2019-08-11
|/ | | | Signed-off-by: Erik Sjölund <erik.sjolund@gmail.com>
* Fix a couple of errors descovered by coverityDaniel J Walsh2019-08-09
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* rootless: do not join namespace if it has already euid == 0Giuseppe Scrivano2019-07-01
| | | | | | | | | do not attempt to join the rootless namespace if it is running already with euid == 0. Closes: https://github.com/containers/libpod/issues/3463 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Exclude SIGTERM from blocked signals for pause process.Danila Kiver2019-06-28
| | | | | | | | | | | | Currently pause process blocks all signals which may cause its termination, including SIGTERM. This behavior hangs init(1) during system shutdown, until pause process gets SIGKILLed after some grace period. To avoid this hanging, SIGTERM is excluded from list of blocked signals. Fixes #3440 Signed-off-by: Danila Kiver <danila.kiver@mail.ru>
* Merge pull request #3379 from openSUSE/rootless-fixOpenShift Merge Robot2019-06-21
|\ | | | | Fix format specifiers in rootless_linux.c
| * Fix format specifiers in rootless_linux.cSascha Grunert2019-06-20
| | | | | | | | | | | | | | Format `%d` expects argument of type `int`, but the argument has a type of `long int`. Signed-off-by: Sascha Grunert <sgrunert@suse.com>
* | Merge pull request #3380 from openSUSE/asprintf-fixOpenShift Merge Robot2019-06-20
|\ \ | | | | | | Handle possible asprintf failure in rootless_linux.c
| * | Handle possible asprintf failure in rootless_linux.cSascha Grunert2019-06-20
| |/ | | | | | | | | | | If `asprintf` fails we early exit now. Signed-off-by: Sascha Grunert <sgrunert@suse.com>
* / Fix execvp uage in rootless_linux.cSascha Grunert2019-06-20
|/ | | | | | | The second argument of `execlp` should be of type `char *`, so we need to add an additional argument there. Signed-off-by: Sascha Grunert <sgrunert@suse.com>
* rootless: block signals on re-execGiuseppe Scrivano2019-06-03
| | | | | | | | | | | | | we are allowed to use only signal safe functions between a fork of a multithreaded application and the next execve. Since setenv(3) is not signal safe, block signals. We are already doing it for creating a new namespace. This is mostly a cleanup since reexec_in_user_namespace_wait is used only only to join existing namespaces when we have not a pause.pid file. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: use TEMP_FAILURE_RETRY macroGiuseppe Scrivano2019-05-31
| | | | | | avoid checking for EINTR for every syscall that could block. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: fix return typeGiuseppe Scrivano2019-05-31
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: make sure the buffer is NUL terminatedGiuseppe Scrivano2019-05-31
| | | | | | | after we read from the pause PID file, NUL terminate the buffer to avoid reading garbage from the stack. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: new function to join existing conmon processesGiuseppe Scrivano2019-05-25
| | | | | | | | | | | | | | | move the logic for joining existing namespaces down to the rootless package. In main_local we still retrieve the list of conmon pid files and use it from the rootless package. In addition, create a temporary user namespace for reading these files, as the unprivileged user might not have enough privileges for reading the conmon pid file, for example when running with a different uidmap and root in the container is different than the rootless user. Closes: https://github.com/containers/libpod/issues/3187 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: block signals for pauseGiuseppe Scrivano2019-05-25
| | | | | | | block signals for the pause process, so it can't be killed by mistake. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: store also the original GID in the hostGiuseppe Scrivano2019-05-23
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: join namespace immediately when possibleGiuseppe Scrivano2019-05-17
| | | | | | | | | | | add a shortcut for joining immediately the namespace so we don't need to re-exec Podman. With the pause process simplificaton, we can now attempt to join the namespaces as soon as Podman starts (and before the Go runtime kicks in), so that we don't need to re-exec and use just one process. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: use a pause processGiuseppe Scrivano2019-05-17
| | | | | | | | | | | | | | | | | use a pause process to keep the user and mount namespace alive. The pause process is created immediately on reload, and all successive Podman processes will refer to it for joining the user&mount namespace. This solves all the race conditions we had on joining the correct namespaces using the conmon processes. As a fallback if the join fails for any reason (e.g. the pause process was killed), then we try to join the running containers as we were doing before. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: not close more FDs than neededGiuseppe Scrivano2019-04-18
| | | | | | | | | | | | we were previously closing as many FDs as they were open when we first started Podman in the range (3-MAX-FD). This would cause issues if there were empty intervals, as these FDs are later on used by the Golang runtime. Store exactly what FDs were first open in a fd_set, so that we can close exactly the FDs that were open at startup. Closes: https://github.com/containers/libpod/issues/2964 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Revert "rootless: set controlling terminal for podman in the userns"Giuseppe Scrivano2019-04-14
| | | | | | | | This reverts commit 531514e8231e7f42efb7e7992d62e516f9577363. Closes: https://github.com/containers/libpod/issues/2926 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: set controlling terminal for podman in the usernsGiuseppe Scrivano2019-04-12
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: use a single user namespaceGiuseppe Scrivano2019-04-01
| | | | | | | | | | | | | | | | | | | | | simplify the rootless implementation to use a single user namespace for all the running containers. This makes the rootless implementation behave more like root Podman, where each container is created in the host environment. There are multiple advantages to it: 1) much simpler implementation as there is only one namespace to join. 2) we can join namespaces owned by different containers. 3) commands like ps won't be limited to what container they can access as previously we either had access to the storage from a new namespace or access to /proc when running from the host. 4) rootless varlink works. 5) there are only two ways to enter in a namespace, either by creating a new one if no containers are running or joining the existing one from any container. Containers created by older Podman versions must be restarted. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: change env prefixGiuseppe Scrivano2019-03-28
| | | | | | | | | | from _LIBPOD to _CONTAINERS. The same change was done in buildah unshare. This is necessary for podman to detect we are running in a rootless environment and work properly from a "buildah unshare" session. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: propagate errors from infoGiuseppe Scrivano2019-03-08
| | | | | | | | | we use "podman info" to reconfigure the runtime after a reboot, but we don't propagate the error message back if something goes wrong. Closes: https://github.com/containers/libpod/issues/2584 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: fix clone syscall on s390 and cris archsGiuseppe Scrivano2019-03-05
| | | | | | | | | | | | | | | from the clone man page: On the cris and s390 architectures, the order of the first two arguments is reversed: long clone(void *child_stack, unsigned long flags, int *ptid, int *ctid, unsigned long newtls); Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1672714 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: force same cwd when re-execingGiuseppe Scrivano2019-02-22
| | | | | | | | | | when joining an existing namespace, we were not maintaining the current working directory, causing commands like export -o to fail when they weren't referring to absolute paths. Closes: https://github.com/containers/libpod/issues/2381 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Adjust LISTEN_PID for reexec in varlink modeHarald Hoyer2019-02-21
| | | | | | | | Because the varlink server honors the socket activation protocol, LISTEN_PID has to be adjusted with the new PID. https://varlink.org/FAQ.html#how-does-socket-activation-work Signed-off-by: Harald Hoyer <harald@redhat.com>
* Cleanup coverity scan issuesDaniel J Walsh2019-01-15
| | | | | | If realloc fails, then buffer will be leaked, this change frees up the buffer. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* rootless: add function to join user and mount namespaceGiuseppe Scrivano2018-12-21
| | | | | | | | | | Add the possibility to join directly the user and mount namespace without looking up the parent of the user namespace. We need this in order to be able the conmon process, as the mount namespace is kept alive only there. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: detect when user namespaces are not enabledGiuseppe Scrivano2018-10-11
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: report more error messages from the startup phaseGiuseppe Scrivano2018-10-11
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: fix an hang on older versions of setresuid/setresgidGiuseppe Scrivano2018-10-11
| | | | | | | | | | | | | | | | the issue is caused by the Go Runtime that messes up with the process signals, overriding SIGSETXID and SIGCANCEL which are used internally by glibc. They are used to inform all the threads to update their stored uid/gid information. This causes a hang on the set*id glibc wrappers since the handler installed by glibc is never invoked. Since we are running with only one thread, we don't really need to update other threads or even the current thread as we are not using getuid/getgid before the execvp. Closes: https://github.com/containers/libpod/issues/1625 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: check uid with Geteuid() instead of Getuid()Giuseppe Scrivano2018-09-04
| | | | | | | | | | | | change the tests to use chroot to set a numeric UID/GID. Go syscall.Credential doesn't change the effective UID/GID of the process. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com> Closes: #1372 Approved by: mheon