summaryrefslogtreecommitdiff
path: root/pkg/spec/spec.go
Commit message (Collapse)AuthorAge
* Split up create config handling of namespaces and securityPeter Hunt2019-11-07
| | | | | | | | As it stands, createconfig is a huge struct. This works fine when the only caller is when we create a container with a fully created config. However, if we wish to share code for security and namespace configuration, a single large struct becomes unweildy, as well as difficult to configure with the single createConfigToOCISpec function. This PR breaks up namespace and security configuration into their own structs, with the eventual goal of allowing the namespace/security fields to be configured by the pod create cli, and allow the infra container to share this with the pod's containers. Signed-off-by: Peter Hunt <pehunt@redhat.com>
* namespaces: by default create cgroupns on cgroups v2Giuseppe Scrivano2019-11-05
| | | | | | | | | | | | | | | | change the default on cgroups v2 and create a new cgroup namespace. When a cgroup namespace is used, processes inside the namespace are only able to see cgroup paths relative to the cgroup namespace root and not have full visibility on all the cgroups present on the system. The previous behaviour is maintained on a cgroups v1 host, where a cgroup namespace is not created by default. Closes: https://github.com/containers/libpod/issues/4363 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* add libpod/configValentin Rothberg2019-10-31
| | | | | | | | | | | | Refactor the `RuntimeConfig` along with related code from libpod into libpod/config. Note that this is a first step of consolidating code into more coherent packages to make the code more maintainable and less prone to regressions on the long runs. Some libpod definitions were moved to `libpod/define` to resolve circular dependencies. Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* rootless: detect no system session with --cgroup-manager=systemdGiuseppe Scrivano2019-10-23
| | | | | | | if the cgroup manager is set to systemd, detect if dbus is available, otherwise fallback to --cgroup-manager=cgroupfs. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: do not set PIDs limit if --cgroup-manager=cgroupfsGiuseppe Scrivano2019-10-11
| | | | | | | even if the system is using cgroups v2, rootless is not able to setup limits when the cgroup-manager is not systemd. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Setup a reasonable default for pids-limit 4096Daniel J Walsh2019-10-04
| | | | | | | | | | | CRI-O defaults to 1024 for the maximum pids in a container. Podman should have a similar limit. Once we have a containers.conf, we can set the limit in this file, and have it easily customizable. Currently the documentation says that -1 sets pids-limit=max, but -1 fails. This patch allows -1, but also indicates that 0 also sets the max pids limit. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* rootless: allow cgroupfs manager on cgroups v2Giuseppe Scrivano2019-10-02
| | | | | | | if there are no resources specified, make sure the OCI resources block is empty so that the OCI runtime won't complain. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Add support for launching containers without CGroupsMatthew Heon2019-09-10
| | | | | | | This is mostly used with Systemd, which really wants to manage CGroups itself when managing containers via unit file. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Set base mount options for bind mounts from base systemMatthew Heon2019-08-28
| | | | | | | | | | | | | | | | | | If I mount, say, /usr/bin into my container - I expect to be able to run the executables in that mount. Unconditionally applying noexec would be a bad idea. Before my patches to change mount options and allow exec/dev/suid being set explicitly, we inferred the mount options from where on the base system the mount originated, and the options it had there. Implement the same functionality for the new option handling. There's a lot of performance left on the table here, but I don't know that this is ever going to take enough time to make it worth optimizing. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add support for 'exec', 'suid', 'dev' mount flagsMatthew Heon2019-08-28
| | | | | | | | | | | | | | | | | | Previously, we explicitly set noexec/nosuid/nodev on every mount, with no ability to disable them. The 'mount' command on Linux will accept their inverses without complaint, though - 'noexec' is counteracted by 'exec', 'nosuid' by 'suid', etc. Add support for passing these options at the command line to disable our explicit forcing of security options. This also cleans up mount option handling significantly. We are still parsing options in more than one place, which isn't good, but option parsing for bind and tmpfs mounts has been unified. Fixes: #3819 Fixes: #3803 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* refer to container whose namespace we sharePeter Hunt2019-08-07
| | | | Signed-off-by: Peter Hunt <pehunt@redhat.com>
* Properly share UTS namespaces in a podPeter Hunt2019-08-07
| | | | | | Sharing a UTS namespace means sharing the hostname. Fix situations where a container in a pod didn't properly share the hostname of the pod. Signed-off-by: Peter Hunt <pehunt@redhat.com>
* Vendor in buildah 1.9.2Daniel J Walsh2019-07-30
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* podman: support --userns=ns|containerGiuseppe Scrivano2019-07-25
| | | | | | | | allow to join the user namespace of another container. Closes: https://github.com/containers/libpod/issues/3629 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Merge pull request #3593 from giuseppe/rootless-privileged-devicesOpenShift Merge Robot2019-07-18
|\ | | | | rootless: add host devices with --privileged
| * rootless: add rw devices with --privilegedGiuseppe Scrivano2019-07-18
| | | | | | | | | | | | | | | | | | when --privileged is specified, add all the devices that are usable by the user. Closes: https://bugzilla.redhat.com/show_bug.cgi?id=1730773 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | libpod: support for cgroup namespaceGiuseppe Scrivano2019-07-18
|/ | | | | | | | | | | | | | allow a container to run in a new cgroup namespace. When running in a new cgroup namespace, the current cgroup appears to be the root, so that there is no way for the container to access cgroups outside of its own subtree. By default it uses --cgroup=host to keep the previous behavior. To create a new namespace, --cgroup=private must be provided. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Populate inspect with security-opt settingsMatthew Heon2019-07-17
| | | | | | | | We can infer no-new-privileges. For now, manually populate seccomp (can't infer what file we sourced from) and SELinux/Apparmor (hard to tell if they're enabled or not). Signed-off-by: Matthew Heon <mheon@redhat.com>
* Move the HostConfig portion of Inspect inside libpodMatthew Heon2019-07-17
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we first began writing Podman, we ran into a major issue when implementing Inspect. Libpod deliberately does not tie its internal data structures to Docker, and stores most information about containers encoded within the OCI spec. However, Podman must present a CLI compatible with Docker, which means it must expose all the information in 'docker inspect' - most of which is not contained in the OCI spec or libpod's Config struct. Our solution at the time was the create artifact. We JSON'd the complete CreateConfig (a parsed form of the CLI arguments to 'podman run') and stored it with the container, restoring it when we needed to run commands that required the extra info. Over the past month, I've been looking more at Inspect, and refactored large portions of it into Libpod - generating them from what we know about the OCI config and libpod's (now much expanded, versus previously) container configuration. This path comes close to completing the process, moving the last part of inspect into libpod and removing the need for the create artifact. This improves libpod's compatability with non-Podman containers. We no longer require an arbitrarily-formatted JSON blob to be present to run inspect. Fixes: #3500 Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* spec: rework --ulimit hostGiuseppe Scrivano2019-07-17
| | | | | | | it seems enough to not specify any ulimit block to maintain the host limits. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Merge pull request #3563 from giuseppe/fix-single-mapping-rootlessOpenShift Merge Robot2019-07-12
|\ | | | | spec: fix userns with less than 5 gids
| * spec: fix userns with less than 5 gidsGiuseppe Scrivano2019-07-12
| | | | | | | | | | | | | | when the container is running in a user namespace, check if gid=5 is available, otherwise drop the option gid=5 for /dev/pts. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | Merge pull request #3491 from giuseppe/rlimit-hostOpenShift Merge Robot2019-07-11
|\ \ | |/ |/| podman: add --ulimit host
| * podman: add --ulimit hostGiuseppe Scrivano2019-07-08
| | | | | | | | | | | | | | | | | | add a simple way to copy ulimit values from the host. if --ulimit host is used then the current ulimits in place are copied to the container. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* | first pass of corrections for golangci-lintbaude2019-07-10
|/ | | | Signed-off-by: baude <bbaude@redhat.com>
* util: drop IsCgroup2UnifiedMode and use it from cgroupsGiuseppe Scrivano2019-06-26
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: force resources to be nil on cgroup v1Giuseppe Scrivano2019-05-20
| | | | | | | | | force the resources block to be empty instead of having default values. Regression introduced by 8e88461511e81d2327e4c1a1315bb58fda1827ca Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Don't set apparmor if --priviligedDaniel J Walsh2019-05-20
| | | | Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* rootless, spec: allow resources with cgroup v2Giuseppe Scrivano2019-05-13
| | | | | | | We were always raising an error when the rootless user attempted to setup resources, but this is not the case anymore with cgroup v2. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Move handling of ReadOnlyTmpfs into new mounts codeMatthew Heon2019-05-01
| | | | Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Migrate to unified volume handling codeMatthew Heon2019-05-01
| | | | | | | | | | | | | Unify handling for the --volume, --mount, --volumes-from, --tmpfs and --init flags into a single file and set of functions. This will greatly improve readability and maintainability. Further, properly handle superceding and conflicting mounts. Our current patchwork has serious issues when mounts conflict, or when a mount from --volumes-from or an image volume should be overwritten by a user volume or named volume. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Remove non-config fields from CreateConfigMatthew Heon2019-05-01
| | | | | | | | | The goal here is to keep only the configuration directly used to build the container in CreateConfig, and scrub temporary state and helpers that we need to generate. We'll keep those internally in MakeContainerConfig. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add a new function for converting a CreateConfigMatthew Heon2019-05-01
| | | | | | | | | | | Right now, there are two major API calls necessary to turn a filled-in CreateConfig into the options and OCI spec necessary to make a libpod Container. I'm intending on refactoring both of these extensively to unify a few things, so make a common frontend to both that will prevent API changes from leaking out of the package. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* auto pass http_proxy into containerJames Cassell2019-04-30
| | | | Signed-off-by: James Cassell <code@james.cassell.me>
* Add --read-only-tmpfs optionsDaniel J Walsh2019-04-26
| | | | | | | | | | | The --read-only-tmpfs option caused podman to mount tmpfs on /run, /tmp, /var/tmp if the container is running int read-only mode. The default is true, so you would need to execute a command like --read-only --read-only-tmpfs=false to turn off this behaviour. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* spec: mask /sys/kernel when bind mounting /sysGiuseppe Scrivano2019-04-11
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* oci: add /sys/kernel to the masked pathsGiuseppe Scrivano2019-04-11
| | | | Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Drop LocalVolumes from our the databaseMatthew Heon2019-04-04
| | | | | | | | We were never using it. It's actually a potentially quite sizable field (very expensive to decode an array of structs!). Removing it should do no harm. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Add handling for new named volumes code in pkg/specMatthew Heon2019-04-04
| | | | | | | | | Now that named volumes must be explicitly enumerated rather than passed in with all other volumes, we need to split normal and named volumes up before passing them into libpod. This PR does this. Signed-off-by: Matthew Heon <matthew.heon@pm.me>
* Vendor docker/docker, fsouza and more #2TomSweeneyRedHat2019-03-13
| | | | | | | | | | | | Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com> Vendors in fsouza/docker-client, docker/docker and a few more related. Of particular note, changes to the TweakCapabilities() function from docker/docker along with the parse.IDMappingOptions() function from Buildah. Please pay particular attention to the related changes in the call from libpod to those functions during the review. Passes baseline tests.
* Fix SELinux on host shared systems in usernsDaniel J Walsh2019-03-11
| | | | | | | | | | | | | Currently if you turn on --net=host on a rootless container and have selinux-policy installed in the image, tools running with SELinux will see that the system is SELinux enabled in rootless mode. This patch mounts a tmpfs over /sys/fs/selinux blocking this behaviour. This patch also fixes the fact that if you shared --pid=host we were not masking over certin /proc paths. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* podman: fix ro bind mounts if no* opts are on the sourceGiuseppe Scrivano2019-02-25
| | | | | | | | | | | | | | This is a workaround for the runc issue: https://github.com/opencontainers/runc/issues/1247 If the source of a bind mount has any of nosuid, noexec or nodev, be sure to propagate them to the bind mount so that when runc tries to remount using MS_RDONLY, these options are also used. Closes: https://github.com/containers/libpod/issues/2312 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: do not override /dev/pts if not neededGiuseppe Scrivano2019-02-06
| | | | | | | | | | when running in rootless mode we were unconditionally overriding /dev/pts to take ride of gid=5. This is not needed when multiple gids are present in the namespace, which is always the case except when running the tests suite with only one mapping. So change it to check how many gids are present before overriding the default mount. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* rootless: fix --pid=host without --privilegedGiuseppe Scrivano2019-01-18
| | | | | | | When using --pid=host don't try to cover /proc paths, as they are coming from the /proc bind mounted from the host. Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* apparmor: apply default profile at container initializationValentin Rothberg2019-01-09
| | | | | | | | | | | | | | | | | | | Apply the default AppArmor profile at container initialization to cover all possible code paths (i.e., podman-{start,run}) before executing the runtime. This allows moving most of the logic into pkg/apparmor. Also make the loading and application of the default AppArmor profile versio-indepenent by checking for the `libpod-default-` prefix and over-writing the profile in the run-time spec if needed. The intitial run-time spec of the container differs a bit from the applied one when having started the container, which results in displaying a potentially outdated AppArmor profile when inspecting a container. To fix that, load the container config from the file system if present and use it to display the data. Fixes: #2107 Signed-off-by: Valentin Rothberg <rothberg@redhat.com>
* Update vendor of runcDaniel J Walsh2019-01-04
| | | | | | | | | | | | | | Updating the vendor or runc to pull in some fixes that we need. In order to get this vendor to work, we needed to update the vendor of docker/docker, which causes all sorts of issues, just to fix the docker/pkg/sysinfo. Rather then doing this, I pulled in pkg/sysinfo into libpod and fixed the code locally. I then switched the use of docker/pkg/sysinfo to libpod/pkg/sysinfo. I also switched out the docker/pkg/mount to containers/storage/pkg/mount Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Fixes to handle /dev/shm correctly.Daniel J Walsh2018-12-24
| | | | | | | | | | | | | | | | | | We had two problems with /dev/shm, first, you mount the container read/only then /dev/shm was mounted read/only. This is a bug a tmpfs directory should be read/write within a read-only container. The second problem is we were ignoring users mounted /dev/shm from the host. If user specified podman run -d -v /dev/shm:/dev/shm ... We were dropping this mount and still using the internal mount. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* Allow users to specify a directory for additonal devicesDaniel J Walsh2018-12-21
| | | | | | | Podman will search through the directory and will add any device nodes that it finds. If no devices are found we return an error. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
* rootless: add new netmode "slirp4netns"Giuseppe Scrivano2018-11-27
| | | | | | | | so that inspect reports the correct network configuration. Closes: https://github.com/containers/libpod/issues/1453 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
* Mount proper cgroup for systemd to manage inside of the container.Daniel J Walsh2018-10-15
| | | | | | | | | | | | We are still requiring oci-systemd-hook to be installed in order to run systemd within a container. This patch properly mounts /sys/fs/cgroup/systemd/libpod_parent/libpod-UUID on /sys/fs/cgroup/systemd inside of container. Since we need the UUID of the container, we needed to move Systemd to be a config option of the container. Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>