| Commit message (Collapse) | Author | Age |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CRIU supports to leave processes running after checkpointing:
-R|--leave-running leave tasks in running state after checkpoint
runc also support to leave containers running after checkpointing:
--leave-running leave the process running after checkpointing
With this commit the support to leave a container running after
checkpointing is brought to Podman:
--leave-running, -R leave the container running after writing checkpoint to disk
Now it is possible to checkpoint a container at some point in time
without stopping the container. This can be used to rollback the
container to an early state:
$ podman run --tmpfs /tmp --name podman-criu-test -d docker://docker.io/yovfiatbeb/podman-criu-test
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
3
$ podman container checkpoint -R -l
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
4
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
5
$ podman stop -l
$ podman container restore -l
$ curl 10.88.64.253:8080/examples/servlets/servlet/HelloWorldExample
4
So after checkpointing the container kept running and was stopped after
some time. Restoring this container will restore the state right at the
checkpoint.
Signed-off-by: Adrian Reber <areber@redhat.com>
|
|
|
|
|
|
|
|
|
| |
For upcoming changes to the Checkpoint() functions this commit switches
checkpoint options from a boolean to a struct, so that additional
options can be passed easily to Checkpoint() without changing the
function parameters all the time.
Signed-off-by: Adrian Reber <areber@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
At scale, it appears that we sometimes hit the 1000ms timeout to create
the PID file when a container is created or executed.
Increasing the value to 60s should help when running a lot of containers
in heavy-loaded environment.
Related #1495
Fixes #1816
Signed-off-by: Emilien Macchi <emilien@redhat.com>
|
|\
| |
| | |
move defer'd function declaration ahead of prepare error return
|
| |
| |
| |
| | |
Signed-off-by: baude <bbaude@redhat.com>
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When syncing container state, we normally call out to runc to see
the container's status. This does have significant performance
implications, though, and we've seen issues with large amounts of
runc processes being spawned.
This patch attempts to use stat calls on the container exit file
created by Conmon instead to sync state. This massively decreases
the cost of calling updateContainer (it has gone from an
almost-unconditional fork/exec of runc to a single stat call that
can be avoided in most states).
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
|
|\
| |
| | |
get user and group information using securejoin and runc's user library
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
for the purposes of performance and security, we use securejoin to contstruct
the root fs's path so that symlinks are what they appear to be and no pointing
to something naughty.
then instead of chrooting to parse /etc/passwd|/etc/group, we now use the runc user/group
methods which saves us quite a bit of performance.
Signed-off-by: baude <bbaude@redhat.com>
|
|/
|
|
|
|
|
|
|
| |
Only return `ErrCtrStateInvalid` errors when the mount counter is equal
to 1. Also fix the "can't unmount [...] last mount[..]" error which
hasn't been returned when the error passed to `errors.Errorf()` is nil.
Fixes: #1695
Signed-off-by: Valentin Rothberg <vrothberg@suse.com>
|
|
|
|
|
|
| |
Like Ricky Bobby, we want to go fast.
Signed-off-by: baude <bbaude@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
runc uses CRIU to support checkpoint and restore of containers. This
brings an initial checkpoint/restore implementation to podman.
None of the additional runc flags are yet supported and container
migration optimization (pre-copy/post-copy) is also left for the future.
The current status is that it is possible to checkpoint and restore a
container. I am testing on RHEL-7.x and as the combination of RHEL-7 and
CRIU has seccomp troubles I have to create the container without
seccomp.
With the following steps I am able to checkpoint and restore a
container:
# podman run --security-opt="seccomp=unconfined" -d registry.fedoraproject.org/f27/httpd
# curl -I 10.22.0.78:8080
HTTP/1.1 403 Forbidden # <-- this is actually a good answer
# podman container checkpoint <container>
# curl -I 10.22.0.78:8080
curl: (7) Failed connect to 10.22.0.78:8080; No route to host
# podman container restore <container>
# curl -I 10.22.0.78:8080
HTTP/1.1 403 Forbidden
I am using CRIU, runc and conmon from git. All required changes for
checkpoint/restore support in podman have been merged in the
corresponding projects.
To have the same IP address in the restored container as before
checkpointing, CNI is told which IP address to use.
If the saved network configuration cannot be found during restore, the
container is restored with a new IP address.
For CRIU to restore established TCP connections the IP address of the
network namespace used for restore needs to be the same. For TCP
connections in the listening state the IP address can change.
During restore only one network interface with one IP address is handled
correctly. Support to restore containers with more advanced network
configuration will be implemented later.
v2:
* comment typo
* print debug messages during cleanup of restore files
* use createContainer() instead of createOCIContainer()
* introduce helper CheckpointPath()
* do not try to restore a container that is paused
* use existing helper functions for cleanup
* restructure code flow for better readability
* do not try to restore if checkpoint/inventory.img is missing
* git add checkpoint.go restore.go
v3:
* move checkpoint/restore under 'podman container'
v4:
* incorporated changes from latest reviews
Signed-off-by: Adrian Reber <areber@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To work better with Kata containers, we need to delete() from the
OCI runtime as a part of cleanup, to ensure resources aren't
retained longer than they need to be.
To enable this, we need to add a new state to containers,
ContainerStateExited. Containers transition from
ContainerStateStopped to ContainerStateExited via cleanupRuntime
which is invoked as part of cleanup(). A container in the Exited
state is identical to Stopped, except it has been removed from
the OCI runtime and thus will be handled differently when
initializing the container.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
| |
We added a timeout for convenience, but most invocations don't
care about it. Refactor it into WaitWithTimeout() and add a
Wait() that doesn't require a timeout and uses the default.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #1527
Approved by: mheon
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When managing the containers with systemd, it takes a bit more than
250ms to have podman creating the pidfile.
Increasing the value to 1 second will avoid timeout issues when running
a lot of containers managed by systemd.
This patch was tested in a VM with 56 services (OpenStack) deployed by
TripleO and managed by systemd.
Fixes #1495
Signed-off-by: Emilien Macchi <emilien@redhat.com>
Closes: #1497
Approved by: rhatdan
|
|
|
|
|
|
|
| |
Waiting uses a lot of CPU, so drop back to checking once/second
and allow user to pass in the interval.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
|
|
| |
Prior to this patch, we were polling continuously to check if a
container had died. This patch changes this to poll 10 times a
second, which should be more than sufficient and drastically
reduce CPU utilization.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Manage the case where the main process of the container creates and
joins a new user namespace.
In this case we want to join only the first child in the new
hierarchy, which is the user namespace that was used to create the
container.
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
Closes: #1331
Approved by: rhatdan
|
|
|
|
|
|
|
|
|
|
|
| |
Runc exec expects the --user flag to be formatted as UID:GID.
Use chrootuser code to translate whatever user is passed to exec
into this format.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #1315
Approved by: vrothberg
|
|
|
|
|
|
|
|
|
|
| |
Need to get some small changes into libpod to pull back into buildah
to complete buildah transition.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #1270
Approved by: mheon
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we add mounts from images, volumes and internal.
We can accidently over mount an existing mount. This patch sorts the mounts
to make sure a parent directory is always mounted before its content.
Had to change the default propagation on image volume mounts from shared
to private to stop mount points from leaking out of the container.
Also switched from using some docker/docker/pkg to container/storage/pkg
to remove some dependencies on Docker.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #1243
Approved by: mheon
|
|
|
|
|
|
|
|
|
|
|
|
| |
podman umount will currently only unmount file system if not other
process is using it, otherwise the umount decrements the container
storage to indicate that the caller is no longer using the mount
point, once the count gets to 0, the file system is actually unmounted.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #1184
Approved by: TomSweeneyRedHat
|
|
|
|
|
|
|
|
|
|
|
| |
Moved contents of RestartWithTimeout to restartWithTimeout in container_internal to be able to call restart without locking in function.
Refactored startNode to be able to either start or restart a node.
Built pod Restart() with new startNode with refresh true.
Signed-off-by: haircommander <pehunt@redhat.com>
Closes: #1152
Approved by: rhatdan
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we unmount storage that is still in use.
We should not be unmounting storeage that we mounted
via a different command or by podman mount. This
change relies on containers/storage to umount keep track of
how many times the storage was mounted before really unmounting
it from the system.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
|
|
|
|
|
|
|
| |
Signed-off-by: Matthew Heon <mheon@redhat.com>
Closes: #1068
Approved by: baude
|
|
|
|
|
|
|
| |
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #981
Approved by: baude
|
|
|
|
|
|
|
|
|
|
| |
The Refresh() function is used to reset a container's state after
a database format change to state is made that requires migration
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #981
Approved by: baude
|
|
|
|
|
|
|
|
|
|
| |
Since we are checking if err is non nil in defer function we need
to define it, so that the check will work correctly.
Signed-off-by: Daniel J Walsh <dwalsh@redhat.com>
Closes: #985
Approved by: mheon
|
|
|
|
|
|
|
|
|
|
|
| |
Move the StartContainer call after the attach to the UNIX socket. It
solves a race where the StartContainer could be done earlier and a
short-lived container could already exit by the time we tried to
attach to the socket.
Closes: https://github.com/projectatomic/libpod/issues/835
Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
|
|
|
|
|
|
|
|
|
|
|
| |
This allows us to restart containers that have never been started
without error. This makes RestartWithTimeout work with running,
stopped, and created containers.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #719
Approved by: rhatdan
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
first pass at adding in the container related endpoints/methods for the libpod
backend. Couple of important notes:
* endpoints that can use a console are not going to be done until we have "remote" console
* several of the container methods should probably be able to stream as opposed to a one-off return
Signed-off-by: baude <bbaude@redhat.com>
Closes: #708
Approved by: baude
|
|
|
|
|
|
|
|
|
| |
Made necessary changes to functions to include contex.Context wherever needed
Signed-off-by: umohnani8 <umohnani@redhat.com>
Closes: #640
Approved by: baude
|
|
|
|
|
|
|
| |
Signed-off-by: TomSweeneyRedHat <tsweeney@redhat.com>
Closes: #619
Approved by: mheon
|
|
|
|
|
|
|
|
|
|
|
| |
Comparing Go interfaces, like io.Reader, to nil does not work. As
such, we need to include a bool with each stream telling whether
to attach to it.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #608
Approved by: baude
|
|
|
|
|
|
|
|
|
|
| |
This allows us to attach to attach to just stdout or stderr or
stdin, or any combination of these.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #608
Approved by: baude
|
|
|
|
|
|
|
| |
Signed-off-by: Matthew Heon <mheon@redhat.com>
Closes: #610
Approved by: giuseppe
|
|
|
|
|
|
|
|
| |
Resolves: #586 and #520
Signed-off-by: baude <bbaude@redhat.com>
Closes: #592
Approved by: mheon
|
|
|
|
|
|
|
|
|
| |
Resolves: #575
Signed-off-by: baude <bbaude@redhat.com>
Closes: #588
Approved by: mheon
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of checking during init(), which could result in major
locking issues when used with pods, make our dependency checks in
the public API instead. This avoids doing them when we start pods
(where, because of the dependency graph, we can reasonably say
all dependencies are up before we start a container).
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #577
Approved by: rhatdan
|
|
|
|
|
|
|
|
|
| |
This will help dependency races
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #577
Approved by: rhatdan
|
|
|
|
|
|
|
|
|
|
| |
Cull funcs from runtime_img.go which are no longer needed. Also, fix any remaining
spots that use the old image technique.
Signed-off-by: baude <bbaude@redhat.com>
Closes: #532
Approved by: mheon
|
|
|
|
|
|
|
|
|
|
| |
Migrate the podman create and commit subcommandis to leverage the images library. I also had
to migrate the cmd/ portions of run and rmi.
Signed-off-by: baude <bbaude@redhat.com>
Closes: #498
Approved by: mheon
|
|
|
|
|
|
|
| |
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #506
Approved by: rhatdan
|
|
|
|
|
|
|
| |
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #482
Approved by: baude
|
|
|
|
|
|
|
|
|
|
|
|
| |
This solves our prior problems with attach races by ensuring the
order is correct.
Also contains substantial cleanups to the attach code.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #482
Approved by: baude
|
|
|
|
|
|
|
| |
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #482
Approved by: baude
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Separate Init() and Start() does not make sense on the pod side,
where we may have to start containers in order to initialize
others due to dependency orders.
Also adjusts internal containers API for more code sharing.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #478
Approved by: rhatdan
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Refactors creation of bind mounts into a separate function that
can be called from elsewhere (e.g. pod start or container
restart). This function stores the mounts in the DB using the
field established last commit.
Spec generation now relies upon this field in the DB instead of
manually enumerating files to be bind mounted in.
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #462
Approved by: baude
|
|
|
|
|
|
|
| |
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #462
Approved by: baude
|
|
|
|
|
|
|
|
|
| |
It will be needed for restarting containers
Signed-off-by: Matthew Heon <matthew.heon@gmail.com>
Closes: #462
Approved by: baude
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The progress should not be show for import, load, and commit. It makes machine
parsing of the output much more difficult. Also, each command should output an
image ID or name for the user.
Added a --verbose flag for users that still want to see progress.
Resolves issue #450
Signed-off-by: baude <bbaude@redhat.com>
Closes: #456
Approved by: rhatdan
|