On pod delete, we were looking to read files that we had just deleted. In particular,
stopSandbox for QEMU was called (we cleanup up vmpath), and then QEMU's
save function was called, which immediately checks for the PID file.
Let's only update the persist store for QEMU if QEMU is actually
running. This'll avoid Error messages being displayed when we are
stopping and deleting a sandbox:
```
level=error msg="Could not read qemu pid file"
```
I reviewed CLH, and it looks like it is already taking appropriate
action, so no changes needed.
Ideally we won't spend much time saving state to persist.json unless
there's an actual error during stop/delete/shutdown path, as the persist will
also be removed after the pod is removed. We may want to optimize this,
as currently we are doing a persist store when deleting each container
(after the sandbox is stopped, VM is killed), and when we stop the sandbox.
This'll require more rework... tracked in:
https://github.com/kata-containers/kata-containers/issues/1181Fixes: #1179
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
If the upcast from resultingRoutes to *grpc.IRoutes fails, we return
(nil, err), but previous code ensures that err is nil at that point, so we
return no error.
fixes#1206
Forward port of
0ffaeeb5d8
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the upcast from resultingInterfaces to *grpc.Interfaces fails, we
return (nil, err), but previous code ensures that err is nil at that
point, so we return no error.
Forward port of
b86e904c2dfixes#1206
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
oci.proto imports "google/protobuf/wrappers.proto", but doesn't appear to
use it, which causes a warning from protoc when we compile it. Remove the
import to fix the warning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We should always cleanup the vm directory when doing `stopSandbox`,
while we are skipping the cleanup process on some error code paths when
using cloud-hypervisor driver.
Fixes: #1098
Signed-off-by: Bo Chen <chen.bo@intel.com>
In PR 1079, CleanupContainer's parameter of sandboxID is changed to VCSandbox, but at cleanup,
there is no VCSandbox is constructed, we should load it from disk by loadSandboxConfig() in
persist.go. This commit reverts parts of #1079Fixes: #1119
Signed-off-by: bin liu <bin@hyper.sh>
Remove global sandbox variable, and save *Sandbox to hypervisor struct.
For some needs, hypervisor may need to use methods from Sandbox.
Signed-off-by: bin liu <bin@hyper.sh>
Update cloud-hypervisor to commit 2706319.
Fixes a limitation in OpenAPITools/openapi-generator tool,
it's impossible to send go zero types, like false and 0 to
cloud-hypervisor because `omitempty` is added if a field is not
required.
See cloud-hypervisor/cloud-hypervisor#1961 for more information
Signed-off-by: Julio Montes <julio.montes@intel.com>
Guest consumes 120Mb more of memory when DAX is enabled and the default
FS cache size (8G) is used. Disable dax when it is not required
reducing guest's memory footprint.
Without this patch:
```
7fdea4000000-7fdee4000000 rw-s 18850589 /memfd:ch_ram (deleted)
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 187876 kB
```
With this patch:
```
7fa970000000-7fa9b0000000 rw-s 612001 /memfd:ch_ram (deleted)
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 57308 kB
Pss: 56722 kB
```
fixes#1100
Signed-off-by: Julio Montes <julio.montes@intel.com>
Use `if err := q.qmpSetup(); err != nil` to reduce code and make it easy
to read. And remove checking err if last function call also return an error,
return the function call directly.
Fixes: #1081
Signed-off-by: bin liu <bin@hyper.sh>
The release v0.11.0 of cloud-hypervisor features the following changes:
1) Improved Linux Boot Time, 2) `SIGTERM/SIGINT` Interrupt Signal,
Handling 3) Default Log Level Changed, 4) `io_uring` support by default
for `virtio-block` (on host kernel version 5.8+), 5) Windows Guest
Support, 6) New `--balloon` Parameter Added, 7) Experimental
`virtio-watchdog` Support, 8) Bug fixes.
Fixes: #1089
Signed-off-by: Bo Chen <chen.bo@intel.com>
Make `asset.go` the arbiter of asset annotations by removing all asset
annotations lists from other parts of the codebase.
This makes the code simpler, easier to maintain, and more robust.
Specifically, the previous behaviour was inconsistent as the following
ways:
- `createAssets()` in `sandbox.go` was not handling the following asset
annotations:
- firmware:
- `io.katacontainers.config.hypervisor.firmware`
- `io.katacontainers.config.hypervisor.firmware_hash`
- hypervisor:
- `io.katacontainers.config.hypervisor.path`
- `io.katacontainers.config.hypervisor.hypervisor_hash`
- hypervisor control binary:
- `io.katacontainers.config.hypervisor.ctlpath`
- `io.katacontainers.config.hypervisor.hypervisorctl_hash`
- jailer:
- `io.katacontainers.config.hypervisor.jailer_path`
- `io.katacontainers.config.hypervisor.jailer_hash`
- `addAssetAnnotations()` in the `oci` package was not handling the
following asset annotations:
- hypervisor:
- `io.katacontainers.config.hypervisor.path`
- `io.katacontainers.config.hypervisor.hypervisor_hash`
- hypervisor control binary:
- `io.katacontainers.config.hypervisor.ctlpath`
- `io.katacontainers.config.hypervisor.hypervisorctl_hash`
- jailer:
- `io.katacontainers.config.hypervisor.jailer_path`
- `io.katacontainers.config.hypervisor.jailer_hash`
This change fixes the bug where specifying a custom hypervisor path via an
asset annotation was having no effect.
Fixes: #1085.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Add missing annotation definitions for a hypervisor control binary:
- `io.katacontainers.config.hypervisor.ctlpath`
- `io.katacontainers.config.hypervisor.hypervisorctl_hash`
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
When guest panic, dump guest kernel memory to host filesystem.
And also includes:
- hypervisor config
- hypervisor version
- and state of sandbox
Fixes: #1012
Signed-off-by: bin liu <bin@hyper.sh>
bindmount remount events are not propagated through mount subtrees,
so we have to remount the shared dir mountpoint directly.
E.g.,
```
mkdir -p source dest foo source/foo
mount -o bind --make-shared source dest
mount -o bind foo source/foo
echo bind mount rw
mount | grep foo
echo remount ro
mount -o remount,bind,ro source/foo
mount | grep foo
```
would result in:
```
bind mount rw
/dev/xvda1 on /home/ubuntu/source/foo type ext4 (rw,relatime,discard,data=ordered)
/dev/xvda1 on /home/ubuntu/dest/foo type ext4 (rw,relatime,discard,data=ordered)
remount ro
/dev/xvda1 on /home/ubuntu/source/foo type ext4 (ro,relatime,discard,data=ordered)
/dev/xvda1 on /home/ubuntu/dest/foo type ext4 (rw,relatime,discard,data=ordered)
```
The reason is that bind mount creats new mount structs and attaches them to different mount subtrees.
However, MS_REMOUNT only looks for existing mount structs to modify and does not try to propagate the
change to mount structs in other subtrees.
Fixes: #1061
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
In cloud-hypervisor, it provides a single unified way of unplugging
devices, e.g. the `/vm.RemoveDevice` HTTP API. Taking advantage of this
API, we can simplify our implementation of `hotplugRemoveDevice` in
`clh.go`, where we can consolidate similar code paths for different
device unplug (e.g. no need to implement `hotplugRemoveBlockDevice` and
`hotplugRemoveVfioDevice` separately). We will only need to retrieve the
right `deviceID` based on the type of devices, and use the single
unified HTTP API for device unplug.
Fixes: #1076
Signed-off-by: Bo Chen <chen.bo@intel.com>
Cloud-hypervisor supports DAX, let's enable it to reduce its memory
footprint.
Before this patch:
**19.96M**
```
20448kB -- [/usr/share/kata-containers/kata.img]
```
With this patch:
**10.83M**
```
11100kB -- [/usr/share/kata-containers/kata.img]
```
fixes#1056
Signed-off-by: Julio Montes <julio.montes@intel.com>
Allow API consumers to change the maximum number of ports in the
virtio-serial devices, setting a lower number of ports can improve the
boot time and reduce the attack surface.
Before this patch on arm64:
[ 0.028664] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.055031] printk: console [hvc0] enabled
After this patch on arm64:
[ 0.028484] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[ 0.031370] printk: console [hvc0] enabled
Fixes: #2676
Signed-off-by: Jia He <justin.he@arm.com>
Add the verification of some basic protections, namely that:
- EnableAnnotations is honored
- Dangerous paths cannot be modified if no match
- Errors are returned when expected
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Warning from gocyclo during make check:
virtcontainers/pkg/oci/utils.go:404:1: cyclomatic complexity 37 of func `addHypervisorConfigOverrides` is high (> 30) (gocyclo)
func addHypervisorConfigOverrides(ocispec specs.Spec, config *vc.SandboxConfig, runtime RuntimeConfig) error {
^
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
There are a few interesting corner cases to consider for this
function.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
James O.D Hunt: "But also, regexpContains() and
checkPathIsInGlobList() seem like good candidates for some unit
tests. The "look" obvious, but a few boundary condition tests would be
useful I think (filenames with spaces, backslashes, special
characters, and relative & absolute paths are also an interesting
thought here)."
There aren't that many boundary conditions on a list with regexps,
if you assume the regexp match function itself works. However, the
tests is useful in documenting expectations.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Add a field "enable_annotations" to the runtime configuration that can
be used to whitelist annotations using a list of regular expressions,
which are used to match any part of the base annotation name, i.e. the
part after "io.katacontainers.config.hypervisor."
For example, the following configuraiton will match "virtio_fs_daemon",
"initrd" and "jailer_path", but not "path" nor "firmware":
enable_annotations = [ "virtio.*", "initrd", "_path" ]
The default is an empty list of enabled annotations, which disables
annotations entirely.
If an anontation is rejected, the message is something like:
annotation io.katacontainers.config.hypervisor.virtio_fs_daemon is not enabled
Fixes: #901
Suggested-by: Peng Tao <tao.peng@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
When filtering annotations that correspond to paths,
e.g. hypervisor.path, it is better to use a glob syntax than a regexp
syntax, as it is more usual for paths, and prevents classes of matches
that are undesirable in our case, such as matching .. against .*
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
A comment talking about runtime related annotations describes them as
being related to the agent. A similar comment for the agent
annotations is missing.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This one could theoretically be used to overwrite data on the host.
It seems somewhat less risky than the earlier ones for a number
of reasons, but worth protecting a little anyway.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>