According to runtime-spec:
The poststop hooks MUST be invoked by the runtime. If any
poststop hook fails, the runtime MUST log a warning, but
the remaining hooks and lifecycle continue as if the hook
had succeeded.
Fixes: #1252
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Update cloud-hypervisor to commit 2706319.
Fixes a limitation in OpenAPITools/openapi-generator tool,
it's impossible to send go zero types, like false and 0 to
cloud-hypervisor because `omitempty` is added if a field is not
required.
See cloud-hypervisor/cloud-hypervisor#1961 for more information
Signed-off-by: Julio Montes <julio.montes@intel.com>
Guest consumes 120Mb more of memory when DAX is enabled and the default
FS cache size (8G) is used. Disable dax when it is not required
reducing guest's memory footprint.
Without this patch:
```
7fdea4000000-7fdee4000000 rw-s 18850589 /memfd:ch_ram (deleted)
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 187876 kB
```
With this patch:
```
7fa970000000-7fa9b0000000 rw-s 612001 /memfd:ch_ram (deleted)
Size: 1048576 kB
KernelPageSize: 4 kB
MMUPageSize: 4 kB
Rss: 57308 kB
Pss: 56722 kB
```
fixes#1100
Signed-off-by: Julio Montes <julio.montes@intel.com>
The release v0.11.0 of cloud-hypervisor features the following changes:
1) Improved Linux Boot Time, 2) `SIGTERM/SIGINT` Interrupt Signal,
Handling 3) Default Log Level Changed, 4) `io_uring` support by default
for `virtio-block` (on host kernel version 5.8+), 5) Windows Guest
Support, 6) New `--balloon` Parameter Added, 7) Experimental
`virtio-watchdog` Support, 8) Bug fixes.
Fixes: #1089
Signed-off-by: Bo Chen <chen.bo@intel.com>
RemoveContainerRequest results in calling to deleteContainer, according
to spec calling to RemoveContainer is idempotent and "must not return
an error if the container has already been removed", hence, don't
return error if the error reports that the container is not found.
Fixes: #836
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
On pod delete, we were looking to read files that we had just deleted. In particular,
stopSandbox for QEMU was called (we cleanup up vmpath), and then QEMU's
save function was called, which immediately checks for the PID file.
Let's only update the persist store for QEMU if QEMU is actually
running. This'll avoid Error messages being displayed when we are
stopping and deleting a sandbox:
```
level=error msg="Could not read qemu pid file"
```
I reviewed CLH, and it looks like it is already taking appropriate
action, so no changes needed.
Ideally we won't spend much time saving state to persist.json unless
there's an actual error during stop/delete/shutdown path, as the persist will
also be removed after the pod is removed. We may want to optimize this,
as currently we are doing a persist store when deleting each container
(after the sandbox is stopped, VM is killed), and when we stop the sandbox.
This'll require more rework... tracked in:
https://github.com/kata-containers/kata-containers/issues/1181Fixes: #1179
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
If the upcast from resultingRoutes to *grpc.IRoutes fails, we return
(nil, err), but previous code ensures that err is nil at that point, so we
return no error.
fixes#1206
Forward port of
0ffaeeb5d8
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the upcast from resultingInterfaces to *grpc.Interfaces fails, we
return (nil, err), but previous code ensures that err is nil at that
point, so we return no error.
Forward port of
b86e904c2dfixes#1206
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On runtime/Makefile the value of DESTDIR is set to "/", unless one
pass that variable as an argument to `make`. This change will
allow its overwrite if DESTDIR is exported in the environment as
well.
Fixes#1182
Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
For experimental-virtiofs, we use it to test virtiofs with DAX. Let's
rename its virtiofsd to virtiofsd-dax.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We've been shipping it for a long time. It's time to make it default
replacing the old obsolet 9pfs.
Fixes: #935
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Dave Gilbert brough up that passing --thread-pool-size=1 to virtiofsd
may result in a performance improvement especially when using
`cache=none`. While our current default is `cache=auto`, Dave mentioned
that he seems no harm in having it set and he also mentiond that it may
use a lot less stack space on aarch/arm.
Fixes: #943
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
To support a few common configurations for Kata, including:
- `io.containerd.kata.v2`
- `io.containerd.kata-qemu.v2`
- `io.containerd.kata-clh.v2`
`kata-monintor` changes to use regexp instead of direct string comparison.
Fixes: #957
Signed-off-by: bin liu <bin@hyper.sh>
Add the verification of some basic protections, namely that:
- EnableAnnotations is honored
- Dangerous paths cannot be modified if no match
- Errors are returned when expected
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Warning from gocyclo during make check:
virtcontainers/pkg/oci/utils.go:404:1: cyclomatic complexity 37 of func `addHypervisorConfigOverrides` is high (> 30) (gocyclo)
func addHypervisorConfigOverrides(ocispec specs.Spec, config *vc.SandboxConfig, runtime RuntimeConfig) error {
^
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
There are a few interesting corner cases to consider for this
function.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
James O.D Hunt: "But also, regexpContains() and
checkPathIsInGlobList() seem like good candidates for some unit
tests. The "look" obvious, but a few boundary condition tests would be
useful I think (filenames with spaces, backslashes, special
characters, and relative & absolute paths are also an interesting
thought here)."
There aren't that many boundary conditions on a list with regexps,
if you assume the regexp match function itself works. However, the
tests is useful in documenting expectations.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This was discovered while checking a massive change in variables.
The root cause for the error is a very long list of manual
replacements, that is best replaced with a $(foreach).
All individual variables in the output configuration files were
checked against the old build using diff.
This is a forward port of a makefile fix included in
PR https://github.com/kata-containers/runtime/issues/3004
for issue https://github.com/kata-containers/runtime/issues/2943Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The entries used to be things like PATH_LIST, which are too generic.
Replace them with more precise name with a distinguishing keyword,
namely VALID. For example valid_hypervisor_paths.
Fixes: #901
Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
When there is a default value from the code (usually empty) that
differs from a possible suggested value from the distro, then the
wording "default: empty" is confusing.
Fixes: #901
Suggested-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Add a field "enable_annotations" to the runtime configuration that can
be used to whitelist annotations using a list of regular expressions,
which are used to match any part of the base annotation name, i.e. the
part after "io.katacontainers.config.hypervisor."
For example, the following configuraiton will match "virtio_fs_daemon",
"initrd" and "jailer_path", but not "path" nor "firmware":
enable_annotations = [ "virtio.*", "initrd", "_path" ]
The default is an empty list of enabled annotations, which disables
annotations entirely.
If an anontation is rejected, the message is something like:
annotation io.katacontainers.config.hypervisor.virtio_fs_daemon is not enabled
Fixes: #901
Suggested-by: Peng Tao <tao.peng@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
When filtering annotations that correspond to paths,
e.g. hypervisor.path, it is better to use a glob syntax than a regexp
syntax, as it is more usual for paths, and prevents classes of matches
that are undesirable in our case, such as matching .. against .*
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
A comment talking about runtime related annotations describes them as
being related to the agent. A similar comment for the agent
annotations is missing.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Add variables to override defaults at build time for the various lists
used to control path annotations.
Fixes: #901
Suggested-by: Fabiano Fidencio <fidencio@redhat.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This one could theoretically be used to overwrite data on the host.
It seems somewhat less risky than the earlier ones for a number
of reasons, but worth protecting a little anyway.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Add the following text explaining the risk of using regular
expressions in path lists:
Each member of the list can be a regular expression, but prefer names.
Otherwise, please read and understand the following carefully.
SECURITY WARNING: If you use regular expressions, be mindful that
an attacker could craft an annotation that uses .. to escape the paths
you gave. For example, if your regexp is /bin/qemu.* then if there is
a directory named /bin/qemu.d/, then an attacker can pass an annotation
containing /bin/qemu.d/../put-any-binary-name-here and attack your host.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
This also adds annotation for ctlpath which were not present
before. It's better to implement the code consistenly right now to make
sure that we don't end up with a leaky implementation tacked on later.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The jailer_path annotation can be used to execute arbitrary code on
the host. Add a jailer_path_list configuration entry providing a list
of regular expressions that can be used to filter annotations that
represent valid file names.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The path_list configuration gives a series of regular expressions that
limit which values are acceptable through annotations in order to
avoid kata launching arbitrary binaries on the host when receiving an
annotation.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
The annotation is provided, so it should be respected.
Furthermore, it is important to implement it with the appropriate
protetions similar to what was done for virtiofsd.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Sending the virtio_fs_daemon annotation can be used to execute
arbitrary code on the host. In order to prevent this, restrict the
values of the annotation to a list provided by the configuration
file.
Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Paths mentioned in the hypervisor configuration can be overriden
using annotations, which is potentially dangerous. For each path,
add a 'List' variant that specifies the list of acceptable values
from annotations.
Bug: https://bugs.launchpad.net/katacontainers.io/+bug/1878234Fixes: #901
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Kata doesn't map any numa topologies in the guest. Let's make sure we
clear the Cpuset fields before passing container updates to the
guest.
Note, in the future we may want to have a vCPU to guest CPU mapping and
still include the cpuset.Cpus. Until we have this support, clear this as
well.
Fixes: #932
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.
Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>