Commit Graph

308 Commits

Author SHA1 Message Date
James O. D. Hunt
e5d4259ab4 runtime: Simplify make variables for clh
Simplify definition of the `CLHVALIDHYPERVISORPATHS` build variable to
use the already defined `CLHPATH`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-10-20 09:14:09 +01:00
Julio Montes
f162e7e960 Merge pull request #948 from justin-he/max_ports
virtcontainers: Append max_ports to virtio-serial device
2020-10-19 08:55:06 -05:00
Xu Wang
80cfd01130 Merge pull request #936 from bergwolf/default-virtiofs
runtime: set virtio-fs as default fs sharing method
2020-10-17 17:28:30 +08:00
Peng Tao
5f47f224ff Merge pull request #945 from fidencio/wip/virtiofs-performance-improvements-backport-and-default-settings
VirtioFS: backports & default settings to improve performance
2020-10-17 11:13:22 +08:00
Peng Tao
6ed669a17b packaging: install virtiofsd for normal qemu build as well
For experimental-virtiofs, we use it to test virtiofs with DAX. Let's
rename its virtiofsd to virtiofsd-dax.

Depends-on: github.com/kata-containers/tests#2951
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-17 10:58:27 +08:00
Jia He
da79b4be67 virtcontainers: Append max_ports to virtio-serial device
Allow API consumers to change the maximum number of ports in the
virtio-serial devices, setting a lower number of ports can improve the
boot time and reduce the attack surface.

Before this patch on arm64:
[    0.028664] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.055031] printk: console [hvc0] enabled

After this patch on arm64:
[    0.028484] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.031370] printk: console [hvc0] enabled

Fixes: #2676
Signed-off-by: Jia He <justin.he@arm.com>
2020-10-16 23:40:54 +08:00
Peng Tao
bcf4853062 runtime: enable virtiofs by default
We've been shipping it for a long time. It's time to make it default
replacing the old obsolet 9pfs.

Fixes: #935
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-10-16 15:58:35 +08:00
Peng Tao
0d5d69e8cd Merge pull request #902 from c3d/bug-v2/launchpad-1878234-access
Validate runtime annotations
2020-10-16 15:47:45 +08:00
bin liu
1b7ed32836 kata-monitor: use regexp to check if runtime is kata containers
To support a few common configurations for Kata, including:

- `io.containerd.kata.v2`
- `io.containerd.kata-qemu.v2`
- `io.containerd.kata-clh.v2`

`kata-monintor` changes to use regexp instead of direct string comparison.

Fixes: #957

Signed-off-by: bin liu <bin@hyper.sh>
2020-10-15 18:42:44 +08:00
Christophe de Dinechin
c5771be2de annotations: Correct unit tests to validate new protections
Add the verification of some basic protections, namely that:
- EnableAnnotations is honored
- Dangerous paths cannot be modified if no match
- Errors are returned when expected

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
398d79184c annotations: Split addHypervisorOverrides to reduce complexity
Warning from gocyclo during make check:
 virtcontainers/pkg/oci/utils.go:404:1: cyclomatic complexity 37 of func `addHypervisorConfigOverrides` is high (> 30) (gocyclo)
 func addHypervisorConfigOverrides(ocispec specs.Spec, config *vc.SandboxConfig, runtime RuntimeConfig) error {
^

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b2b3bc7ad8 annotations: Add unit test for checkPathIsInGlobs
There are a few interesting corner cases to consider for this
function.

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
6f52179ce4 annotations: Add unit test for regexpContains function
James O.D Hunt: "But also, regexpContains() and
checkPathIsInGlobList() seem like good candidates for some unit
tests. The "look" obvious, but a few boundary condition tests would be
useful I think (filenames with spaces, backslashes, special
characters, and relative & absolute paths are also an interesting
thought here)."

There aren't that many boundary conditions on a list with regexps,
if you assume the regexp match function itself works. However, the
tests is useful in documenting expectations.

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
966bd57344 makefile: Add missing generated vars to USER_VARS
This was discovered while checking a massive change in variables.
The root cause for the error is a very long list of manual
replacements, that is best replaced with a $(foreach).

All individual variables in the output configuration files were
checked against the old build using diff.

This is a forward port of a makefile fix included in
PR https://github.com/kata-containers/runtime/issues/3004
for issue https://github.com/kata-containers/runtime/issues/2943

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
be6ee2550d makefile: Improve names of config entries for annotation checks
The entries used to be things like PATH_LIST, which are too generic.
Replace them with more precise name with a distinguishing keyword,
namely VALID. For example valid_hypervisor_paths.

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b119427405 annotations: Give better names to local variabes in search functions
Use more meaningful variable names for clarity.

Fixes: #901

Suggested-by: James O.D. Hunt james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b5db114aad annotations: Rename checkPathIsInGlobList with checkPathIsInGlobs
The name is shorter and more specific

Fixes: #901

Suggested-by: James O.D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
d65a7d1083 config: Add better comments in the template files
When there is a default value from the code (usually empty) that
differs from a possible suggested value from the distro, then the
wording "default: empty" is confusing.

Fixes: #901

Suggested-by: Julio Montes <julio.montes@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
7c6aede5d4 config: Whitelist hypervisor annotations by name
Add a field "enable_annotations" to the runtime configuration that can
be used to whitelist annotations using a list of regular expressions,
which are used to match any part of the base annotation name, i.e. the
part after "io.katacontainers.config.hypervisor."

For example, the following configuraiton will match "virtio_fs_daemon",
"initrd" and "jailer_path", but not "path" nor "firmware":

  enable_annotations = [ "virtio.*", "initrd", "_path" ]

The default is an empty list of enabled annotations, which disables
annotations entirely.

If an anontation is rejected, the message is something like:

  annotation io.katacontainers.config.hypervisor.virtio_fs_daemon is not enabled

Fixes: #901

Suggested-by: Peng Tao <tao.peng@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
f047fced0b config: Use glob instead of regexp to match paths in annotations
When filtering annotations that correspond to paths,
e.g. hypervisor.path, it is better to use a glob syntax than a regexp
syntax, as it is more usual for paths, and prevents classes of matches
that are undesirable in our case, such as matching .. against .*

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
11b9c90cd8 annotations: Fix typo in comment
A comment talking about runtime related annotations describes them as
being related to the agent. A similar comment for the agent
annotations is missing.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
c16cdcb2a5 config: Add makefile variables for path lists
Add variables to override defaults at build time for the various lists
used to control path annotations.

Fixes: #901

Suggested-by: Fabiano Fidencio <fidencio@redhat.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
4e89b885d2 config: Protect file_mem_backend against annotation attacks
This one could theoretically be used to overwrite data on the host.
It seems somewhat less risky than the earlier ones for a number
of reasons, but worth protecting a little anyway.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
aae9656d8b config: Protect vhost_user_store_path against annotation attacks
This path could be used to overwrite data on the host.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
5588165399 config: Add security warning on configuration examples
Add the following text explaining the risk of using regular
expressions in path lists:

Each member of the list can be a regular expression, but prefer names.
Otherwise, please read and understand the following carefully.
SECURITY WARNING: If you use regular expressions, be mindful that
an attacker could craft an annotation that uses .. to escape the paths
you gave. For example, if your regexp is /bin/qemu.* then if there is
a directory named /bin/qemu.d/, then an attacker can pass an annotation
containing /bin/qemu.d/../put-any-binary-name-here and attack your host.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b21a829c61 config: Protect ctlpath from annotation attack
This also adds annotation for ctlpath which were not present
before. It's better to implement the code consistenly right now to make
sure that we don't end up with a leaky implementation tacked on later.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
27b6620b23 config: Protect jailer_path annotation
The jailer_path annotation can be used to execute arbitrary code on
the host. Add a jailer_path_list configuration entry providing a list
of regular expressions that can be used to filter annotations that
represent valid file names.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
076690179d config: Add examples for path_list configuration
The path_list configuration gives a series of regular expressions that
limit which values are acceptable through annotations in order to
avoid kata launching arbitrary binaries on the host when receiving an
annotation.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
2d431c61c6 annotations: Simplify negative logic
Replace strange negative logic  (!ok -> continue) with positive
logic (ok -> do it)

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
2ca9ca892d config: Add hypervisor path override through annotations
The annotation is provided, so it should be respected.
Furthermore, it is important to implement it with the appropriate
protetions similar to what was done for virtiofsd.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
2e093dfd8b config: Fix typo in function name
There was an extra 'p' in addHypervisorVirtioFsOverrides.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
bf13ff0a3a config: Protect virtio_fs_daemon annotation
Sending the virtio_fs_daemon annotation can be used to execute
arbitrary code on the host. In order to prevent this, restrict the
values of the annotation to a list provided by the configuration
file.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
8c75de1966 config: Add 'List' alternates for hypervisor configuration paths
Paths mentioned in the hypervisor configuration can be overriden
using annotations, which is potentially dangerous. For each path,
add a 'List' variant that specifies the list of acceptable values
from annotations.

Bug: https://bugs.launchpad.net/katacontainers.io/+bug/1878234

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Eric Ernst
d8a8fe47fb cpuset: don't set cpuset.mems in the guest
Kata doesn't map any numa topologies in the guest. Let's make sure we
clear the Cpuset fields before passing container updates to the
guest.

Note, in the future we may want to have a vCPU to guest CPU mapping and
still include the cpuset.Cpus. Until we have this support, clear this as
well.

Fixes: #932

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Eric Ernst
88cd712876 sandbox: consider cpusets if quota is not enforced
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Eric Ernst
77a463e57a cpuset: support setting mems for sandbox
CPUSet cgroup allows for pinning the memory associated with a cpuset to
a given numa node. Similar to cpuset.cpus, we should take cpuset.mems
into account for the sandbox-cgroup that Kata creates.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Eric Ernst
2d690536b8 cpuset: add cpuset pkg
Pulled from 1.18.4 Kubernetes, adding the cpuset pkg for managing
CPUSet calculations on the host. Go mod'ing the original code from
k8s.io/kubernetes was very painful, and this is very static, so let's
just pull in what we need.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-13 15:54:03 -07:00
Fabiano Fidêncio
1a9515a998 runtime: Pass --thread-pool-size=1 to virtiofsd
Dave Gilbert brough up that passing --thread-pool-size=1 to virtiofsd
may result in a performance improvement especially when using
`cache=none`. While our current default is `cache=auto`, Dave mentioned
that he seems no harm in having it set and he also mentiond that it may
use a lot less stack space on aarch/arm.

Fixes: #943

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2020-10-13 22:33:08 +02:00
Eric Ernst
12cc0ee168 sandbox: don't constrain cpus, mem only cpuset, devices
Allow for constraining the cpuset as well as the devices-whitelist . Revert
sandbox constraints for cpu/memory, as they break the K8S use case. Can
re-add behind a non-default flag in the future.

The sandbox CPUSet should be updated every time a container is created,
updated, or removed.

To facilitate this without rewriting the 'non constrained cgroup'
handling, let's add to the Sandbox's cgroupsUpdate function.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-12 21:31:27 -07:00
Eric Ernst
b6cf68a985 cgroups: add ability to update CPUSet
Add function for applying a cpuset change to a cgroup

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-12 21:31:27 -07:00
Eric Ernst
b812d4f7fa virtcontainers: add method for calculating cpuset for sandbox
Calculate sandbox's CPUSet as the union of each of the container's
CPUSets.

Signed-off-by: Eric Ernst <eric.g.ernst@gmail.com>
2020-10-12 21:31:27 -07:00
Bin Liu
43da14e7b3 Merge pull request #752 from YchauWang/clear-moke-code01
runtime: Clear the VCMock 1.x API Methods from 2.0
2020-10-09 17:41:21 +08:00
Jose Carlos Venegas Munoz
c4472481bc virtiofs: Disable DAX
virtiofs DAX support is not stable today, there are
a few corner cases to make it default.

Fixes: #862
Fixes: #875

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2020-10-08 10:59:10 -05:00
Bo Chen
c33ee54a21 clh: Support VFIO device unplug
This patch adds the support of VFIO device unplug when using
cloud-hypervisor.

Fixes: #860

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-10-05 12:20:13 -07:00
Bo Chen
1f4dfa3166 clh: Remove unnecessary VmmPing
We can rely on the error handling of the actual HTTP API calls to catch
errors, and don't need to call VmmPing explicitly in advance.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-10-05 12:17:45 -07:00
Bo Chen
cc80ae0afb versions: cloud-hypervisor: Bump to version 6d30fe05
The cloud-hypervisor commit `6d30fe05` introduced a fix on its API for
VFIO device hotplug (`VmAddDevice`), which is required for supporting
VFIO unplug through openAPI calls in kata.

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-10-05 12:13:36 -07:00
Julio Montes
ea1cb37b59 versions: cloud-hypervisor: bump version
Use commit c54452c08a467a3e35d8d72f2a91d424e9718c57 as
version for cloud-hypervisor.
Bring openapi fix cloud-hypervisor/cloud-hypervisor#1760 to
support SGX.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-10-01 08:24:29 -05:00
Julio Montes
0ebffdf29f runtime: cloud-hypervisor: tag openapi-generator-cli container
Tag openapi-generator-cli container to v4.3.1 that is the latest
stable, this way we can have reproducible builds and the same
generated code in all the systems

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-10-01 08:24:29 -05:00
Julio Montes
7d63823141 runtime/vendor: add k8s.io/apimachinery/pkg/api/resource
k8s.io/apimachinery/pkg/api/resource is a memory quantities parser,
we use it to parse the SGX EPC size defined by the `sgx.intel.com/epc`
annotation

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-10-01 08:24:29 -05:00
Julio Montes
6df165c19d runtime: add support for SGX
Support the `sgx.intel.com/epc` annotation that is defined by the intel
k8s plugin. This annotation enables SGX. Hardware-based isolation and
memory encryption.

For example, use `sgx.intel.com/epc = "64Mi"` to create a container
with 1 EPC section with pre-allocated memory.

At the time of writing this patch, SGX patches have not landed on the
linux kernel project.
The following github kernel fork contains all the SGX patches for the
host and guest: https://github.com/intel/kvm-sgx

fixes #483

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-10-01 08:24:29 -05:00