Commit Graph

21 Commits

Author SHA1 Message Date
Manabu Sugimoto
c617bbe70d runtime: Pass SELinux policy for containers to the agent
Pass SELinux policy for containers to the agent if `disable_guest_selinux`
is set to `false` in the runtime configuration. The `container_t` type
is applied to the container process inside the guest by default.
Users can also set a custom SELinux policy to the container process using
`guest_selinux_label` in the runtime configuration. This will be an
alternative configuration of Kubernetes' security context for SELinux
because users cannot specify the policy in Kata through Kubernetes's security
context. To apply SELinux policy to the container, the guest rootfs must
be CentOS that is created and built with `SELINUX=yes`.

Fixes: #4812

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-11-29 19:07:56 +09:00
Liang Zhou
ef925d40ce runtime: enable sandbox feature on qemu
Enable "-sandbox on" in qemu can introduce another protect layer
on the host, to make the secure container more secure.

The default option is disable because this feature may introduce some
performance cost, even though user can enable
/proc/sys/net/core/bpf_jit_enable to reduce the impact.

Fixes: #2266

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2022-06-17 15:30:46 -07:00
Samuel Ortiz
5e119e90e8 virtcontainers: Rename the Network structure fields and methods
We are converting the Network structure into an interface, so that
different host OSes can have different networking implementations for
Kata.
One step into that direction is to rename all the Network structure
fields and methods to something that is less Linux networking namespace
specific. This will make the Network interface naming consistent.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Julio Montes
49223e67af runtime: remove enable_swap option
`enable_swap` option was added long time ago to add
`-realtime mlock=off` to the QEMU's command line.
Kata now supports QEMU 6, `-realtime` option has been deprecated and
`mlock=on` is causing unexpected behaviors in kata.
This patch removes support for `enable_swap`, `-realtime` and `mlock=`
since they are causing bugs in kata.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2022-01-18 11:12:29 -06:00
Julio Montes
80ab91ac2f runtime: virtcontainers/persist: fix govet fieldalignment
Fix structures alignment

Signed-off-by: Julio Montes <julio.montes@intel.com>
2021-07-20 11:59:15 -05:00
Julio Montes
7834f4127f virtcontainers: change memory_offset to uint64
`memory_offset` is used to increase the maximum amount of memory
supported in a VM, this offset is equal to the NVDIMM/PMEM device that
is hot added, in real use case workloads such devices are bigger than
4G, which is the current limit (uint32).

fixes #2006

Signed-off-by: Julio Montes <julio.montes@intel.com>
2021-06-16 07:16:49 -05:00
Eric Ernst
7f1030d303 sandbox-bindmount: persist mount information
Without this, if the shim dies, we will not have a reliable way to
identify what mounts should be cleaned up if `containerd-shim-kata-v2
cleanup` is called for the sandbox.

Before this, if you `ctr run` with a sandbox bindmount defined and SIGKILL the
containerd-shim-kata-v2, you'll notice the sandbox bindmount left on
host.

With this change, the shim is able to get the sandbox bindmount
information from disk and do the appropriate cleanup.

Fixes #1896

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-05-21 12:54:35 -07:00
Christophe de Dinechin
dcb9f40394 config: Protect annotation for entropy_source
It would be undesirable to be given an annotation like "/dev/null".
Filter out bad annotation values.

Fixes: #1043

Suggested-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2021-04-22 15:26:40 +02:00
bin
532ff7c909 runtime: update virtcontainers API documentation
Virtcontainers API documentation is outdated, update documentation from the latest
source.

Fixes: #1455

Signed-off-by: bin <bin@hyper.sh>
2021-03-29 11:50:53 +08:00
Christophe de Dinechin
7c6aede5d4 config: Whitelist hypervisor annotations by name
Add a field "enable_annotations" to the runtime configuration that can
be used to whitelist annotations using a list of regular expressions,
which are used to match any part of the base annotation name, i.e. the
part after "io.katacontainers.config.hypervisor."

For example, the following configuraiton will match "virtio_fs_daemon",
"initrd" and "jailer_path", but not "path" nor "firmware":

  enable_annotations = [ "virtio.*", "initrd", "_path" ]

The default is an empty list of enabled annotations, which disables
annotations entirely.

If an anontation is rejected, the message is something like:

  annotation io.katacontainers.config.hypervisor.virtio_fs_daemon is not enabled

Fixes: #901

Suggested-by: Peng Tao <tao.peng@linux.alibaba.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
4e89b885d2 config: Protect file_mem_backend against annotation attacks
This one could theoretically be used to overwrite data on the host.
It seems somewhat less risky than the earlier ones for a number
of reasons, but worth protecting a little anyway.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
aae9656d8b config: Protect vhost_user_store_path against annotation attacks
This path could be used to overwrite data on the host.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
b21a829c61 config: Protect ctlpath from annotation attack
This also adds annotation for ctlpath which were not present
before. It's better to implement the code consistenly right now to make
sure that we don't end up with a leaky implementation tacked on later.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
27b6620b23 config: Protect jailer_path annotation
The jailer_path annotation can be used to execute arbitrary code on
the host. Add a jailer_path_list configuration entry providing a list
of regular expressions that can be used to filter annotations that
represent valid file names.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Christophe de Dinechin
bf13ff0a3a config: Protect virtio_fs_daemon annotation
Sending the virtio_fs_daemon annotation can be used to execute
arbitrary code on the host. In order to prevent this, restrict the
values of the annotation to a list provided by the configuration
file.

Fixes: #901

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-10-14 16:10:12 +02:00
Julio Montes
6df165c19d runtime: add support for SGX
Support the `sgx.intel.com/epc` annotation that is defined by the intel
k8s plugin. This annotation enables SGX. Hardware-based isolation and
memory encryption.

For example, use `sgx.intel.com/epc = "64Mi"` to create a container
with 1 EPC section with pre-allocated memory.

At the time of writing this patch, SGX patches have not landed on the
linux kernel project.
The following github kernel fork contains all the SGX patches for the
host and guest: https://github.com/intel/kvm-sgx

fixes #483

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-10-01 08:24:29 -05:00
Penny Zheng
1099a28830 kata 2.0: delete use_vsock option and proxy abstraction
With kata containers moving to 2.0, (hybrid-)vsock will be the only
way to directly communicate between host and agent.
And kata-proxy as additional component to handle the multiplexing on
serial port is also no longer needed.
Cleaning up related unit tests, and also add another mock socket type
`MockHybridVSock` to deal with ttrpc-based hybrid-vsock mock server.

Fixes: #389

Signed-off-by: Penny Zheng penny.zheng@arm.com
2020-07-16 04:20:02 +00:00
bin liu
bd8f03a5ef runtime: remove agent abstraction
This PR will delete agent abstraction and use Kata agent as the only one agent.

Fixes: #377

Signed-off-by: bin liu <bin@hyper.sh>
2020-07-08 10:07:40 +08:00
Jia He
fa9d619e8a qemu: add cpu_features option
[ port from runtime commit 0100af18a2afdd6dfcc95129ec6237ba4915b3e5 ]

To control whether guest can enable/disable some CPU features. E.g. pmu=off,
vmx=off. As discussed in the thread [1], the best approach is to let users
specify them. How about adding a new option in the configuration file.

Currently this patch only supports this option in qemu,no other vmm.

[1] https://github.com/kata-containers/runtime/pull/2559#issuecomment-603998256

Signed-off-by: Jia He <justin.he@arm.com>
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-06-29 20:16:11 -07:00
Penny Zheng
c2645f5d5a rate-limiter: add rate limiter configuration/annotation on VM level
Add configuration/annotation about network I/O throttling on VM level.
rx_rate_limiter_max_rate is dedicated to control network inbound
bandwidth per pod.
tx_rate_limiter_max_rate is dedicated to control network outbound
bandwidth per pod.

Fixes: #250

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
2020-06-24 06:14:04 +00:00
Peng Tao
a02a8bda66 runtime: move all code to src/runtime
To prepare for merging into kata-containers repository.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2020-04-27 19:39:25 -07:00