Commit Graph

414 Commits

Author SHA1 Message Date
Guixiong Wei
202049f35e feat(runtime-rs): introduce huge page type to select VM RAM's backend
This commit allows us to specify the huge page backend when enabling huge
page. Currently, we support two backends: thp and hugetlbfs, the default
is hugetlbfs.

To ensure backward compatibility, we introduce another configuration item
"hugepage_type" to select the memory backend, which is available only when
"enable_hugepages" is true. Besides, we add an annotation
"io.katacontainers.config.hypervisor.hugepage_type" to configure huge page
type per pod.

Fixes: #6703

Signed-off-by: Guixiong Wei <weiguixiong@bytedance.com>
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
2023-09-12 11:28:27 +08:00
Zhongtao Hu
e1f54f96d0 Merge pull request #7766 from Apokleos/wrap-vsock-virtiofs
runtime-rs: bring hybrid vsock devices in manager.
2023-09-12 09:27:34 +08:00
James O. D. Hunt
c0f697fcc5 runtime: Allow kernel_params annotation
To support the removal of the `initcall_debug` and `earlyprintk=`
options from the default guest kernel cmdline, add `kernel_params` to the list
of enabled annotations to allow those kernel options (or others) to be
set using `kata-deploy` for either runtime.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 12:12:12 +01:00
James O. D. Hunt
976d10150c runtime-rs: hypervisor: Remove debug kernel options
Removed the following kernel command line options:

- `earlyprintk=ttyS0`
- `initcall_debug`

Both these options are only useful when debugging a guest kernel failure
which is not a common occurrence.

Further, the `earlyprintk=` option can have a large negative performance
impact (it can increase the VM boot time significantly).

If the user wishes to use either of these options, they can add them to the
`kernel_params=` setting in the Kata configuration file's hypervisor
stanza.

Fixes: #7886.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-09-11 09:43:39 +01:00
Zhongtao Hu
aa85e0b3ec Merge pull request #7714 from justxuewei/volumes-cleanup
runtime-rs: Fix volumes and rootfs cleanup issues
2023-09-06 10:13:55 +08:00
alex.lyn
7870b33a2d runtime-rs: bring hybridVsock devices in manager.
Currently, virtio_vsock are still outside of the device
manager. This causes some management issues,such as the
inability to unify PCI address management.

Just do some work for hybrid vsock.

Fixes: #7655

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-09-05 08:46:56 +08:00
Zixuan Tan
dffc16e5b3 runtime-rs: check peer close in log_forwarder
The log_forwarder task does not check if the peer has closed, causing a
meaningless loop during the period of “kata vm exit”, when the peer
closed, and “ShutdownContainer RPC received” that aborts the log forwarder.

This patch fixes the problem.

Fixes: #7741

Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
2023-08-25 19:00:07 +08:00
Xuewei Niu
268e846558 runtime-rs: Fix volumes and rootfs cleanup issues
There are several processes for container exit:

- Non-detach mode: `Wait` request is sent by containerd, then
  `wait_process()` will be called eventually.
- Detach mode: `Wait` request is not sent, the `wait_process()` won’t be
  called.
    - Killed by ctr: For example, a container runs `tail -f /dev/null`, and
      is killed by `sudo ctr t kill -a -s SIGTERM <CID>`. Kill request is
      sent, then `kill_process()` will be called. User executes `sudo ctr c
      rm <CID>`, `Delete` request is sent, then `delete_process()` will be
      called.
    - Exited on its own: For example, a container runs `sleep 1s`. The
      container’s state goes to `Stopped` after 1 second. User executes
      the delete command as below.

Where do we do container cleanup things?

- `wait_process()`: No, because it won’t be called in detach mode.
- `delete_process()`: No, because it depends on when the user executes the
  delete command.
- `run_io_wait()`: Yes. A container is considered exited once its IO ended.
  And this always be called once a container is launched.

Fixes: #7713

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-24 13:23:47 +08:00
Zhongtao Hu
d90f7ac689 runtime-rs: add unit test for block driver
add unit test for block driver

Fixes:#7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:45:27 +08:00
Zhongtao Hu
e44919f0da runtime-rs: add load_test_config for unit test
add load_test_config for unit test

Fixes:#7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:56 +08:00
Zhongtao Hu
7f48a69379 runtime-rs: add driver option
add driver option when handle linux devices

Fixes:#7539
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-08-16 11:32:49 +08:00
Chao Wu
b098960442 Merge pull request #7581 from justxuewei/bump-versions
deps: Bump dependent crate versions
2023-08-08 15:16:57 +08:00
Chao Wu
24bf637835 Merge pull request #7500 from pmores/fix-queue-num-in-dragonball-share-fs
fix number of queues handling in dragonball share fs device
2023-08-08 12:07:25 +08:00
Xuewei Niu
b23c5ed155 deps: Bump dependent crate versions
This pull request is mainly for updating vm-memory and vmm-sys-util.

The affacted crates include:

- vm-memory: from 0.9.0 to 0.10.0
- vmm-sys-util: from 0.10.0 to 0.11.0
- virtio-queue: from 0.6.0 to 0.7.0
- fuse-backend-rs: from 0.10.4 to 0.10.5
- linux-loader: from 0.6.0 to 0.8.0
- nydus-api: from 0.3.0 to 0.3.1
- nydus-rafs: from 0.3.1 to 0.3.2
- nydus-storage: from 0.6.3 to 0.6.4

Fixes: #0000

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-08 11:54:09 +08:00
Xuewei Niu
3958a39d07 runtime-rs: Introduce directly attachable network
Kata containers as VM-based containers are allowed to run in the host
netns. That is, the network is able to isolate in the L2. The network
performance will benefit from this architecture, which eliminates as many
hops as possible. We called it a Directly Attachable Network (DAN for
short).

The network devices are placed at the host netns by the CNI plugins. The
configs are saved at {dan_conf}/{sandbox_id}.json in the format of JSON,
including device name, type, and network info. At the very beginning stage,
the DAN only supports host tap devices. More devices, like the DPDK, will
be supported in later versions.

The format of file looks like as below:

```json
{
	"netns": "/path/to/netns",
	"devices": [{
		"name": "eth0",
		"guest_mac": "xx:xx:xx:xx:xx",
		"device": {
			"type": "vhost-user",
			"path": "/tmp/test",
			"queue_num": 1,
			"queue_size": 1
		},
		"network_info": {
			"interface": {
				"ip_addresses": ["192.168.0.1/24"],
				"mtu": 1500,
				"ntype": "tuntap",
				"flags": 0
			},
			"routes": [{
				"dest": "172.18.0.0/16",
				"source": "172.18.0.1",
				"gateway": "172.18.31.1",
				"scope": 0,
				"flags": 0
			}],
			"neighbors": [{
				"ip_address": "192.168.0.3/16",
				"device": "",
				"state": 0,
				"flags": 0,
				"hardware_addr": "xx:xx:xx:xx:xx"
			}]
		}
	}]
}
```

Fixes: #1922

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-08-03 15:33:34 +08:00
Chelsea Mafrica
a81ad3b587 runtime-rs: Add block device handling in cloud hypervisor
Add functions for adding a block device to a container for CH.

Fixes #6690

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-08-02 09:18:48 -07:00
Fupan Li
1a6b27bf6a Merge pull request #5797 from Yuan-Zhuo/add-metrics-for-runtime-rs
runtime-rs: add support for gather metrics in runtime-rs
2023-08-02 13:40:22 +08:00
Pavel Mores
28e5e9c86e runtime-rs: fix number of queues handling in dragonball share fs device
Looks like a copy/paste error...

Fixes #7501

Signed-off-by: Pavel Mores <pmores@redhat.com>
2023-07-31 17:25:47 +02:00
Jiang Liu
b3901c46d6 runtime-rs: ignore errors during clean up sandbox resources
Ignore errors during clean up sandbox resources as much as we can.

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-31 13:07:43 +08:00
Jiang Liu
62e328ca5c runtime-rs: refine implementation of TaskService
Refine implementation of TaskService, making handler_message() as a
method.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:33 +08:00
Jiang Liu
458e1bc712 runtime-rs: make send_message() as an method of ServiceManager
Simplify implementation by making send_message() as an method of
ServiceManager.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:31 +08:00
Jiang Liu
1cc1c81c9a runtime-rs: fix possibe bug in ServiceManager::run()
Multiple instances of task service may get registered by
ServiceManager::run(), fix it by making operation symmetric.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:30 +08:00
Jiang Liu
1a5f90dc3f runtime-rs: simplify implementation of service crate
Simplify implementation of service crate.

Fixes: #7479

Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
2023-07-29 00:47:28 +08:00
Yuan-Zhuo
02cc4fe9db runtime-rs: add support for gather metrics in runtime-rs
1. Implemented metrics collection for runtime-rs shim and dragonball hypervisor.
2. Described the current supported metrics in runtime-rs.(docs/design/kata-metrics-in-runtime-rs.md)

Fixes: #5017

Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
2023-07-28 17:16:51 +08:00
Zhongtao Hu
c8fcd29d9b runtime-rs: use device manager to handle virtio-pmem
use device manager to handle virtio-pmem device

Fixes: #7119
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-07-27 20:18:49 +08:00
Zhongtao Hu
901c192251 runtime-rs: support configure vm_rootfs_driver
support configure vm_rootfs_driver in toml config

Fixes: #7119
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-07-27 20:12:53 +08:00
Zhongtao Hu
5d6199f9bc runtime-rs: use device manager to handle vm rootfs
use device manager to handle vm rootfs, after attach the block device of
vm rootfs, we need to increase index number

Fixes: #7119
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-07-27 20:12:45 +08:00
James O. D. Hunt
20f1f62a2a runtime-rs: change block index to 0
Change block index in SharedInfo to 0 for vda.

Fixes #7119

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-07-27 20:11:44 +08:00
Fabiano Fidêncio
9792ac49fe Merge pull request #7425 from jongwu/remove_mut
runtime-rs: remove unneeded 'mut' keywords
2023-07-26 21:24:40 +02:00
Jianyong Wu
2c8f83424d runtime-rs: remove unneeded 'mut' keywords
These unneeded 'mut' keywords blocks built by rust 1.71.0. Remove them.

Fixes: #7424
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-07-24 08:47:15 +00:00
Chao Wu
bbd3c1b6ab Dragonball: migrate dragonball-sandbox crates to Kata
In order to make it easier for developers to contribute to Dragonball,
we decide to migrate all dragonball-sandbox crates to Kata.

fixes: #7262

Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
2023-07-19 19:41:57 +08:00
Zhongtao Hu
d50f3888af Merge pull request #7219 from Apokleos/network-refactor
runtime-rs: enhancement of Device Manager for network endpoints.
2023-07-17 14:13:51 +08:00
QuanweiZhou
ce14f26d82 Merge pull request #5450 from openanolis/trace_rs
feat(Tracing): tracing in Rust runtime
2023-07-17 09:27:13 +08:00
Zhongtao Hu
419f8a5db7 Merge pull request #7021 from cheriL/7020/ignore-unconfigured-netinterface
runtime-rs: ignore unconfigured network interfaces
2023-07-16 10:11:15 +08:00
soup
150e54d02b runtime-rs: ignore unconfigured network interfaces
Fixes: #7020

Signed-off-by: soup <lqh348659137@outlook.com>
2023-07-14 14:16:03 +08:00
Anastassios Nanos
6787c63900 runtime-rs: add parameter for propagation of (u)mount events
Add an extra parameter in `bind_mount_unchecked` to specify
the propagation type: "shared" or "slave".

Fixes: #7017

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2023-07-13 15:58:22 +00:00
alex.lyn
283f809dda runtime-rs: Enhancing Device Manager for network endpoints.
Currently, network endpoints are separate from the device manager
and need to be included for proper management. In order to do so,
we need to refactor the implementation of the network endpoints.

The first step is to restructure the NetworkConfig and NetworkDevice
structures.
Next, we will implement the virtio-net driver and add the Network
device to the Device Manager.
Finally, we'll unify entries with do_handle_device for each endpoint.

Fixes: #7215

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-07-12 11:27:12 +08:00
Xuewei Niu
6822029c81 runtime-rs: Do not scan network if network model is "none"
Skip to scan network from netns if the network model is specified to
"none".

Fixes: #7305

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-07-12 10:00:50 +08:00
Yushuo
28c29b248d bugfix: plus default_memory when calculating mem size
We've noticed this caused regressions with the k8s-oom tests, and then
decided to take a step back and do this in the same way it was done
before 67972ec48a.

Moreover, this step back is also more reasonable in terms of the
controlling logic.

And by doing this we can re-enable the k8s-oom.bats tests, which is done
as part of this PR.

Fixes: #7271
Depends-on: github.com/kata-containers/tests#5705

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-07-10 15:53:04 +08:00
Ji-Xinyou
ed23b47c71 tracing: Add tracing to runtime-rs
Introduce tracing into runtime-rs, only some functions are instrumented.

Fixes: #5239

Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-07-09 22:09:43 +08:00
Fabiano Fidêncio
96e9374d4b dragonball: Don't fail if a request asks for more CPUs than allowed
Let's take the same approach of the go runtime, instead, and allocate
the maximum allowed number of vcpus instead.

Fixes: #7270

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 15:50:23 +02:00
Fupan Li
4288b935e1 Merge pull request #7104 from openanolis/physical/endpoint
runtime-rs:  support physical endpoint using device manager
2023-06-29 14:43:44 +08:00
GabyCT
19890133e9 Merge pull request #7189 from Apokleos/direct-vol-bugfix
runtime-rs: bugfix for direct volume path's validation.
2023-06-28 12:26:22 -06:00
Jianyong Wu
1f3e837e4b runtime-rs: fix build error on AArch64
Vfio support introduce build error on AArch64. Remove arch related
annotation can avoid this error.

Fixes: #7187
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-06-28 07:10:43 +00:00
alex.lyn
6fd25968c6 runtime-rs: bugfix for direct volume path's validation.
The failure mainly caused by the encoded volume path and
the mount/src. As the src will be validated with stat,but
it's not a full path and encoded, which causes the stat
mount source failed.

Fixes: #7186

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-28 10:07:07 +08:00
Zhongtao Hu
bff4672f7d runtime-rs: support physical endpoint using device manager
use device manager to attach physical endpoint

Fixes: #7103
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-06-27 10:25:51 +08:00
alex.lyn
0df2fc2702 runtime-rs: add support spdk/vhost-user based volume.
Unlike the previous usage which requires creating
/dev/xxx by mknod on the host, the new approach will
fully utilize the DirectVolume-related usage method,
and pass the spdk controller to vmm.

And a user guide about using the spdk volume when run
a kata-containers. it can be found in docs/how-to.

Fixes: #6526

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-25 16:23:19 +08:00
Fupan Li
469c678425 Merge pull request #7058 from Apokleos/vfio-dev
add support vfio device manager
2023-06-22 17:51:22 -06:00
soup
09720babc3 docs: fix spelling of "crate"
Fixes: #7153

Signed-off-by: soup <lqh348659137@outlook.com>
2023-06-21 16:10:54 +08:00
alex.lyn
59510cfee0 runtime-rs: add support vfio device based volume
A new choice of using vfio devic based volume for kata-containers.
With the help of kata-ctl direct-volume, users are able to add a
specified device which is BDF or IOMMU group ID.

To help users to use it smoothly, A doc about howto added in
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.

Fixes: #6525

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-18 14:07:05 +08:00