Commit Graph

3654 Commits

Author SHA1 Message Date
Archana Shinde
1611723465 Merge pull request #8379 from likebreath/1103/clh_v36.0
Upgrade to Cloud Hypervisor v36.0
2023-11-08 21:10:41 -08:00
Archana Shinde
268d4d622f Merge pull request #8389 from justxuewei/vm-capable-test
runtime: Fix TestCheckHostIsVMContainerCapable unstablity issue
2023-11-08 12:14:04 -08:00
Archana Shinde
92a517156c Merge pull request #8367 from amshinde/add-nerdctl-ipvlan-test
network: Fix network hotplug for ipvlan and macvlan endpoints for qemu and add tests
2023-11-08 11:45:13 -08:00
Chelsea Mafrica
83e731328f Merge pull request #8023 from cmaf/runtime-rs-ch-pause-resume
runtime-rs: Update status for pause and resume
2023-11-08 11:34:47 -08:00
Xuewei Niu
acd9057c7b runtime: Fix TestCheckHostIsVMContainerCapable unstablity issue
TestCheckHostIsVMContainerCapable removes sysModuleDir to simulate a
case that the kernel modules are not loaded. However,
checkKernelModules() executes modprobe <module> if a module not
found in that directory. Loading those modules is required to be denied
temporarily.

Fixes: #8390

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 22:40:08 +08:00
Fupan Li
100a73d2fd Merge pull request #7531 from justxuewei/device-cgroup
agent: Restrict device access at upper node of container's cgroup
2023-11-08 22:01:48 +08:00
Xuewei Niu
023d8dc01e agent: Changes according to Pan's comments
- Disable device cgroup restriction while pod cgroup is not available.
- Remove balcklist-related names and change whitelist-related names to
  allowed_all.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:08 +08:00
Xuewei Niu
b5f3a8cb39 agent: Fix container launching failure with systemd cgroup
FSManager of systemd cgroup manager is responsible for setting up cgroup
path. The container launching will be failed if the FSManager is in
read-only mode.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
6477825195 agent: Minor changes according to Zhou's comments
The changes include:

- Change to debug logging level for resources after processed.
- Remove a todo for pod cgroup cleanup.
- Add an anyhow context to `get_paths_and_mounts()`.
- Remove code which denys access to VMROOTFS since it won't take effect. If
  blackmode is in use, the VMROOTFS will be denyed as default. Otherwise,
  device cgroups won't be updated in whitelist mode.
- Add a unit test for `default_allowed_devices()`.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
cec8044744 agent: Make devcg_info optional for LinuxContainer::new()
The runk is a standard OCI runtime that isnt' aware of concept of sandbox.
Therefore, the `devcg_info` argument of `LinuxContainer::new()` is
unneccessary to be provided.

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Xuewei Niu
ef4c3844a3 agent: Restrict device access at upper node of container's cgroup
The target is to guarantee that containers couldn't escape to access extra
devices, like vm rootfs, etc.

Assume that there is a cgroup, such as `/A/B`. The `B` is container cgroup,
and the `A` is what we called pod cgroup. No matter what permissions are
set for the container (`B`), the `A`'s permission is always `a *:* rwm`. It
leads that containers could acquire permission to access to other devices
in VM that not belongs to themselves.

In order to set devices cgroup properly, the order of setting cgroups is
that the pod cgroup comes first and the container cgroup comes after.

The `Sandbox` has a new field, `devcg_info`, to save cgroup states. To
avoid setting container cgroup too early, an initialization should be done
carefully. `inited`, one of the states, is a boolean to indicate if the pod
cgroup is initialized. If no, the pod cgroup should be created firstly, and
set default permissions. After that, the pause container cgroup is created
and inherits the permissions from the pod cgroup.

If whitelist mode which allows containers to access all devices in VM is
enabled,  then device resources from OCI spec are ignored.

This feature not supports systemd cgroup and cgroup v2, since:

- Systemd cgroup implemented on Agent hasn't supported devices subsystem so
  far, see: https://github.com/kata-containers/kata-containers/issues/7506.
- Cgroup v2's device controller depends on eBPF programs, which is out of
  scope of cgroup.

Fixes: #7507

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-11-08 09:39:07 +08:00
Archana Shinde
a6272733e7 network: Fix network hotplug for ipvlan and macvlan endpoints.
Since moving from network coldplug to hotplug, the only case verified
was veth endpoints. Support for network hotplug for ipvlan and macvlan was
broken/not added. Fix it.

Fixes: #8391

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-07 10:13:51 -08:00
James O. D. Hunt
59d0d4caff runtime-rs: ch: Simplify VSOCK error handling
Remove the redundant `VmConfigError::EmptyVsockSocketPath` error from
the Cloud Hypervisor config crate since this scenario is already handled
by the `VsockConfigError::NoVsockSocketPath` error.

Fixes: #8385.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-07 17:45:38 +00:00
James O. D. Hunt
bdb83f8282 runtime-rs: ch: Remove unused function
Remove the redundant `parse_mac()` function: this was never used and we
already have an implementation in `crates/resource/src/network/utils/mod.rs`.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-11-07 17:45:38 +00:00
Beraldo Leal
dd530ba8ee tests: fixes AMD errors
TestCheckHostIsVMContainerCapable is failing on AMD machines.
kata-check_amd64_test.go:96 has no AMD modules, also getCPUType is
missing.

Fixes #8384.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
7641c19f74 runtime: bump containerd for gogo deprecation
This update includes necessary changes due to the version bump of
containerd and its dependencies. It's part of a broader initiative to
phase out gogo protobuf, which has been deprecated, and to align with
the current supported libraries.

Fixes #7420.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:59 +00:00
Beraldo Leal
16fa2c39e6 protocols: replace gogo/types.Empty and Any
by Google versions.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c61f4a8592 protocols: remove unused fieldpath option
The +fieldpath option, specific to gogoprotobuf, enabled dynamic field
access in protobuf messages, allowing nested fields to be accessed via
string paths.

This change is part of a larger effort to transition to the official Go
protobuf library for better maintainability and community support.
Upon review, no instances of dynamic field access were found in the
codebase, confirming that the feature is not in use.

By removing this unused feature, we simplify the build process and make
it easier to complete the transition away from gogoprotobuf.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c87bc60ea0 protocols: removing unused mappings
Those mappings are not used by our .proto files and there is no
difference between .pb.go files generated.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
c5d845b30a agent: updating Cargo.lock files
Probably previous changes missed updating Cargo.lock.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Beraldo Leal
5d88c78a6e protocols: generating agent.pb.go
a3b003c345 modified agent but agent.pb.go
was not updated.

Signed-off-by: Beraldo Leal <bleal@redhat.com>
2023-11-06 16:49:58 +00:00
Archana Shinde
036b7787dd runtime-rs: Use PCI path from hypervisor for vfio devices
Remove earlier functionality that tries to assign PCI path to vfio
devices from the host assuming pci slots to start from 1.
Get this from the hypervisor instead.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
c3ce6a1d15 runtime-rs: Provide PCI path to the agent for virtio-block
If PCI path for block device is not empty for a block device, use
that as identifier for agent instead of virt path which is valid only
for mmio devices.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Archana Shinde
a2bbbad711 runtime-rs: change hypervisor add_device trait to return device copy
Block(virtio-blk) and vfio devices are currently not handled correctly
by the agent as the agent is not provided with correct PCI paths for
these devices.

The PCI paths for these devices can be inferred from the PCI information
provided by the hypervisor when the device is added.
Hence changing the add_device trait function to return a device copy
with PCI info potentially provided by the hypervisor. This can then be
provided to the agent to correctly detect devices within the VM.

This commit includes implementation for PCI info update for
cloud-hupervisor for virtio-blk devices with stubs provided for other
hypervisors.

Removing Vsock from the DeviceType enum as Vsock currently does not
implement the Device Trait, it has no attach and detach trait functions
among others. Part of the reason is because these functions require Vsock
to implement Clone trait as these functions need cloned copies to be
passed down the hypervisor.

The change introduced for returning a device copy from the add_device
hypervisor trait explicitly requires a device to implement
Copy trait. Hence removing Vsock from the DeviceType enum for now, as
its implementation is incomplete and not currently used.

Note, one of the blockers for adding the Clone trait to Vsock is that it
currently includes a file handle which cannot be cloned. For Clone and
Device Traits to be implemented for Vsock, it requires an implementation
change in the future for it to be cloneable.

Fixes: #8283

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-11-05 21:59:44 -08:00
Bo Chen
071667f1ca runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v35.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #8378

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-11-03 10:47:06 -07:00
Fabiano Fidêncio
40cc397218 Merge pull request #8255 from cmaf/migrate-checks-fixes-links
docs: Fix broken links
2023-11-01 14:46:30 +01:00
Fabiano Fidêncio
53cda12a71 Merge pull request #8311 from TimePrinciple/log-system-enhancement
runtime-rs: Log system enhancement
2023-10-31 10:14:41 +01:00
Archana Shinde
148c565b2f Merge pull request #8289 from BbolroC/skip-create-tmpfs-s390x
agent: Skip flaky create_tmpfs on s390x
2023-10-30 22:26:28 -07:00
Ruoqing He
4ad2cfe0c2 runtime-rs: Log system enhancement
By modifying RuntimeLevelFilter drain to improve logging control,
enabling isolation of change effect of the loggers between components,
tuning clh logs to be logged according to their log levels
given by cloud-hypervisor.

Fixes: #8310

Signed-off-by: Ruoqing He <linuxwatcher@outlook.com>
2023-10-31 04:57:46 +00:00
David Esparza
2a17d3889e Merge pull request #8334 from amshinde/ipvlan-nerdctl-fix
network: Fix network attach for ipvlan and macvlan
2023-10-30 16:00:32 -06:00
Chao Wu
7d26604061 Merge pull request #7831 from lisongqian/feat/dragonball_trace
dragonball: add tracing feature for dragonball
2023-10-30 17:27:30 +08:00
James O. D. Hunt
d7e410ad2b Merge pull request #8314 from jodh-intel/kata-ctl-show-confidential-guest
kata-runtime/kata-ctl: Add security details to output
2023-10-30 07:41:22 +00:00
Songqian Li
2f533c3003 dragonball: add tracing feature for dragonball
This PR adds the tracing capability for dragonball and it depends on the tracing::Subscriber of the upper layer.

Fixes: #7249

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-28 19:52:24 +08:00
Chao Wu
f1f4410537 Merge pull request #7695 from lisongqian/feat/legacy_metrics
dragonball: add metrics support for legacy device
2023-10-28 16:48:57 +08:00
Archana Shinde
f53f86884f network: Fix network attach for ipvlan and macvlan
We used the approach of cold-plugging network interface for pre-shimv2
support for docker.Since the hotplug approach was not required,
we never really got to implementing hotplug support for certain network
endpoints, ipvlan and macvlan being among them.

Since moving to shimv2 interface as the default for
runtime, we switched to hotplugging the network interface for supporting
docker and nerdctl. This was done for veth endpoints only.

Implement the hot-attach apis for ipvlan and macvlan as well to support
ipvlan and macvlan networks with docker and nerdctl.

Fixes: #8333

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-27 21:42:37 -07:00
Peng Tao
52a014d9cd Merge pull request #8033 from h56983577/6715/shared-mount
agent: use open_tree()/move_mount() to set up bind mounts between containers directly.
2023-10-28 10:57:34 +08:00
Songqian Li
da77b19449 dragonball: output legacy device metrics to runtime
Legacy device manager adds device metrics to METRICS when a device is created and removes metrics when a device is dropped.

Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-27 14:09:42 +08:00
Songqian Li
65213e9fbe dragonball: unify the metric interface of legacy device
Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-27 14:09:42 +08:00
Archana Shinde
f5c17f89a3 Merge pull request #8250 from amshinde/runtime-rs-clh-config
runtime-rs: Add default configuration file for cloud-hypervisor
2023-10-26 14:54:47 -07:00
Chelsea Mafrica
0608e20a01 docs: Fix broken links
Update broken links so that static checks pass.

Fixes #8254

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2023-10-26 10:17:01 -07:00
HanZiyao
a3b003c345 agent: support bind mounts between containers
This feature supports creating bind mounts directly between containers through annotations.

Fixes: #6715

Signed-off-by: HanZiyao <h56983577@126.com>
2023-10-26 16:34:50 +08:00
James O. D. Hunt
d707fa2c0d kata-runtime/kata-ctl: Add security details to output
Add the hypervisor security details to the output of the `kata-runtime
env` and `kata-ctl env` commands so the user can see, amongst other
things, the value of `confidential_guest`.

Fixes: #8313.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2023-10-25 16:34:42 +01:00
Chao Wu
29d863350f Merge pull request #7697 from lisongqian/feat/balloon_metrics
dragonball: add metrics support for balloon device
2023-10-25 02:42:14 -05:00
Fabiano Fidêncio
328ba0da99 Merge pull request #7647 from jongwu/use_pcie_virt
AArch64: runtime: use pcie root port to do pci/pcie device hotplug
2023-10-25 09:17:13 +02:00
Archana Shinde
f99de4d5a1 runtime-rs: Make default kernel params as empty
The default kernel params passed to any hypervisor except dragonball is
empty.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-24 15:50:12 -07:00
Archana Shinde
a813012785 runtime-rs: Add default configuration file for clouf-hypervisor
The config template file for clh is in the new format for runtime-rs.
It is a result of merging the new format file and options supportted by
cloud-hypervisor.

Some config options from the golang runtime are missing as they may not
be currently supported by the rust runtime. An example of this is the
selinux options, rate limiting options as these are not currently
supported or verified with the rust runtime.

Fixes: #8249

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-10-24 15:17:24 -07:00
Songqian Li
dce365d5b4 dragonball: add conditional compilation for BalloonDeviceMetrics
Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-24 13:33:39 +08:00
Songqian Li
3819f0ee6f dragonball: output balloon device metrics to runtime
Balloon device manager adds balloon device metrics to METRICS when a device is created and remove metrics when a device is dropped.

Fixes: #7248

Signed-off-by: Songqian Li <mail@lisongqian.cn>
2023-10-23 21:15:22 +08:00
Zizheng Bian
7d7c25c1d6 runtime-rs: fix a typo in device manager
Fixes: #8293
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
2023-10-23 20:33:47 +08:00
Hyounggyu Choi
a0746c8d7b agent: Skip flaky create_tmpfs on s390x
This is to skip a flaky test `create_tmpfs()` on s390x until a root cause is identified and fixed.

Fixes: #4248

Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-10-23 11:22:14 +02:00