Commit Graph

3398 Commits

Author SHA1 Message Date
Archana Shinde
dacdf7c282 kata-ctl: Remove cpu related functions from kata-ctl
Remove cpu related functions which have been moved to kata-sys-util.
Change invocations in kata-ctl to make use of functions now moved to
kata-sys-util.

Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Archana Shinde
f5d1957174 kata-sys-util: Move additional functionality to cpu.rs
Make certain imports architecture specific as these are not used on all
architectures.
Move additional constants and functionality to cpu.rs.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Nathan Whyte
304b9d9146 kata-sys-util: Move CPU info functions
Move get_single_cpu_info and get_cpu_flags into kata-sys-util.
Add new functions that get a list of flags and check if a flag
exists in that list.

Fixes #6383

Signed-off-by: Nathan Whyte <nathanwhyte35@gmail.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-07-13 20:08:13 +05:30
Zhongtao Hu
b69cdb5c21 Merge pull request #7286 from xuejun-xj/xuejun/up-fix
dragonball/agent: Add some optimization for Makefile and bugfixes of unit tests on aarch64
2023-07-13 09:39:23 +08:00
alex.lyn
283f809dda runtime-rs: Enhancing Device Manager for network endpoints.
Currently, network endpoints are separate from the device manager
and need to be included for proper management. In order to do so,
we need to refactor the implementation of the network endpoints.

The first step is to restructure the NetworkConfig and NetworkDevice
structures.
Next, we will implement the virtio-net driver and add the Network
device to the Device Manager.
Finally, we'll unify entries with do_handle_device for each endpoint.

Fixes: #7215

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-07-12 11:27:12 +08:00
xuejun-xj
a65291ad72 agent: rustjail: update test_mknod_dev
When running cargo test in container, test_mknod_dev may fail sometimes
because of "Operation not permitted". Change the device path to
"/dev/fifo-test" to avoid this case.

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:32 +08:00
xuejun-xj
46b81dd7d2 agent: clippy: fix cargo clippy warnings
Replace "if let Ok(_) = ..." with ".is_ok()" method.

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:32 +08:00
xuejun-xj
c4771d9e89 agent: Makefile: enable set SECCOMP dynamically
Change ":=" to "?:".

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:32 +08:00
xuejun-xj
883b4db380 dragonball: fix cargo test on aarch64
1. Update memory end assert because address space layout differs between
x86 and arm.
2. Set guest_addr for aarch64 in test_handler_insert_region case.

Fixes: #7284
TODO: #7290

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-12 11:22:31 +08:00
Xuewei Niu
6822029c81 runtime-rs: Do not scan network if network model is "none"
Skip to scan network from netns if the network model is specified to
"none".

Fixes: #7305

Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
2023-07-12 10:00:50 +08:00
xuejun-xj
aedc586e14 dragonball: Makefile: add coverage target
Add "coverage" target to compute code coverage for dragonball.

Fixes: #7284

Signed-off-by: xuejun-xj <jiyunxue@linux.alibaba.com>
2023-07-11 14:36:25 +08:00
Yushuo
28c29b248d bugfix: plus default_memory when calculating mem size
We've noticed this caused regressions with the k8s-oom tests, and then
decided to take a step back and do this in the same way it was done
before 67972ec48a.

Moreover, this step back is also more reasonable in terms of the
controlling logic.

And by doing this we can re-enable the k8s-oom.bats tests, which is done
as part of this PR.

Fixes: #7271
Depends-on: github.com/kata-containers/tests#5705

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-07-10 15:53:04 +08:00
Ji-Xinyou
ed23b47c71 tracing: Add tracing to runtime-rs
Introduce tracing into runtime-rs, only some functions are instrumented.

Fixes: #5239

Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-07-09 22:09:43 +08:00
Fabiano Fidêncio
96e9374d4b dragonball: Don't fail if a request asks for more CPUs than allowed
Let's take the same approach of the go runtime, instead, and allocate
the maximum allowed number of vcpus instead.

Fixes: #7270

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 15:50:23 +02:00
Fabiano Fidêncio
275c84e7b5 Revert "agent: fix the issue of exec hang with a backgroud process"
This reverts commit 25d2fb0fde.

The reason we're reverting the commit is because it to check whether
it's the cause for the regression on devmapper tests.

Fixes: #7253
Depends-on: github.com/kata-containers/tests#5705

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 14:27:40 +02:00
Zvonko Kaiser
f72cb2fc12 agent: Remove shadowed function, add slog-term
Remove shadowed get_mounts(), added slog-term as a new crate,
slog can directly log to stdout and we can capture output
in the test-cases that are created in the function to be tested.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-07 11:28:14 +00:00
Zvonko Kaiser
07810bf71f agent: Ignore already mounted dev/fs/pseudo-fs
Using an initrd and setting KATA_INIT=yes meaning we're using the kata-agent
as the init process we need to make sure that the agent is not segfaulting
if mounts are already happened. Some workloads need to configure several
things in the initrd before the kata-agent starts which involves having
/proc or /sys already mounted.

Fixes: #6992

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-07 07:36:04 +00:00
Bin Liu
f214058b07 Merge pull request #7202 from wedsonaf/macros
Convert `is_allowed`, `ttrpc_error` and `sl` to functions
2023-07-04 14:23:08 +08:00
Peng Tao
581be92b25 Merge pull request #4492 from zvonkok/pcie-topology
runtime: fix PCIe topology for GPUDirect use-case
2023-07-03 09:17:12 +08:00
Fabiano Fidêncio
6a21e20c63 runtime: Add "none" as a shared_fs option
Currently, even when using devmapper, if the VMM supports virtio-fs /
virtio-9p, that's used to share a few files between the host and the
guest.

This *needed*, as we need to share with the guest contents like secrets,
certificates, and configurations, via Kubernetes objects like configMaps
or secrets, and those are rotated and must be updated into the guest
whenever the rotation happens.

However, there are still use-cases users can live with just copying
those files into the guest at the pod creation time, and for those
there's absolutely no need to have a shared filesystem process running
with no extra obvious benefit, consuming memory and even increasing the
attack surface used by Kata Containers.

For the case mentioned above, we should allow users, making it very
clear which limitations it'll bring, to run Kata Containers with
devmapper without actually having to use a shared file system, which is
already the approach taken when using Firecracker as the VMM.

Fixes: #7207

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-30 20:45:00 +02:00
Zvonko Kaiser
0f454d0c04 gpu: Fixing typos for PCIe topology changes
Some comments and functions had typos and wrong capitalization.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-30 08:42:55 +00:00
Fupan Li
4288b935e1 Merge pull request #7104 from openanolis/physical/endpoint
runtime-rs:  support physical endpoint using device manager
2023-06-29 14:43:44 +08:00
GabyCT
19890133e9 Merge pull request #7189 from Apokleos/direct-vol-bugfix
runtime-rs: bugfix for direct volume path's validation.
2023-06-28 12:26:22 -06:00
Wedson Almeida Filho
0504bd7254 agent: convert the sl macros to functions
There is nothing in them that requires them to be macros. Converting
them to functions allows for better error messages.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
Wedson Almeida Filho
0860fbd410 agent: convert the ttrpc_error macro to a function
There is nothing in it that requires it to be a macro. Converting it to
a function allows for better error messages.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
Wedson Almeida Filho
0e5d6ce6d7 agent: convert the is_allowed macro to a function
Having a function allows for better error messages from the type checker
and it makes it clearer to callers what can happen. For example:

is_allowed!(req);

Gives no indication that it may result in an early return, and no simple
way for callers to modify the behaviour. It also makes it look like
ownership of `req` is being transferred.

On the other hand,

is_allowed(&req)?;

Indicates that `req` is being borrowed (immutably) and may fail. The
question mark indicates that the caller wants an early return on
failure.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
Wedson Almeida Filho
f680fc52be agent: change AGENT_CONFIG's lazy type to just AgentConfig
Since it is never modified, it doesn't really need a lock of any kind.
Removing the `RwLock` wrapper allows us to remove all `.read().await`
calls when accessing it.

Additionally, `AGENT_CONFIG` already has a static lifetime, so there is
no need to wrap it in a ref-counted heap allocation.

Fixes: #5409

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:27 -03:00
Jianyong Wu
1f3e837e4b runtime-rs: fix build error on AArch64
Vfio support introduce build error on AArch64. Remove arch related
annotation can avoid this error.

Fixes: #7187
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-06-28 07:10:43 +00:00
alex.lyn
6fd25968c6 runtime-rs: bugfix for direct volume path's validation.
The failure mainly caused by the encoded volume path and
the mount/src. As the src will be validated with stat,but
it's not a full path and encoded, which causes the stat
mount source failed.

Fixes: #7186

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-28 10:07:07 +08:00
Zhongtao Hu
bff4672f7d runtime-rs: support physical endpoint using device manager
use device manager to attach physical endpoint

Fixes: #7103
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-06-27 10:25:51 +08:00
alex.lyn
0df2fc2702 runtime-rs: add support spdk/vhost-user based volume.
Unlike the previous usage which requires creating
/dev/xxx by mknod on the host, the new approach will
fully utilize the DirectVolume-related usage method,
and pass the spdk controller to vmm.

And a user guide about using the spdk volume when run
a kata-containers. it can be found in docs/how-to.

Fixes: #6526

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-25 16:23:19 +08:00
GabyCT
388b55175e Merge pull request #7056 from FuuuOverclocking/fuu/fix-console_manager
dragonball: avoid obtaining lock twice in create_stdio_console
2023-06-23 16:47:00 -06:00
Zvonko Kaiser
8330fb8ee7 gpu: Update unit tests
Some tests are now failing due to the changes how PCIe is
handled. Update the test accordingly.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-23 11:16:25 +00:00
Fupan Li
469c678425 Merge pull request #7058 from Apokleos/vfio-dev
add support vfio device manager
2023-06-22 17:51:22 -06:00
Archana Shinde
2d329125fd Merge pull request #6800 from amshinde/check-vm-capability
kata-ctl: Check for vm capability
2023-06-21 23:52:46 -07:00
Archana Shinde
610f7986e4 check: Relax the unrestricted_guest check when running in a VM
When running on a VM, the kernel parameter "unrestricted_guest" for
kernel module "kvm_intel" is not required. So, return success when running
on a VM without checking value of this kernel parameter.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-06-21 07:30:35 -07:00
Archana Shinde
1b406b9d0c kata-ctl:Implement functionality to check host is capable of running VM
Implement functionality to add to the env output if the host is capable
of running a VM.

Fixes: #6727

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-06-21 07:30:22 -07:00
soup
09720babc3 docs: fix spelling of "crate"
Fixes: #7153

Signed-off-by: soup <lqh348659137@outlook.com>
2023-06-21 16:10:54 +08:00
alex.lyn
59510cfee0 runtime-rs: add support vfio device based volume
A new choice of using vfio devic based volume for kata-containers.
With the help of kata-ctl direct-volume, users are able to add a
specified device which is BDF or IOMMU group ID.

To help users to use it smoothly, A doc about howto added in
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.

Fixes: #6525

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-18 14:07:05 +08:00
alex.lyn
1e3b372bbb runtime-rs: add support vfio device manager
Limitations:
As no ready rust vmm's vfio manager is ready, it only supports
part of vfio in runtime-rs. And the left part is to call vmm
interfaces related to vfio add/remove.

So when vmm/vfio manager ready, a new PR will be pushed to
narrow the gap.

Fixes: #6525

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-18 14:05:59 +08:00
Greg Kurz
a43ea24dfc virtiofsd: Convert legacy -o sub-options to their -- replacement
The `-o` option is the legacy way to configure virtiofsd, inherited
from the C implementation. The rust implementation honours it for
compatibility but it logs deprecation warnings.

Let's use the replacement options in the go shim code. Also drop
references to `-o` from the configuration TOML file.

Fixes #7111

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-06-16 11:42:54 +02:00
Greg Kurz
8e00dc6944 virtiofsd: Drop -o no_posix_lock
The C implementation of virtiofsd had some kind of limited support
for remote POSIX locks that was causing some workflows to fail with
kata. Commit 432f9bea6e hard coded `-o no_posix_lock` in order
to enforce guest local POSIX locks and avoid the issues.

We've switched to the rust implementation of virtiofsd since then,
but it emits a warning about `-o` being deprecated.

According to https://gitlab.com/virtio-fs/virtiofsd/-/issues/53 :

   The C implementation of the daemon has limited support for
   remote POSIX locks, restricted exclusively to non-blocking
   operations. We tried to implement the same level of
   functionality in #2, but we finally decided against it because,
   in practice most applications will fail if non-blocking
   operations aren't supported.

   Implementing support for non-blocking isn't trivial and will
   probably require extending the kernel interface before we can
   even start working on the daemon side.

There is thus no justification to pass `-o no_posix_lock` anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-06-16 11:42:39 +02:00
Greg Kurz
2a15ad9788 virtiofsd: Stop using deprecated -f option
The rust implementation of virtiofsd always runs foreground and
spits a deprecation warning when `-f` is passed.

Signed-off-by: Greg Kurz <groug@kaod.org>
2023-06-16 10:30:40 +02:00
Zvonko Kaiser
72f2cb84e6 gpu: Reset cold or hot plug after overriding
If we override the cold, hot plug with an annotation
we need to reset the other plugging mechanism to NoPort
otherwise both will be enabled.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-15 17:51:01 +00:00
Zvonko Kaiser
fbacc09646 gpu: PCIe topology, consider vhost-user-block in Virt
In Virt the vhost-user-block is an PCIe device so
we need to make sure to consider it as well. We're keeping
track of vhost-user-block devices and deduce the correct
amount of PCIe root ports.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-15 17:39:55 +00:00
Zvonko Kaiser
b11246c3aa gpu: Various fixes for virt machine type
The PCI qom path was not deduced correctly added regex for correct
path walking.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:33:57 +00:00
Zvonko Kaiser
40101ea7db vfio: Added annotation for hot(cold) plug
Now it is possible to configure the PCIe topology via annotations
and addded a simple test, checking for Invalid and RootPort

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
8f0d4e2612 vfio: Cleanup of Cold and Hot Plug
Removed the configuration of PCIeRootPort and PCIeSwitchPort, those
values can be deduced in createPCIeTopology

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
b5c4677e0e vfio: Rearrange the bus assignemnt
Refactor the bus assignment so that the call to GetAllVFIODevicesFromIOMMUGroup
can be used by any module without affecting the topology.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00
Zvonko Kaiser
b1aa8c8a24 gpu: Moved the PCIe configs to drivers
The hypervisor_state file was the wrong location for the PCIe Port
settings, moved everything under device umbrella, where it can be
consumed more easily and we do not get into circular deps.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-14 08:20:24 +00:00