Commit Graph

3676 Commits

Author SHA1 Message Date
Fabiano Fidêncio
068e535b9d runtime: tdx: Adjust QEMU TDX path
We need to use qemu-system-x86_64-tdx-experimental instead of
qemu-system-x86_64-tdx.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-25 00:39:52 +02:00
Tobin Feldman-Fitzthum
c02b6713bc agent: update image-rs to v0.7.0
v0.7.0 of image-rs has been tagged. Update to it.

Fixes: #7400

Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
2023-07-20 15:49:35 -05:00
Fabiano Fidêncio
db0071422b Merge pull request #7332 from zvonkok/CCv0
CCv0: Adding CDI support for cold and hot-plug of VFIO devices
2023-07-19 22:34:28 +02:00
Zvonko Kaiser
3b9f8fdbcb CCv0: Adding CDI support for cold and hot-plug of VFIO devices
We need to do proper sandbox sizing when we're doing cold-plug. Introduce CDI,
the de-facto standard for enabling devices in containers. containerd
will pass through annotations for accumulated CPU, memory and now CDI
devices. With that information sandbox sizing can be derived correctly.

Fixes: #7331

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-07-19 06:55:58 +00:00
Jeremi Piotrowski
a192971d72 agent: Update image-rs to 0.7.0-rc
Update image-rs, which is part of the guest-components repo, to the commit that
will become the v0.7.0 tag.

Fixes: #7353
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
2023-07-18 15:17:56 +02:00
Wainer Moschetta
b2fdaf2e13 Merge pull request #7300 from stevenhorsman/CCv0-merge-10th-july
CCv0: Merge main into CCv0 branch
2023-07-18 09:42:43 -03:00
stevenhorsman
e16235584c agent: Update logger
`sl` was switched from a macro to a function,
so update the CoCo-specific uses of it

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-07-11 21:10:51 +01:00
stevenhorsman
68a364abfa agent: Reflect AGENT_CONFIG change
AGENT_CONFIG was changed to not be a lazy type, so
we need to remove the .read().await calls on it

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-07-11 20:12:50 +01:00
stevenhorsman
15647a000e runtime: Ignore cyclomatic complexity
Ignore cyclomatic complexity failure. I have fixed this in my PR waiting
to forward port remote-hypervisor support into main

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-07-11 19:55:36 +01:00
stevenhorsman
7188a60e25 runtime: Fix bad merge
- Fix the HotPlug type

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-07-11 19:47:45 +01:00
stevenhorsman
f4d7011f3b CCv0: Merge main into CCv0 branch
- Merge remote-tracking branch 'upstream/main' into CCv0
- Note: excludes 532755ce31 due to incompatibility

Fixes: #7278
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-07-11 14:45:58 +01:00
Xynnn007
70c4df6d47 agent: update image-rs version
Update the image-rs dep version to match the one used by attestation-agent

Fixes: #7285

Signed-off-by: Xynnn007 <xynnn@linux.alibaba.com>
2023-07-11 15:19:50 +08:00
Yushuo
28c29b248d bugfix: plus default_memory when calculating mem size
We've noticed this caused regressions with the k8s-oom tests, and then
decided to take a step back and do this in the same way it was done
before 67972ec48a.

Moreover, this step back is also more reasonable in terms of the
controlling logic.

And by doing this we can re-enable the k8s-oom.bats tests, which is done
as part of this PR.

Fixes: #7271
Depends-on: github.com/kata-containers/tests#5705

Signed-off-by: Yushuo <y-shuo@linux.alibaba.com>
2023-07-10 15:53:04 +08:00
Fabiano Fidêncio
96e9374d4b dragonball: Don't fail if a request asks for more CPUs than allowed
Let's take the same approach as the go runtime and allocate the maximum
allowed number of vcpus instead.

Fixes: #7270

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 15:50:23 +02:00
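The behaviour described in the commit above can be illustrated with a minimal sketch. The function name `effective_vcpus` and its parameters are hypothetical and are not taken from the dragonball code; the sketch only shows the idea of clamping a request to the allowed maximum rather than returning an error.

```rust
// Hypothetical helper, not the dragonball API: instead of failing when a
// request asks for more vCPUs than allowed, fall back to the maximum.
fn effective_vcpus(requested: u32, max_allowed: u32) -> u32 {
    // `min` picks the smaller value, so an oversized request is capped
    // and an in-range request is passed through unchanged.
    requested.min(max_allowed)
}
```

Under these assumptions, a request for 16 vCPUs on a host that allows 8 yields 8, while a request for 4 is left untouched.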
Fabiano Fidêncio
275c84e7b5 Revert "agent: fix the issue of exec hang with a backgroud process"
This reverts commit 25d2fb0fde.

We're reverting the commit to check whether it's the cause of the
regression on devmapper tests.

Fixes: #7253
Depends-on: github.com/kata-containers/tests#5705

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-08 14:27:40 +02:00
stevenhorsman
335a456425 config: Update remote hypervisor config
- Add annotation enablement for machine_type, default_memory and
default_vcpus
- Remove note that says that cpu and memory settings are ignored.

Fixes: #7256
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-07-07 08:37:46 +01:00
stevenhorsman
aadc68633e agent: Update image-rs
- Update location and version of image-rs after
the repo merge

Fixes: #7152
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-07-04 17:07:00 +01:00
Bin Liu
f214058b07 Merge pull request #7202 from wedsonaf/macros
Convert `is_allowed`, `ttrpc_error` and `sl` to functions
2023-07-04 14:23:08 +08:00
Peng Tao
581be92b25 Merge pull request #4492 from zvonkok/pcie-topology
runtime: fix PCIe topology for GPUDirect use-case
2023-07-03 09:17:12 +08:00
Fabiano Fidêncio
6a21e20c63 runtime: Add "none" as a shared_fs option
Currently, even when using devmapper, if the VMM supports virtio-fs /
virtio-9p, that's used to share a few files between the host and the
guest.

This is *needed*, as we need to share with the guest contents like secrets,
certificates, and configurations, via Kubernetes objects like configMaps
or secrets, and those are rotated and must be updated into the guest
whenever the rotation happens.

However, there are still use-cases where users can live with just copying
those files into the guest at pod creation time, and for those
there's absolutely no need to have a shared filesystem process running
with no obvious benefit, consuming memory and even increasing the
attack surface of Kata Containers.

For the case mentioned above, we should allow users, making it very
clear which limitations it'll bring, to run Kata Containers with
devmapper without actually having to use a shared file system, which is
already the approach taken when using Firecracker as the VMM.

Fixes: #7207

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-06-30 20:45:00 +02:00
Zvonko Kaiser
0f454d0c04 gpu: Fixing typos for PCIe topology changes
Some comments and functions had typos and wrong capitalization.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-30 08:42:55 +00:00
stevenhorsman
51eb0c5130 runtime: SEV sysconfig fix
- SEV and SNP need a different sysconfig path

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-06-29 20:52:57 +01:00
stevenhorsman
6fee9fbe4e CCv0: Merge main into CCv0 branch
Merge remote-tracking branch 'upstream/main' into CCv0

Fixes: #7083
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-06-29 10:05:59 +01:00
Fupan Li
4288b935e1 Merge pull request #7104 from openanolis/physical/endpoint
runtime-rs: support physical endpoint using device manager
2023-06-29 14:43:44 +08:00
GabyCT
19890133e9 Merge pull request #7189 from Apokleos/direct-vol-bugfix
runtime-rs: bugfix for direct volume path's validation.
2023-06-28 12:26:22 -06:00
Wedson Almeida Filho
0504bd7254 agent: convert the sl macros to functions
There is nothing in them that requires them to be macros. Converting
them to functions allows for better error messages.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
Wedson Almeida Filho
0860fbd410 agent: convert the ttrpc_error macro to a function
There is nothing in it that requires it to be a macro. Converting it to
a function allows for better error messages.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
Wedson Almeida Filho
0e5d6ce6d7 agent: convert the is_allowed macro to a function
Having a function allows for better error messages from the type checker
and it makes it clearer to callers what can happen. For example:

is_allowed!(req);

Gives no indication that it may result in an early return, and no simple
way for callers to modify the behaviour. It also makes it look like
ownership of `req` is being transferred.

On the other hand,

is_allowed(&req)?;

Indicates that `req` is being borrowed (immutably) and may fail. The
question mark indicates that the caller wants an early return on
failure.

Fixes: #7201

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:32 -03:00
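The difference described in the commit above can be sketched in a few lines. The `Request` and `NotAllowed` types, the `allowed` list parameter and the `handle` function are hypothetical stand-ins for illustration; they are not the agent's real types or signatures.

```rust
// Hypothetical stand-ins for the agent's request and error types.
#[derive(Debug)]
struct Request {
    endpoint: String,
}

#[derive(Debug, PartialEq)]
struct NotAllowed(String);

// Function form: the signature shows that the request is borrowed
// immutably and that the check can fail.
fn is_allowed(req: &Request, allowed: &[&str]) -> Result<(), NotAllowed> {
    if allowed.contains(&req.endpoint.as_str()) {
        Ok(())
    } else {
        Err(NotAllowed(format!("{} is blocked", req.endpoint)))
    }
}

// The `?` makes the early return on failure visible at the call site.
fn handle(req: Request, allowed: &[&str]) -> Result<String, NotAllowed> {
    is_allowed(&req, allowed)?;
    // `req` is still usable here because it was only borrowed above.
    Ok(format!("handled {}", req.endpoint))
}
```

With a function, a caller that wants different behaviour can match on the returned `Result` instead of using `?`, which a macro with a hidden `return` would not allow.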
Wedson Almeida Filho
f680fc52be agent: change AGENT_CONFIG's lazy type to just AgentConfig
Since it is never modified, it doesn't really need a lock of any kind.
Removing the `RwLock` wrapper allows us to remove all `.read().await`
calls when accessing it.

Additionally, `AGENT_CONFIG` already has a static lifetime, so there is
no need to wrap it in a ref-counted heap allocation.

Fixes: #5409

Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
2023-06-28 14:05:27 -03:00
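A rough sketch of the idea in the commit above: a global that is initialised once and never modified can be handed out as a plain `&'static` reference, with no lock guard and no `.await`. The sketch uses the standard library's `OnceLock` as a stand-in for whatever lazy-initialisation mechanism the agent uses; the struct fields, the accessor names and the address value are placeholders, not the real `AgentConfig`.

```rust
use std::sync::OnceLock;

// Hypothetical, trimmed-down config; the fields are placeholders.
#[derive(Debug)]
struct AgentConfig {
    debug_console: bool,
    server_addr: String,
}

// Read-only global: initialised on first use, then shared by reference.
// No RwLock and no ref-counted heap allocation are involved.
static AGENT_CONFIG: OnceLock<AgentConfig> = OnceLock::new();

fn agent_config() -> &'static AgentConfig {
    AGENT_CONFIG.get_or_init(|| AgentConfig {
        debug_console: false,
        // Placeholder value, assumed for illustration only.
        server_addr: "vsock://-1:1024".to_string(),
    })
}

fn server_addr() -> &'static str {
    // Plain field access; nothing to lock and nothing to await.
    &agent_config().server_addr
}
```

Because the static already lives for the whole program, every caller gets the same `&'static AgentConfig`.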
Jianyong Wu
1f3e837e4b runtime-rs: fix build error on AArch64
VFIO support introduced a build error on AArch64. Removing the
arch-related annotation avoids this error.

Fixes: #7187
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2023-06-28 07:10:43 +00:00
alex.lyn
6fd25968c6 runtime-rs: bugfix for direct volume path's validation.
The failure is mainly caused by the encoded volume path and
the mount/src. The src is validated with stat, but
it is encoded and not a full path, which causes the stat
of the mount source to fail.

Fixes: #7186

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-28 10:07:07 +08:00
Steve Horsman
70e6e40a8d Merge pull request #7134 from stevenhorsman/CCv0-merge-19th-june
CCv0: Merge main into CCv0 branch
2023-06-27 09:16:49 +01:00
Zhongtao Hu
bff4672f7d runtime-rs: support physical endpoint using device manager
use device manager to attach physical endpoint

Fixes: #7103
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
2023-06-27 10:25:51 +08:00
alex.lyn
0df2fc2702 runtime-rs: add support spdk/vhost-user based volume.
Unlike the previous usage which requires creating
/dev/xxx by mknod on the host, the new approach will
fully utilize the DirectVolume-related usage method,
and pass the spdk controller to vmm.

A user guide on using the SPDK volume when running
Kata Containers is also added; it can be found in docs/how-to.

Fixes: #6526

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-25 16:23:19 +08:00
GabyCT
4b8229c252 Merge pull request #7141 from bpradipt/fix-7140
runtime: Add support for key annotations to remote hyp
2023-06-23 16:47:20 -06:00
GabyCT
388b55175e Merge pull request #7056 from FuuuOverclocking/fuu/fix-console_manager
dragonball: avoid obtaining lock twice in create_stdio_console
2023-06-23 16:47:00 -06:00
Zvonko Kaiser
8330fb8ee7 gpu: Update unit tests
Some tests are now failing due to the changes in how PCIe is
handled. Update the tests accordingly.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-06-23 11:16:25 +00:00
Fupan Li
469c678425 Merge pull request #7058 from Apokleos/vfio-dev
add support vfio device manager
2023-06-22 17:51:22 -06:00
Archana Shinde
2d329125fd Merge pull request #6800 from amshinde/check-vm-capability
kata-ctl: Check for vm capability
2023-06-21 23:52:46 -07:00
Steve Horsman
4ff3afc59d Merge pull request #6707 from Xynnn007/feat-policy-uri
agent: add container launch control parameters from kernel commandline
2023-06-21 17:02:46 +01:00
Pradipta Banerjee
004f07f076 runtime: Add support for key annotations to remote hyp
In order to support different pod VM instance types via
remote hypervisor implementation (cloud-api-adaptor),
we need to pass machine_type, default_vcpus
and default_memory annotations to cloud-api-adaptor.

The cloud-api-adaptor then uses these annotations to spin
up the appropriate cloud instance.

Reference PR for cloud-api-adaptor
https://github.com/confidential-containers/cloud-api-adaptor/pull/1088

Fixes: #7140
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2023-06-21 20:22:36 +05:30
Archana Shinde
610f7986e4 check: Relax the unrestricted_guest check when running in a VM
When running on a VM, the kernel parameter "unrestricted_guest" for
kernel module "kvm_intel" is not required. So, return success when running
on a VM without checking value of this kernel parameter.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-06-21 07:30:35 -07:00
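The relaxed check described in the commit above might look roughly like the following. The function name, its parameters and the error type are hypothetical and are not the kata-ctl API; the sketch also assumes the parameter reads as "Y" when enabled, which is how boolean kernel module parameters are normally exposed.

```rust
// Hypothetical sketch, not the kata-ctl implementation.
// `running_in_vm`: whether the host itself is a virtual machine.
// `param_value`: contents of the kvm_intel "unrestricted_guest"
// parameter, if it could be read.
fn check_unrestricted_guest(
    running_in_vm: bool,
    param_value: Option<&str>,
) -> Result<(), String> {
    // Inside a VM the parameter is not required, so skip the check.
    if running_in_vm {
        return Ok(());
    }
    // On bare metal the parameter must be present and enabled.
    match param_value.map(str::trim) {
        Some("Y") => Ok(()),
        other => Err(format!("unrestricted_guest not enabled: {:?}", other)),
    }
}
```

Returning early for the VM case keeps the bare-metal path unchanged: a missing or disabled parameter is still reported as an error there.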
Archana Shinde
1b406b9d0c kata-ctl:Implement functionality to check host is capable of running VM
Implement functionality to add to the env output if the host is capable
of running a VM.

Fixes: #6727

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2023-06-21 07:30:22 -07:00
soup
09720babc3 docs: fix spelling of "crate"
Fixes: #7153

Signed-off-by: soup <lqh348659137@outlook.com>
2023-06-21 16:10:54 +08:00
stevenhorsman
5a4a89c108 runtime: Remove duplicated variables
Remove duplicated variables that were in `CCv0` and merged in from main

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-06-20 15:01:54 +01:00
stevenhorsman
6350f49baf agent-ctl: Re-vendor
Re-vendor after bad merge

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-06-19 11:58:43 +01:00
stevenhorsman
7fc10b975f agent: re-vendor
Re-vendor after bad merge

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-06-19 11:37:35 +01:00
stevenhorsman
64a27d962b CCv0: Merge main into CCv0 branch
Merge remote-tracking branch 'upstream/main' into CCv0

Fixes: #7083
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-06-19 11:24:03 +01:00
alex.lyn
59510cfee0 runtime-rs: add support vfio device based volume
A new choice of using a vfio device based volume for kata-containers.
With the help of kata-ctl direct-volume, users are able to add a
specified device, identified by its BDF or IOMMU group ID.

To help users use it smoothly, a how-to doc has been added in
docs/how-to/how-to-run-kata-containers-with-kinds-of-Block-Volumes.

Fixes: #6525

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-18 14:07:05 +08:00
alex.lyn
1e3b372bbb runtime-rs: add support vfio device manager
Limitations:
As no rust vmm vfio manager is ready yet, this only supports
part of vfio in runtime-rs. The remaining part is to call the vmm
interfaces related to vfio add/remove.

So when the vmm/vfio manager is ready, a new PR will be pushed to
narrow the gap.

Fixes: #6525

Signed-off-by: alex.lyn <alex.lyn@antgroup.com>
2023-06-18 14:05:59 +08:00