This is based on a patch from @niteeshkd that adds a config
parameter to choose between AMD SEV and SEV-SNP VMs as the
confidential guest type in case both types are supported. SEV is
the default.
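A minimal sketch of how such a toggle might look in the generated
configuration.toml; the `sev_snp_guest` name is illustrative and may not
match the final parameter:
```
# Illustrative only: select SEV-SNP instead of the default SEV
# when both guest types are supported.
confidential_guest = true
sev_snp_guest = true
```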
Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
Adds an initrd configuration option to the configuration.toml that is
generated for the QEMU setup.
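For illustration, the generated file could carry an entry along these
lines (the path is a placeholder, not the real artefact location):
```
# Placeholder path; the actual value is filled in at build time.
initrd = "/opt/kata/share/kata-containers/kata-containers-initrd.img"
```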
Signed-off-by: Joana Pecholt <joana.pecholt@aisec.fraunhofer.de>
Added a default SEV Kata config template.
Added the required default variables to the Makefile.
Fixes: #5012
Fixes: #5008
Signed-off-by: Ryan Savino <ryan.savino@amd.com>
Let's add a new configuration file for using a TDX-capable Cloud
Hypervisor (and all the needed artefacts).
This PR extends the Makefile in order to provide variables to be set
during the build time that are needed for the proper configuration of
the VMM, such as:
* Specific kernel parameters to be used with TDX
* Specific kernel features to be used when using TDX
* Artefacts path for the artefacts built to be used with TDX
* Kernel
* TD-Shim
The reason we don't hack into the current Cloud Hypervisor configuration
file is that we want to ship both configurations: one for the non-TEE
use case and one for the TDX use case.
It's important to note that the Cloud Hypervisor used upstream is
already built with TDX support.
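As a rough sketch, the TDX-specific configuration could end up with
entries like the following, with all values being placeholders filled in
from the Makefile variables above:
```
# Illustrative values only; the build fills these in.
firmware = "/path/to/td-shim.bin"
kernel = "/path/to/tdx-capable-vmlinux"
kernel_params = "tdx_disable_filter"
confidential_guest = true
```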
Fixes: #4831
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
When booting the TDX kernel with `tdx_disable_filter`, as has been done
for QEMU, VirtioFS can work without any issues.
Whether this will be part of the upstream kernel or not is a different
story, but it easily could make it there as Cloud Hypervisor relies on
the VIRTIO_F_IOMMU_PLATFORM feature, which forces the guest to use the
DMA API, making these devices compatible with TDX.
See Sebastien Boeuf's explanation of this in the
3c973fa7ce208e7113f69424b7574b83f584885d commit:
"""
By using DMA API, the guest triggers the TDX codepath to share some of
the guest memory, in particular the virtqueues and associated buffers so
that the VMM and vhost-user backends/processes can access this memory.
"""
Fixes: #4977
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
AMD SEV pre-attestation is handled by the runtime before the guest is
launched. The guest VM is started paused while the runtime communicates
with a remote key broker service (e.g., simple-kbs) to validate the
attestation measurement and to receive a launch secret. Upon validation,
the launch secret is injected into guest memory and the VM is resumed.
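A hypothetical sketch of how this could surface in the configuration;
the option names and the KBS address below are made up for illustration
and are not the actual parameters:
```
# Hypothetical options, shown for illustration only.
guest_pre_attestation = true
guest_pre_attestation_kbs_uri = "kbs.example.com:44444"
```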
Fixes: #4280
Signed-off-by: Jim Cadden <jcadden@ibm.com>
Signed-off-by: Tobin Feldman-Fitzthum <tobin@ibm.com>
Signed-off-by: Dov Murik <dovmurik@linux.ibm.com>
Let's add a new configuration file for using a TDX-capable QEMU (and all
the needed artefacts).
This PR extends the Makefile in order to provide variables to be set
during the build time that are needed for the proper configuration of
the VMM, such as:
* Specific kernel parameters to be used with TDX
* Specific kernel features to be used when using TDX
* Artefacts path for the artefacts built to be used with TDX
* QEMU
* Kernel
* TDVF
The reason we don't hack into the current QEMU configuration file is
that we want to ship both configurations: one for the non-TEE use case
and one for the TDX use case.
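As with the Cloud Hypervisor counterpart, a rough sketch of the
resulting QEMU configuration entries; every path is a placeholder coming
from the build-time Makefile variables:
```
# Illustrative values only; paths come from the Makefile.
path = "/path/to/tdx-capable-qemu"
firmware = "/path/to/TDVF.fd"
confidential_guest = true
```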
Fixes: #4830
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This configuration will allow users to choose between different
I/O backends for QEMU, with the default being io_uring.
This will allow users to fall back to a different I/O mechanism while
running on kernels older than 5.1.
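As an illustration, the option could be set along these lines (the
`block_device_aio` name and its values are assumptions here):
```
# Assumed option name; switch the block I/O backend away from
# io_uring on hosts running a kernel older than 5.1.
block_device_aio = "threads"
```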
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Enable the Kata runtime to handle the `disable_selinux` flag properly,
so that the runtime configuration can control whether the runtime
applies the SELinux label to the VMM process.
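A minimal illustration of the toggle, assuming it keeps the
`disable_selinux` name in configuration.toml:
```
# When true, the runtime does not apply the SELinux label to the
# VMM process.
disable_selinux = true
```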
Fixes: #4599
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
For the CC build we need to enable such a flag, and the cleanest way to
do so is exposing it in the Makefile and, later on, making sure its
correct value is passed to the build script.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Expose the newly added `default_maxmemory` to the project's Makefile and
to the configuration files.
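For illustration, the new knob could appear in the generated
configuration roughly as follows (the value is just an example):
```
# Example value; caps how large the guest memory can grow (MiB).
default_maxmemory = 16384
```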
Fixes: #4516
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Enable "-sandbox on" in qemu can introduce another protect layer
on the host, to make the secure container more secure.
The default option is disable because this feature may introduce some
performance cost, even though user can enable
/proc/sys/net/core/bpf_jit_enable to reduce the impact.
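As an illustration, the toggle could be wired into configuration.toml
roughly like this; the option name and value format are assumptions:
```
# Hypothetical example: policy handed to QEMU's -sandbox option,
# kept disabled by default because of the performance cost.
seccompsandbox = "on,obsolete=deny,spawn=deny,resourcecontrol=deny"
```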
Fixes: #2266
Signed-off-by: Feng Wang <feng.wang@databricks.com>
With everything implemented, let's now expose the disk rate limiter
configuration options in the Cloud Hypervisor configuration file.
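A sketch of what the exposed knobs might look like; the names and units
below are illustrative placeholders rather than the definitive options:
```
# Placeholder names; bandwidth and operations rate limits for
# disk I/O.
disk_rate_limiter_bw_max_rate = 100000000
disk_rate_limiter_ops_max_rate = 10000
```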
Fixes: #4139
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
With everything implemented, let's now expose the net rate limiter
configuration options in the Cloud Hypervisor configuration file.
Fixes: #4017
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
virtiofsd's debug is enabled whenever the hypervisor's debug is enabled,
which generates too many noisy logs from virtiofsd.
Unbind the log level of virtiofsd from the hypervisor's; users who want
to see virtiofsd's debug logs can set them with:
virtio_fs_extra_args = ["-o", "log_level=debug"]
Fixes: #3303
Signed-off-by: bin <bin@hyper.sh>
This configuration option is valid for all the hypervisors that are
going to be used with the confidential containers effort, thus we expose
the configuration option for Cloud Hypervisor as well.
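Assuming the option in question is the `confidential_guest` toggle
already exposed for QEMU, the Cloud Hypervisor section would gain the
same entry:
```
# Enable confidential guest support for Cloud Hypervisor as well.
confidential_guest = true
```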
Fixes: #4022
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This change introduces the `disable_guest_empty_dir` config option,
which allows the user to change whether a Kubernetes emptyDir volume is
created on the guest (the default, for performance reasons), or the host
(necessary if you want to pass data from the host to a guest via an
emptyDir).
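A minimal example of the new option as it might appear in the runtime
section of configuration.toml:
```
# Create Kubernetes emptyDir volumes on the host instead of the
# guest, so data can be passed from host to guest via an emptyDir.
disable_guest_empty_dir = true
```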
Fixes: #2053
Signed-off-by: Evan Foster <efoster@adobe.com>
kata-containers/pulls#3771 added TDX support for Cloud Hypervisor, but
two big things got overlooked while doing that.
1. virtio-fs, as of now, cannot be part of the trust boundary, so the
Confidential Guest will not be using it.
2. virtio-block hotplug should be enabled in order to use virtio-block
for the rootfs (used with the devmapper plugin).
When trying to use cloud-hypervisor with TDX using virtio-fs, we're
facing the following error on the guest kernel:
```
virtiofs virtio2: device must provide VIRTIO_F_ACCESS_PLATFORM
```
After checking and double-checking with the virtiofs and cloud-hypervisor
developers, this happens because confidential containers may put some
limitations on the device, so it can't access all of the guest's memory,
and that's where this restriction seems to be coming from. Vivek
mentioned that virtiofsd does not support VIRTIO_F_ACCESS_PLATFORM (aka
VIRTIO_F_IOMMU_PLATFORM) yet, and that for encrypted guests virtiofs may
not be the best solution at the moment.
@sboeuf put this in a very nice way: "if the virtio-fs driver doesn't
support VIRTIO_F_ACCESS_PLATFORM, then the pages corresponding to the
virtqueues and the buffers won't be marked as SHARED, meaning the VMM
won't have access to it".
Interestingly enough, it works with QEMU, which may be due to some
change done in the patched QEMU that @devimc is packaging, but we won't
take the path of figuring out what that change was and patching
cloud-hypervisor in the same way, because of 1.
Fixes: #3810
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This reverts commit df8ffecde0, as device
hotplug *is* supported and, more than that, is very much needed when
using virtio-blk instead of virtio-fs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Make virtio-fs, instead of virtio-9p, the default shared file system
type when virtio-blk is not used.
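In configuration.toml terms, this amounts to the following default,
shown as a minimal illustration:
```
# Default shared file system when virtio-blk is not used.
shared_fs = "virtio-fs"
```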
Fixes: #3813
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Relying on virtio-block is the *only* way to use Firecracker with Kata
Containers, as shared FS (virtio-{fs,fs-nydus,9p}) is not supported by
Firecracker.
As it doesn't make sense to expose this as configuration, we hardcode
the `false` value in the Firecracker configuration structure.
Fixes: #3813
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's clarify that an error will be reported in case confidential_guest
is enabled, but the hardware where Kata Containers is running doesn't
provide the required feature set.
Fixes: #3787
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's use "Intel TDX" rather than just "TDX", as it can ease the
understanding of the terminology.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's mention the supported TEEs to be used with confidential guests.
Right now, Cloud Hypervisor supports only Intel TDX, used together with
TD Shim.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Removes the `--tags selinux` handling in the Makefile (part of it was
introduced in d78ffd6) and makes SELinux configurable via
configuration.toml.
Fixes: #3631
Signed-off-by: Tanweer Noor <tnoor@apple.com>
"firmware" option was already present for a while, but it's never been
exposed to the configuration file before.
Let's do it now as it can be used, in combination with the newly added
confidential_guest option, to boot a guest VM using the so called
`td-shim`[0] with Cloud Hypervisor.
[0]: https://github.com/confidential-containers/td-shim
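A minimal sketch of how the two options could be combined; the td-shim
path is a placeholder:
```
# Boot the confidential guest with td-shim as firmware.
confidential_guest = true
firmware = "/path/to/td-shim/final.bin"
```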
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
NVDIMM is also not supported with Confidential Guests and Virtio Block
devices should be used instead.
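For QEMU this roughly translates to disabling the NVDIMM-backed image
and relying on virtio-blk, e.g. (illustrative snippet, assuming the
existing `disable_image_nvdimm` and `block_device_driver` options):
```
# Confidential guests: avoid NVDIMM and use virtio-blk instead.
disable_image_nvdimm = true
block_device_driver = "virtio-blk"
```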
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similarly to VCPUs and Device hotplug, Confidential Guests also do not
support Memory hotplug.
Let's make it clear in the documentation and guard the code on both QEMU
and Cloud Hypervisor side to ensure we don't advertise Memory hotplug as
being supported when running Confidential Guests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similarly to VCPUs hotplug, Confidential Guests also do not support
Device hotplug.
Let's make it clear in the documentation and guard the code on both QEMU
and Cloud Hypervisor side to ensure we don't advertise Device hotplug as
being supported when running Confidential Guests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As confidential guests do not support VCPU hotplug, let's set the
"DefaultMaxVCPUs" value to "NumVCPUs".
The reason for doing this is to ensure that guests will be started with
the correct amount of VCPUs, without giving the guest all the possible
VCPUs the host could provide.
One clear side effect of this limitation is that workloads that require
more VCPUs in their YAML definition will not run in this scenario.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
ConfidentialGuest is an option already present and exposed for QEMU,
which is used to run Kata Containers together with different sorts of
guest protections, such as TDX and SEV for x86_64, PEF for ppc64le, and
SE for s390x.
Right now we error out in case confidential_guest is enabled, as we will
be implementing the needed blocks for this as part of this series.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
As Kata with QEMU already supports lazy loading, this PR aims to bring
the lazy-load ability to Kata with Cloud Hypervisor.
Fixes: #3654
Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>
Pulling the image is the most time-consuming step in the container
lifecycle. This PR introduces nydus to Kata Containers so that images
can be lazily pulled when the container starts, which speeds up Kata
container creation and start.
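A rough illustration of what enabling the nydus path could look like in
configuration.toml; the daemon path and option values are placeholders:
```
# Placeholder values; use the nydus-backed shared file system and
# point to the nydusd binary.
shared_fs = "virtio-fs-nydus"
virtio_fs_daemon = "/usr/local/bin/nydusd"
```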
Fixes: #2724
Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>
The firmware can be split into FIRMWARE_VARS.fd (UEFI variables as
configuration) and FIRMWARE_CODE.fd (UEFI program image). The UEFI
variables can be customized per user while the UEFI code is kept the
same.
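For illustration, the split could surface in the configuration as two
entries, assuming a `firmware_volume`-style option alongside the
existing `firmware` one:
```
# Illustrative split: UEFI code image vs. per-user variables store.
firmware = "/usr/share/ovmf/FIRMWARE_CODE.fd"
firmware_volume = "/usr/share/ovmf/FIRMWARE_VARS.fd"
```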
Fixes: #3583
Signed-off-by: Julio Montes <julio.montes@intel.com>
There are software and hardware architectures which do not support
dynamically adjusting the CPU and memory resources associated with a
sandbox. For these, today, they rely on "default CPU" and "default
memory" configuration options for the runtime, either set by annotation
or by the configuration toml on disk.
In the case of a single container (launched by ctr, or something like
"docker run"), we could allow for sizing the VM correctly, since all of
the information is already available to us at creation time.
In the sandbox / pod container case, it is possible for the upper layer
container runtime (i.e., containerd or CRI-O) to send a specific
annotation indicating the total workload resource requirements
associated with the sandbox creation request.
In the case of sizing information not being provided, we will follow the
same behavior as today: start the VM with (just) the default CPU/memory.
If this information is provided, we'll track this as Workload specific
resources, and track default sizing information as Base resources. We
will update the hypervisor configuration to utilize Base+Workload
resources, thus starting the VM with the appropriate amount of CPU and
memory.
In this scenario (we start the VM with the "right" amount of
CPU/Memory), we do not want to update the VM resources when containers
are added, or adjusted in size.
This functionality is introduced behind a configuration flag,
`static_sandbox_resource_mgmt`. This is defaulted to false for all
configurations except Firecracker, which is set to true.
This'll greatly improve UX for folks who are utilizing
Kata with a VMM or hardware architecture that doesn't support hotplug.
Note, users will still be unable to do in place vertical pod autoscaling
or other dynamic container/pod sizing with this enabled.
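A minimal illustration of the flag, assuming it lives in the runtime
section of configuration.toml:
```
# Size the VM up-front from the pod's resource requirements and skip
# hotplug-based resizing afterwards; defaults to false (true for
# Firecracker).
static_sandbox_resource_mgmt = true
```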
Fixes: #3264
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
The `enable_swap` option was added a long time ago to add
`-realtime mlock=off` to QEMU's command line.
Kata now supports QEMU 6, where the `-realtime` option has been
deprecated, and `mlock=on` is causing unexpected behaviors in Kata.
This patch removes support for `enable_swap`, `-realtime` and `mlock=`
since they are causing bugs in Kata.
Signed-off-by: Julio Montes <julio.montes@intel.com>
This PR removes the reference describing how to use the
disable_new_netns configuration with Docker, as Kata 2.0 does not
support Docker and that information applied to Kata 1.x.
Fixes: #3400
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
This PR updates the comments in the configuration.toml to point to the
current Kata Containers repository instead of the Kata 1.x one.
Fixes: #3163
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>