The device.Bus was reset if a specific combination of
configuration parameters was not met. With the new
PCIe topology this should not happen anymore.
Fixes: #7381
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Currently, even when using devmapper, if the VMM supports virtio-fs /
virtio-9p, that's used to share a few files between the host and the
guest.
This is *needed*, as we need to share with the guest contents like secrets,
certificates, and configurations, via Kubernetes objects like configMaps
or secrets, and those are rotated and must be updated into the guest
whenever the rotation happens.
However, there are still use cases where users can live with just copying
those files into the guest at pod creation time, and for those
there's absolutely no need to have a shared filesystem process running
with no extra obvious benefit, consuming memory and even increasing the
attack surface used by Kata Containers.
For the case mentioned above, we should allow users, making it very
clear which limitations it'll bring, to run Kata Containers with
devmapper without actually having to use a shared file system, which is
already the approach taken when using Firecracker as the VMM.
Fixes: #7207
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The hypervisor_state file was the wrong location for the PCIe Port
settings. Move everything under the device umbrella, where it can be
consumed more easily and we do not get into circular deps.
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Generalize VFIO devices to allow for adding AP in the next patch.
The logic for VFIOPciDeviceMediatedType() has been changed and IsAPVFIOMediatedDevice() has been removed.
The rationale for the removal is:
- VFIODeviceMediatedType is divided into 2 subtypes for AP and PCI
- Logic of checking a subtype of mediated device is included in GetVFIODeviceType()
- VFIOPciDeviceMediatedType() can simply fulfill the device addition based
on a type categorized by GetVFIODeviceType()
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
For Kata Containers, the rootfs is used read-only.
EROFS can noticeably decrease metadata overhead.
Add support for the EROFS file system, and add a config parameter to select the file system used by the rootfs.
Fixes: #6063
Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
We have started to use golang 1.19 and some features are
no longer supported, so run `go fix` to fix them.
Fixes: #5750
Signed-off-by: Bin Liu <bin@hyper.sh>
The default vhost-user-fs queue-size of qemu is 128 now. Set it to 1024
by default, which is the same as clh. Also make this value configurable.
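
A minimal sketch of the defaulting behaviour described above, assuming a
hypothetical `HypervisorConfig` struct and `queueSize` helper (neither name is
taken from the runtime code; only the 1024 default comes from this message):

```go
package main

import "fmt"

// defaultVirtioFSQueueSize mirrors the 1024 default named in the commit message.
const defaultVirtioFSQueueSize uint32 = 1024

// HypervisorConfig is a hypothetical, trimmed-down configuration struct.
type HypervisorConfig struct {
	// VirtioFSQueueSize is the requested queue size; 0 means "not set".
	VirtioFSQueueSize uint32
}

// queueSize returns the configured queue size, or the default when unset.
func queueSize(cfg HypervisorConfig) uint32 {
	if cfg.VirtioFSQueueSize == 0 {
		return defaultVirtioFSQueueSize
	}
	return cfg.VirtioFSQueueSize
}

func main() {
	fmt.Println(queueSize(HypervisorConfig{}))                       // unset: falls back to the default
	fmt.Println(queueSize(HypervisorConfig{VirtioFSQueueSize: 512})) // explicit value wins
}
```

Treating zero as "unset" keeps existing configuration files working without
requiring the new key.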
Fixes: #5694
Signed-off-by: liyuxuan.darfux <liyuxuan.darfux@bytedance.com>
This reverts commit df8ffecde0, as device
hotplug *is* supported and, more than that, is very much needed when
using virtio-blk instead of virtio-fs.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
- Mostly blank lines after `+build` -- see
https://pkg.go.dev/go/build@go1.14.15 -- this is, to date, enforced by
`gofmt`.
- 1.17-style go:build directives are also added.
- Spaces in govmm/vmm_s390x.go
Fixes: #3769
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Similarly to VCPUs and Device hotplug, Confidential Guests also do not
support Memory hotplug.
Let's make it clear in the documentation and guard the code on both QEMU
and Cloud Hypervisor side to ensure we don't advertise Memory hotplug as
being supported when running Confidential Guests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Similarly to VCPUs hotplug, Confidential Guests also do not support
Device hotplug.
Let's make it clear in the documentation and guard the code on both QEMU
and Cloud Hypervisor side to ensure we don't advertise Device hotplug as
being supported when running Confidential Guests.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Some of them (e.g. QEMU) can run on other OSes (e.g. Darwin) but the
current virtcontainers implementation is Linux specific.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Firmware can be split into FIRMWARE_VARS.fd (UEFI variables as
configuration) and FIRMWARE_CODE.fd (UEFI program image). UEFI
variables can be customized per user while the UEFI code stays the same.
Fixes: #3583
Signed-off-by: Julio Montes <julio.montes@intel.com>
Let's stop using govmm from kata-containers/govmm and let's start using
it from our own repo.
Fixes: #3495
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
This'll end up moving to the hypervisors pkg, but let's stop using virtLog
and introduce hvLogger instead.
Fixes: #2884
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
The fact that we need to "bridge" the endpoint is a bit irrelevant. To
be consistent with the rest of the endpoints, let's just call this
"macvlan"
Fixes: #3050
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Protection types like tdxProtection or seProtection were marked nolint,
remove this. As a side effect, ARM needs dummy tests for these.
Fixes: #2801
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Show available guest protections in the
`kata-runtime env` output. Also bump the formatVersion.
Fixes: #1982
Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>
Add functions to return guestProtection as a string slice, which
can be then used in `kata-runtime env` output.
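
A rough sketch of what such a conversion could look like. The
`guestProtection` type and the `tdxProtection` / `seProtection` names appear
elsewhere in this history; `noneProtection`, `sevProtection`,
`pefProtection`, the string values and the `protectionsToStrings` helper are
assumptions made for illustration:

```go
package main

import "fmt"

// guestProtection enumerates confidential-guest technologies (simplified).
type guestProtection uint8

const (
	noneProtection guestProtection = iota // assumed name
	tdxProtection
	sevProtection // assumed name
	pefProtection // assumed name
	seProtection
)

// String returns a short human-readable name; the exact strings are assumptions.
func (gp guestProtection) String() string {
	switch gp {
	case tdxProtection:
		return "tdx"
	case sevProtection:
		return "sev"
	case pefProtection:
		return "pef"
	case seProtection:
		return "se"
	}
	return "none"
}

// protectionsToStrings converts a list of protections into a string slice,
// suitable for printing in an env-style report.
func protectionsToStrings(gps []guestProtection) []string {
	out := make([]string, 0, len(gps))
	for _, gp := range gps {
		out = append(out, gp.String())
	}
	return out
}

func main() {
	fmt.Println(protectionsToStrings([]guestProtection{tdxProtection, seProtection}))
}
```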
Signed-off-by: Yujia Qiao <rapiz3142@gmail.com>
Exclude it from lint checking, since it is ultimately only used in
architecture-specific code.
Fixes: #2273
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
qemuArchBase.appendBridges is never actually used, because the bare
qemuArchBase type is itself never used (outside of unit tests). Instead
*all* the subclasses of qemuArchBase override appendBridges() to call
the very similar, but not identical genericAppendBridges. So, we can
remove the qemuArchBase.appendBridges implementation.
Furthermore, all those subclasses override appendBridges() in exactly
the same way, and so we can remove *those* definitions and replace the
base class qemuArchBase appendBridges() with that version, calling
genericAppendBridges().
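
The resulting shape can be sketched with Go struct embedding. The types below
are simplified stand-ins, not the real definitions: `archBase` plays the role
of qemuArchBase, `archAmd64` is a hypothetical subclass, and the device and
bridge representations are plain strings for illustration only:

```go
package main

import "fmt"

// archBase stands in for qemuArchBase (simplified).
type archBase struct {
	bridges []string
}

// genericAppendBridges stands in for the shared helper that does the work.
func genericAppendBridges(devices []string, bridges []string) []string {
	for _, b := range bridges {
		devices = append(devices, "bridge:"+b)
	}
	return devices
}

// appendBridges on the base type delegates to the generic helper, so
// embedding types no longer need their own identical override.
func (a *archBase) appendBridges(devices []string) []string {
	return genericAppendBridges(devices, a.bridges)
}

// archAmd64 embeds archBase and inherits appendBridges via method promotion.
type archAmd64 struct {
	archBase
}

func main() {
	q := &archAmd64{archBase{bridges: []string{"pci"}}}
	fmt.Println(q.appendBridges([]string{"console"}))
}
```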
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Improve security by making the rootfs image read-only, so that nobody
will be able to modify it from the guest.
Fixes: #1916
Signed-off-by: Julio Montes <julio.montes@intel.com>
Keeping around two different x86 machines has no added value
and requires more tests and maintenance. Prefer the q35 machine
since it has more features and drop the pc machine.
Fixes: #1953
Depends-on: github.com/kata-containers/tests#3586
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
`memory_offset` is used to increase the maximum amount of memory
supported in a VM. This offset is equal to the size of the NVDIMM/PMEM
device that is hot added. In real-world workloads such devices are bigger
than 4G, which is the current limit (uint32).
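
The limit can be illustrated with a small sketch, assuming the offset is
expressed in bytes (the message implies this but does not state it); the
`fitsInUint32` helper and the 5 GiB device size are made up for the example:

```go
package main

import "fmt"

const GiB uint64 = 1 << 30

// fitsInUint32 reports whether a size in bytes is representable as a uint32.
func fitsInUint32(size uint64) bool {
	return size <= uint64(^uint32(0))
}

// truncate shows what a uint32 field would actually store for a given size.
func truncate(size uint64) uint32 {
	return uint32(size)
}

func main() {
	pmem := 5 * GiB // hypothetical NVDIMM/PMEM device size

	fmt.Println(fitsInUint32(pmem)) // false: larger than the uint32 range
	fmt.Println(truncate(pmem))     // the value silently wraps around
	fmt.Println(pmem)               // a uint64 field holds the full size
}
```

With a 5 GiB device the uint32 conversion wraps to 1 GiB (1073741824 bytes),
so a wider type is needed to carry the real offset.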
Fixes: #2006
Signed-off-by: Julio Montes <julio.montes@intel.com>
Secure Execution is a confidential computing technology on s390x (IBM Z
& LinuxONE). Enable the corresponding virtualization technology in QEMU
(where it is referred to as "Protected Virtualization").
- Introduce enableProtection and appendProtectionDevice functions for
QEMU s390x.
- Introduce CheckCmdline to check for "prot_virt=1" being present on the
kernel command line.
- Introduce CPUFacilities and availableGuestProtection for hypervisor
s390x to check for CPU support.
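
The kernel command line check mentioned in the list above could look roughly
like the following. This is a sketch only: `hasKernelParam` is a hypothetical
helper and does not reproduce the signature of CheckCmdline, and the sample
command line is invented:

```go
package main

import (
	"fmt"
	"strings"
)

// hasKernelParam reports whether cmdline contains the exact "key=value"
// token. Tokens are split on whitespace, as on a kernel command line.
func hasKernelParam(cmdline, key, value string) bool {
	want := key + "=" + value
	for _, field := range strings.Fields(cmdline) {
		if field == want {
			return true
		}
	}
	return false
}

func main() {
	// Invented sample; a real check would read the host's kernel command line.
	cmdline := "root=/dev/dasda1 prot_virt=1 quiet"
	fmt.Println(hasKernelParam(cmdline, "prot_virt", "1"))
}
```

Matching whole tokens rather than substrings avoids false positives such as
`prot_virt=10` or `noprot_virt=1`.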
Fixes: #1771
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Define the structure and functions needed to support confidential
guests, this commit doesn't add support for any specific technology,
support for TDX, SEV, PEF and others will be added in following
commits.
Signed-off-by: Julio Montes <julio.montes@intel.com>
Allow and configure vhost-user-fs devices (virtio-fs) on s390x. As a
consequence, appendVhostUserDevice now takes a context, which affects
its signature for other architectures.
Fixes: #1753
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
This reverts commit 7f60911333.
The patch allowed vhost-user devices other than FS, which are not supported
on s390x, and failed to attach a CCW device number, which makes it
impossible to use more devices after vhost-user-fs-ccw.
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
Remove the prohibition of vhost-user devices on s390x, which are by now
supported (e.g. vhost-user-fs-ccw). As a consequence,
appendVhostUserDevice no longer needs an error in its signature.
This enables virtio-fs support on s390x.
Fixes: #1469
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
#1389 has added a context for many signatures to improve trace spans.
Functions specific to s390x lack this. Add context where required. This
affects some common code signatures, since some functions that do not
require context on other architectures do require it on s390x.
Also remove an unnecessary import in test_qemu_s390x.go.
Fixes: #1562
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
A significant number of trace calls did not use a parent context that
would create proper span ordering in trace output. Add local context to
functions for use in trace calls to facilitate proper span ordering.
Additionally, change whether trace function returns context in some
functions in virtcontainers and use existing context rather than
background context in bindMount() so that span exists as a child of a
parent span.
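
The effect of passing the caller's context instead of a background context
can be sketched with the standard library alone. The `trace` helper below is
a hypothetical stand-in that records span nesting as a path string; it is not
the tracing API used by the runtime, and `createSandbox` is an invented
caller:

```go
package main

import (
	"context"
	"fmt"
)

type spanKey struct{}

// trace is a hypothetical stand-in for a tracing helper. It derives a child
// context whose span path is nested under the span found in ctx, if any.
func trace(ctx context.Context, name string) (context.Context, string) {
	path := name
	if parent, ok := ctx.Value(spanKey{}).(string); ok && parent != "" {
		path = parent + "/" + name
	}
	return context.WithValue(ctx, spanKey{}, path), path
}

// bindMount opens its span under whatever context the caller hands in.
func bindMount(ctx context.Context) string {
	_, path := trace(ctx, "bindMount")
	return path
}

// createSandbox is an invented caller that opens a span, then calls bindMount.
func createSandbox(ctx context.Context) string {
	ctx, _ = trace(ctx, "createSandbox")
	return bindMount(ctx)
}

// demo returns the span path with and without a parent context.
func demo() (withParent, withoutParent string) {
	return createSandbox(context.Background()), bindMount(context.Background())
}

func main() {
	withParent, withoutParent := demo()
	fmt.Println(withParent)    // span nested under its caller
	fmt.Println(withoutParent) // orphan span when a background context is used
}
```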
Fixes: #1355
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
ACPI is enabled for Kata 1.x; port and rebase the code for 2.x,
including:
- runtime: enable pflash
- agent: add ACPI support for the PCI bus path
- packaging: enable CONFIG_RTC_DRV_EFI
Fixes: #1317
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>