Commit Graph

1092 Commits

Author SHA1 Message Date
stevenhorsman
66ca2f1bc4 qemu: static-check disable
Disable gocyclo on large complex function in CCv0 branch

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-25 17:05:16 +01:00
stevenhorsman
c87c8ffce5 runtime: Fix bad merge
- Re-add removed CC features from sandbox.go

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-25 16:30:01 +01:00
stevenhorsman
33143eb342 CCv0: Merge main into CCv0 branch
Merge remote-tracking branch 'upstream/main' into CCv0

Fixes: kata-containers#5645
Depends-on: github.com/kata-containers/kata-containers#6885

Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-05-25 16:17:59 +01:00
Fabiano Fidêncio
9aae333343 Merge pull request #6871 from kmjohansen/bugfix/ptmx
runtime: make debug console work with sandbox_cgroup_only
2023-05-23 22:24:51 +02:00
Archana Shinde
2c9efbe04c Merge pull request #6907 from likebreath/0519/clh_v32.0
Upgrade to Cloud Hypervisor v32.0
2023-05-22 09:53:05 -07:00
Bo Chen
35c3d7b4bc runtime: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v32.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.

Fixes: #6632

Signed-off-by: Bo Chen <chen.bo@intel.com>
2023-05-19 12:49:45 -07:00
Krister Johansen
eff6ed2d5f runtime: make debug console work with sandbox_cgroup_only
If a hypervisor debug console is enabled and sandbox_cgroup_only is set,
the hypervisor can fail to open /dev/ptmx, which prevents the sandbox
from launching.

This is caused by the absence of a device cgroup entry to allow access
to /dev/ptmx.  When sandbox_cgroup_only is not set, the hypervisor
inherits the default unrestrcited device cgroup, but with it enabled it
runs into allow / deny list restrictions.

Fix by adding an allowlist entry for /dev/ptmx when debug is enabled,
sandbox_cgroup_only is true, and no /dev/ptmx is already in the list of
devices.

Fixes: #6870

Signed-off-by: Krister Johansen <kjlx@templeofstupid.com>
2023-05-18 10:36:24 -07:00
Gabriela Cervantes
11a34a72e2 docs: Update container network model url
This PR updates the container network model url that is part of the
virtcontainers documentation.

Fixes #6889

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2023-05-18 15:08:08 +00:00
Fabiano Fidêncio
fe6e918ddc Revert "virtcontainers: Drop check for the tdx CPU flag"
This reverts commit 25b3cdd38c.

As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.

The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.

This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)

Fixes: #6884

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-18 12:55:20 +02:00
Fabiano Fidêncio
2962d8db45 Revert "runtime/qemu: Drop "kvm-type=tdx""
This reverts commit ed145365ec.

As the Jenkins TDX CI is running on a system with a TDX stack called
"2022ww44", we should keep the QEMU / kernel / OVMF versions matching
what's provided in that stack.

The reason we were able to update this on `main` is because the GHA TDX
CI is running on a TDX stack called "2023ww01", but we have decided to
NOT take the bullet, NOT updating the Jenkins CI in order to avoid
unexepected breakages.

This regression was introduced as part of the last CCv0 merge to main,
and would've been caught by the CI, and should've been caught by the
reviewer (myself :-)), but CI was having a hard time to even build the
compoenents and I wrote in the PR and I'm quoting it here: "I rather
deal with possible breakages on this later on, than block this PR to get
in." ... and here we are. :-)

Fixes: #6884

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-05-18 12:55:03 +02:00
Fabiano Fidêncio
f07b27d445 Merge pull request #6559 from stevenhorsman/CCv0-merge-30th-mar
CCv0: Merge main into CCv0 branch
2023-05-17 17:00:39 +02:00
Salvador Fuentes
b76058c979 Merge pull request #6721 from nedsouza/virtcontainers-qemu-go-coverage
virtcontainers/qemu_test.go: Improve coverage
2023-05-16 11:11:43 -06:00
James O. D. Hunt
a96fcfd5be Merge pull request #6735 from nedsouza/258/tests-coverage-compatoci
virtcontainers/pkg/compatoci/: Improved coverage for  for Kata 2.0
2023-05-16 15:36:35 +01:00
Tamas K Lengyel
20cb875087 virtcontainers/qemu_test.go: Improve test coverage
Rework TestQemuCreateVM routine to be a table driven test with
various config variations passed to it. After CreateVM a handful
of additional functions are exercised to improve code-coverage.
Also add partial coverage for StartVM routine.

Currently improving from 19.7% to 35.7%

Credit PR to Hackathon Team3

Fixes: #267

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
2023-05-15 15:26:35 -04:00
LiuWeijie
50cc9c582f tests: Improve coverage for virtcontainers/pkg/compatoci/ for Kata 2.0
Add test cases for ParseConfigJson function and GetContainerSpec function

Fixes: #258

Signed-off-by: LiuWeijie <weijie.liu@intel.com>
2023-05-15 11:58:17 +08:00
Archana Shinde
32b39ee347 Merge pull request #6763 from nedsouza/266/tests_coverage_virtcontainers_fc
virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
2023-05-12 11:53:27 -07:00
Feng Wang
4e0dce6802 Merge pull request #6738 from fengwang666/oss-fix-fd-leak
runtime: Fix virtiofs fd leak
2023-05-08 10:52:36 -07:00
Eduardo Berrocal
a4c0303d89 virtcontainers: Fixed static checks for improved test coverage for fc.go
Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%.
Fixed very simple static check fail on line 202.

Fixes: #266

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-05-07 00:17:36 -07:00
Peng Tao
65670e6b0a Merge pull request #6699 from zvonkok/cold-plug-vfio
gpu: cold plug VFIO devices
2023-05-05 10:04:29 +08:00
Archana Shinde
9443c4aea7 Merge pull request #6729 from nedsouza/259/tests_coverage_virtcontainers_persist
virtcontainers/persist: Improved test coverage 65% to 87.5%
2023-05-04 16:18:55 -07:00
Archana Shinde
09134c30de Merge pull request #6737 from nedsouza/265/virtcontainers-clh-go-coverage
virtcontainers/clh_test.go: improve unit test coverage
2023-05-04 16:15:43 -07:00
Pradipta Banerjee
1f0d709be6 CC: Add configurable context timeout for StopVM in remote hyp
Add configurable context timeout for StopVM in remote hypervisor
similar to StartVM

Fixes: #6730

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
2023-05-04 10:42:30 +00:00
Zvonko Kaiser
13d7f39c71 gpu: Check for VFIO port assignments
Bailing out early if the port is wrong, allowed port settings are
no-port, root-port, switch-port

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-05-03 12:32:33 +00:00
Eduardo Berrocal
03a8cd69c2 virtcontainers: Improved test coverage for fc.go from 4.6% to 18.5%
Expanded tests on fc_test.go to cover more lines of code. Coverage went from 4.6% to 18.5%.

Fixes: #266

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-04-28 15:40:45 -07:00
Eduardo Berrocal
6bf1fc6051 virtcontainers/factory: Improved test coverage
Expanded tests on factory_test.go to cover more lines of code. Coverage went from 34% to 41.5% in the case of user-mode run tests,
and from 77.7% to 84% in the case of priviledge-mode run tests.

Fixes: #260

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-04-27 13:08:35 -07:00
Zvonko Kaiser
f7ad75cb12 gpu: Cold-plug extend the api.md
Make the hypervisorconfig consistent in code and api.md

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-27 09:35:05 +00:00
Zvonko Kaiser
0fec2e6986 gpu: Add cold-plug test
Cold plug setting is now correctly decoded in toml

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-27 09:30:24 +00:00
stevenhorsman
dbe1fd9436 CCv0: Merge main into CCv0 branch
Merge remote-tracking branch 'upstream/main' into CCv0

Fixes: #6558
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
2023-04-27 09:42:44 +01:00
Feng Wang
205909fbed runtime: Fix virtiofs fd leak
The kata runtime invokes removeStaleVirtiofsShareMounts after
a container is stopped to clean up the stale virtiofs file caches.

Fixes: #6455
Signed-off-by: Feng Wang <fwang@confluent.io>
2023-04-26 15:53:39 -07:00
Tamas K Lengyel
0f45b0faa9 virtcontainers/clh_test.go: improve unit test coverage
Credit PR to Hackathon Team3

Fixes: #265

Signed-off-by: Tamas K Lengyel <tamas.lengyel@intel.com>
2023-04-26 19:12:51 +00:00
Zvonko Kaiser
dded731db3 gpu: Add OVMF setting for MMIO aperture
The default size of OVMFs aperture is too low to
initialized PCIe devices with huge BARs

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
2a830177ca gpu: Add fwcfg helper function
Added driver util function for easier handling of VFIO
devices outside of the VFIO module. At the sandbox level
we may need to set options depending if we have a VFIO/PCIe
device, like the fwCfg for confiential guests.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
c8cf7ed3bc gpu: Add ColdPlug of VFIO devices with devManager
If we have a VFIO device and cold-plug is enabled
we mark each device as ColdPlug=true and let the VFIO
module do the attaching.

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
e2b5e7f73b gpu: Add Rawdevices to hypervisor
RawDevics are used to get PCIe device info early before the sandbox
is started to make better PCIe topology decisions

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Zvonko Kaiser
377ebc2ad1 gpu: Add configuration option for cold-plug VFIO
Users can set cold-plug="root-port" to cold plug a VFIO device in QEMU

Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
2023-04-26 09:47:37 +00:00
Eduardo Berrocal
9c38204f13 virtcontainers/persist: Improved test coverage 65% to 87.5%
Expanded tests on manager_test.go to cover more lines of code.

Fixes: #259

Signed-off-by: Eduardo Berrocal <eduardo.berrocal@intel.com>
2023-04-25 23:53:46 +00:00
Fabiano Fidêncio
f478b9115e clh: tdx: Update timeouts for confidential guest
Booting up TDX takes more time than booting up a normal VM.  Those
values are being already used as part of the CCv0 branch, and we're just
bringing them to the `main` branch as well.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-13 10:18:07 +02:00
Fabiano Fidêncio
3b3656d96d Merge pull request #6522 from fidencio/topic/add-tdx-artefacts-from-2023ww01-to-main
tdx: Add artefacts from the latest TDX tools release into main
2023-04-11 20:43:02 +02:00
Fabiano Fidêncio
50ce33b02d Merge pull request #6205 from fengwang666/non-root-clh
runtime: support non-root for clh
2023-04-11 19:34:00 +02:00
Fabiano Fidêncio
ed145365ec runtime/qemu: Drop "kvm-type=tdx"
This is not supported since 22ww49.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
25b3cdd38c virtcontainers: Drop check for the tdx CPU flag
In the recent kernels provided by Intel the `tdx` CPU flag is not
present anymore.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Fabiano Fidêncio
01bdacb4e4 virtcontainers: Also check /sys/firmwares/tdx for TDX
Let's make sure we also check /sys/firmwares/tdx for TDX guest
protection, as the location may depend on whether TDX Seam is being used
or not.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-04-11 15:23:42 +02:00
Pradipta Banerjee
3081cd5f8e runtime: propagate configmap/secrets etc changes for remote-hyp
For remote hypervisor, the configmap, secrets, downward-api or project-volumes are
copied from host to guest. This patch watches for changes to the host files
and copies the changes to the guest.

Note that configmap updates takes significantly longer than updates via downward-api.
This is similar across runc and Kata runtimes.

Fixes: #6341

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Julien Ropé <jrope@redhat.com>
2023-03-30 03:14:52 +00:00
Bin Liu
75987aae72 Merge pull request #6408 from jongwu/nydus_rm_hybrid
nydus: upgrad to v2.2.0
2023-03-28 11:07:56 +08:00
Megan Wright
f31c907f46 Fix bad merge 2023-03-20 13:51:10 +00:00
Megan Wright
42978f3e83 CCv0: Merge main into CCv0 branch
Merge remote-tracking branch 'upstream/main' into CCv0

Fixes: #6504
Signed-off-by: Megan Wright <megan.wright@ibm.com>
2023-03-20 13:23:49 +00:00
Hyounggyu Choi
96baa83895 agent: Bring in VFIO-AP device handling again
This PR is a continuing work for (kata-containers#3679).

This generalizes the previous VFIO device handling which only
focuses on PCI to include AP (IBM Z specific).

Fixes: kata-containers#3678
Signed-off-by: Hyounggyu Choi <Hyounggyu.Choi@ibm.com>
2023-03-16 18:14:12 +09:00
Jakob Naucke
f666f8e2df agent: Add VFIO-AP device handling
Initial VFIO-AP support (#578) was simple, but somewhat hacky; a
different code path would be chosen for performing the hotplug, and
agent-side device handling was bound to knowing the assigned queue
numbers (APQNs) through some other means; plus the code for awaiting
them was written for the Go agent and never released. This code also
artificially increased the hotplug timeout to wait for the (relatively
expensive, thus limited to 5 seconds at the quickest) AP rescan, which
is impractical for e.g. common k8s timeouts.

Since then, the general handling logic was improved (#1190), but it
assumed PCI in several places.

In the runtime, introduce and parse AP devices. Annotate them as such
when passing to the agent, and include information about the associated
APQNs.

The agent awaits the passed APQNs through uevents and triggers a
rescan directly.

Fixes: #3678
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 10:07:48 +09:00
Jakob Naucke
b546eca26f runtime: Generalize VFIO devices
Generalize VFIO devices to allow for adding AP in the next patch.
The logic for VFIOPciDeviceMediatedType() has been changed and IsAPVFIOMediatedDevice() has been removed.

The rationale for the revomal is:

- VFIODeviceMediatedType is divided into 2 subtypes for AP and PCI
- Logic of checking a subtype of mediated device is included in GetVFIODeviceType()
- VFIOPciDeviceMediatedType() can simply fulfill the device addition based
on a type categorized by GetVFIODeviceType()

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 10:06:37 +09:00
Jakob Naucke
4c527d00c7 agent: Rename VFIO handling to VFIO PCI handling
e.g., split_vfio_option is PCI-specific and should instead be named
split_vfio_pci_option. This mutually affects the runtime, most notably
how the labels are named for the agent.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2023-03-16 07:43:39 +09:00