Commit Graph

1458 Commits

Author SHA1 Message Date
David Gibson
e50b05d93c agent/pci: Add type to represent PCI addresses
Add a new pci::Address type which represents a guest PCI address in
DDDD:BB:SS.F form.

fixes #2745

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-10-08 16:52:49 +11:00
David Gibson
8528157b9b agent/pci: Extend Slot type to represent PCI function as well
pci::Slot represents a PCI slot.  However, in all cases where we use it, we
actually care about addressing a specific PCI function.  So, at the moment
we can only refer to function 0 in each slot.

Replace pci::Slot with pci::SlotFn to represent both the slot and function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-10-08 16:52:49 +11:00
Fupan Li
988eb95621 Merge pull request #2760 from liubin/fix/2759-optimize-code-for-managing-temp-users
runtime: optimize code for managing temp users for rootless mode
2021-10-08 13:49:14 +08:00
bin
bf8f582c1d runtime: optimize code for managing temp users for rootless mode
This commit does two chagnes:

- move code for managing temp users to rootless.go.
- use common function in qemu.go when shutdown the VM.

Fixes: #2759

Signed-off-by: bin <bin@hyper.sh>
2021-10-08 11:04:21 +08:00
Samuel Ortiz
08360c981d agent: Add an agent configutation file example
With all endpoints allowed.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2021-10-07 04:04:52 +02:00
Samuel Ortiz
8a4e69d237 agent: rpc: Return UNIMPLEMENTED for not allowed endpoints
From the endpoints string described through the configuration file, we
build a hash set of allowed enpoints. If a configuration files does not
include an endpoints section, we assume all endpoints are not allowed.
If there is no configuration file, then all endpoints are allowed.

Then for every ttrpc request, we check if the name of the endpoint is
part of the hashset. If it is not, then we return ttrcp::UNIMPLEMENTED.

Fixes: #1837

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-10-07 04:04:32 +02:00
Samuel Ortiz
0ea2e3af07 agent: config: Allow for building the configuration from a file
When the kernel command line includes a agent.config_file=<path> entry,
then we will try to override the default confiuguration values with the
ones we parse from a TOML file at <path>.

As the configuration file overrides the default values, we need to go
through a simplified builder that convert a set of Option<> fields into
the actual AgentConfig structure.

Fixes: #1837

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-10-07 00:37:40 +02:00
Samuel Ortiz
63539dc9fd agent: config: Add allowed endpoints
They will define the list of endpoints that an agent supports.
They're empty and non actionable for now.

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-10-07 00:37:40 +02:00
Samuel Ortiz
a953fea324 agent: config: Simplify configuration creation
We dont need a constructor and derive directly from the command line
parsing.

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-10-07 00:37:40 +02:00
Samuel Ortiz
b888edc2fc agent: config: Implement Default
A single constructor setting default value is a typical pattern for a
Default implementation.

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-10-07 00:37:40 +02:00
Bin Liu
10ec4b133c Merge pull request #2742 from liubin/fix/2741-delete-file-code
Delete file virtcontainers-setup.sh
2021-10-07 11:54:47 +08:00
Fabiano Fidêncio
4cde619c68 Merge pull request #2797 from fidencio/wip/upgrade-vendored-containerd
vendor: Update containerd to v1.5.7
2021-10-06 21:05:44 +02:00
Chelsea Mafrica
6e3fcce2a2 Merge pull request #2748 from liubin/fix/2747-add-test
runtime: Optimize func noNeedForOutput and add test cases
2021-10-06 11:24:57 -07:00
Jianyong Wu
7eac2ec786 protection: add confidential compute frame for arm
Even CCA, which is the confidential compute archtecture, has not been
ready, add a empty implementation to avoid static check error.

Fixes: #2789
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Suggested-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-10-06 15:53:36 +02:00
Jianyong Wu
8acfc154de check: fix typecheck failure in qemu_arm64_test.go
fix typecheck failure in qemu_arm64_test.go

Fixes: #2789
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-10-06 15:53:35 +02:00
Amulya Meka
5b02d54e23 virtcontainers: fix lint failure on ppc64le
Add nolint for arch specific code to exclude
from lint check.

Fixes: #2773

Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
2021-10-06 15:53:35 +02:00
Jakob Naucke
ff9728f032 virtcontainers: nolint guestProtection
Exclude from lint checking for it is ultimately only used in
architecture-specific code.

Fixes: #2273
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-10-06 15:53:35 +02:00
Jakob Naucke
5c138c8f12 runtime: Fix field alignment on s390x
Follow-up of #2237 for s390x -- field alignment isn't always minimal

Fixes: #2773
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-10-06 15:53:35 +02:00
Fabiano Fidêncio
191d001610 vendor: Update containerd to v1.5.7
Bump containerd to v1.5.7 in order to bring in a fix for CVE-2021-41103,
"insufficiently restricted permissions ons plugins directories
(https://github.com/advisories/GHSA-c2h3-6mxw-7mvq)".

dependabot found a potential security vulnerability and raised a PR to
fix it.  However, dependabot does not properly follows nor understands
the needed of our CIs (mainly related to formatting the PR and whatnot),
thus I'm re-raising it.

Fixes: #2796
Supersedes: #2787

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-10-06 10:40:43 +02:00
Eric Ernst
2bc7561561 Merge pull request #2769 from sameo/topic/agent-route
Pass the host route IP family to the guest
2021-10-05 07:20:33 -07:00
Samuel Ortiz
a44cde7e8d agent: netlink: Use the grpc IP family field when updating the route
Not all routes have either a gateway or a destination IP.
Interface routes, where the source, destination and gateway are undefined,
will default to IP v4 with the current is_ipv6() check even when they
are v6 routes.

We use the provided gRPC Route.Family field instead. This field is built
from the host netlink messages, and is a reliable way of finding out
a route's IP family.

Fixes: #2768

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2021-10-01 14:39:46 +02:00
Samuel Ortiz
71ce6cfe9e runtime: Pass the route IP family to the agent
When updating the guest routing table, we should forward the IP family
information up to the guest.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2021-10-01 14:35:17 +02:00
Samuel Ortiz
99450bd1f7 agent: protos: Add a Family field to the Route payload
Our check for the IP family is working as long as we have either a
gateway or a destination IP. Some routes are missing both.
The RT netlink messages provide the IP family information for each
route, so we can carry that piece of information up to the guest. That
will allow for a more reliable route IP family determination.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2021-10-01 14:35:17 +02:00
Samuel Ortiz
f85fe70231 runtime: vendor: Bump the netlink package dependency
We need to be able to get the IP family from the netlink route meesages,
and the Route.Family field only got recently added to the netlink
package.

The update generates static check warnings about the call for
nethandler.Delete() being deprecated in favor of a Close() call instead.
So we include the s/Delete()/Close()/ change as part of this PR.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2021-10-01 14:35:01 +02:00
Amulya Meka
e439cec7c5 cmd: fix field alignment on ppc64le
Optimising structure field alignment.

Fixes: #2779

Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
2021-10-01 11:45:27 +00:00
Amulya Meka
e5159ea755 cmd: get return value for setCPUtype
Accept and assert the return value in testSetCPUTypeGeneric.

Fixes: #2779

Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
2021-10-01 11:44:14 +00:00
James O. D. Hunt
2ce8d4263c clh: Suppress hypervisor output to make guest output visible
Reduce the cloud-hypervisor log level from `Debug` to `Info` when hypervisor
debug is enabled. This is required since `Debug` level:

- Is overkill for debugging hypervisor failures.
- Effectively hides the output from the guest kernel and userland: CLH
  generates so much output that the output from the guest gets "lost in
  the noise" (experiments show that for each full CLH debug message, at most
  1 _byte_ of guest output is displayed).

Fixes: #2726.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-09-30 14:22:09 +01:00
Jakob Naucke
8739a73dd3 Merge pull request #2736 from Amulyam24/kata-check-test
cmd: Fix mismatched types in testModuleData
2021-09-30 10:20:19 +02:00
bin
762922a521 runtime: delete func ConstraintsToVCPUs
ConstraintsToVCPUs is not used any more.

Fixes: #2741

Signed-off-by: bin <bin@hyper.sh>
2021-09-30 14:44:41 +08:00
bin
4f4854308a runtime: delete virtcontainers-setup.sh
This file is not used anymore.

Fixes: #2741

Signed-off-by: bin <bin@hyper.sh>
2021-09-30 14:44:30 +08:00
Chelsea Mafrica
96c033ba6c Merge pull request #2763 from liubin/fix/2762-update-gitignore
runtime: update .gitignore to ignore monitor_address file
2021-09-29 09:45:57 -07:00
Carlos Venegas
7183de47df Merge pull request #2766 from YchauWang/wyc-runtime-cmd
runtime: fix the make check-go-static command error
2021-09-29 10:53:02 -05:00
Bin Liu
4ac7199282 Merge pull request #2494 from rapiz1/clean-up-code
virtcontainers: clean up useless code
2021-09-29 22:56:13 +08:00
wangyongchao.bj
bb99bfb45d runtime: fix the make check-go-static command error
modify the make script of the check-go-static, changing the `./cli` path to `./cmd/kata-runtime`

Fixes: #2765

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2021-09-29 15:37:25 +08:00
David Gibson
b57613f53e Merge pull request #1682 from dgibson/rescan
Remove forced PCI rescans from agent
2021-09-29 13:03:55 +10:00
bin
870771d76d runtime: update .gitignore to ignore monitor_address file
Run tests sometimes generate pkg/containerd-shim-v2/monitor_address,
and `git status` will treat it as a new file.

Package containerd-shim-v2 has moved to pkg/containerd-shim-v2,
the monitor_address in .gitignore should be updated too.

Fixes: #2762

Signed-off-by: bin <bin@hyper.sh>
2021-09-29 09:24:14 +08:00
Fupan Li
823818cfbc Merge pull request #2744 from fengwang666/nil-bug
runtime: fix nil reference in cleanup rootless user
2021-09-28 22:43:24 +08:00
bin
18bff58487 runtime: Optimize func noNeedForOutput and add test cases
Optimize func noNeedForOutput and add test cases for this func.

Fixes: #2747

Signed-off-by: bin <bin@hyper.sh>
2021-09-28 16:58:44 +08:00
Feng Wang
e5fe53f0a9 runtime: fix nil reference in cleanup rootless user
It seems the client (crio) can send multiple requests to stop the Kata VM,
resulting a nil reference if the uid has already been cleaned up by a different thread.

Fixes #2743

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2021-09-27 21:28:47 -07:00
Francesco Giudici
2304a59601 runtime: set the sandbox storage path static
Since we now have "unix://" kind of socket returned by the
SocketAddress() function, there is no more need to build the sandbox
storage path dynamically to keep OS compatibility.

Fixes: #2738
Suggested-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
2021-09-27 15:57:34 +02:00
Francesco Giudici
315295e0ef runtime: rename GetSanboxesStoragePath() --> GetSandboxesStoragePath()
Add the missing 'd'.

Fixes: #2738
Suggested-by: Jakob Naucke <jakob.naucke@ibm.com>
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
2021-09-27 15:56:14 +02:00
Bin Liu
3217b03b17 Merge pull request #2522 from Bevisy/main-2515
virtcontainers: Fix incorrect scripts path
2021-09-27 21:14:40 +08:00
Bin Liu
39df808f6a Merge pull request #2695 from YchauWang/wyc-vc-cgroup
runtime: clear virtcontainers cgroup duplicated function
2021-09-27 21:12:39 +08:00
Amulya Meka
13e65f2ee8 cmd: Fix mismatched types in testModuleData
Rectify the values of testModuleData with the correct
types in TestCCCheckCLiFunction in kata-check_(!x86)_test.go

Fixes: #2735

Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
2021-09-27 07:17:07 +00:00
Peng Tao
05995632c3 Merge pull request #2566 from fgiudici/kata-monitor_improvements
Kata monitor: cache improvements
2021-09-27 12:29:13 +08:00
David Gibson
907459c1c1 agent/device: Don't force PCI rescans
The agent initiates a PCI rescan from two places.  One is triggered
for each virtio-blk PCI device, and one is triggered unconditionally
when we start a new container.

The PCI bus rescan code was added long time ago in Clear Containers due to
lack of ACPI support in QEMU 2.9 + q35.  Since Kata routinely plugs devices
under a PCIe-to-PCI bridge, that left SHPC as the only available hotplug
mechanism.

However, while Kata was using SHPC on the qemu side, it wasn't actually
using it on the guest side.  Due to a quirk of our guest kernel
configuration, the SHPC driver never bound to the bridge, and *no* hotplug
was working at all.  To work around that, Kata was forcing the rescan
manually, which would discover the new device.  That was very fragile (we
were arguably relying on a kernel bug).  Even if we were using SHPC
propertly, it includes a mandatory 5s delay during plug operations
(designed for physical cards and human operators), which makes it
unsuitable quick start up.

Worse, the forced PCI rescans could race with either SHPC or PCIe native
hotplug sequences, causing several problems.  In some cases this could put
the device into an entirely broken state where it wouldn't respond to
config space accesses at all.

Since pull request #2323 was merged, we have instead used ACPI hotplug
which is both fast, and more solid in terms of semantics and races.  So,
the forced PCI rescans are no longer necessary.  Remove them all.

fixes #683

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
75f426dd1e agent: Simplify do_add_swap()
do_add_swap() has some mildly complex code to translate the PCI path of
a virtio-blk device (where the swap will reside) into a /dev path. However,
the device module already has get_virtio_blk_pci_device_name() which does
exactly that.  The existing code has some further advantages: it uses
more precise matching of the sysfs paths, and if necessary it will wait for
the device to be added to the guest.

While we're there, remove an unnecessary 'as u8' from the PCI path
construction: pci::Path::new() already accepts anything which implements
TryInfo<u8>, which u32 certainly does.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
aad1a8734f runtime/device: Give the agent information about VFIO devices
We send information about several kinds of devices to the agent so
that it can apply specific handling.  We don't currently do this with
VFIO devices.  However we need to do that so that the agent can
properly wait for VFIO devices to be ready (previously it did that
using a PCI rescan which may not be reliable and has some very bad
side effects).

This patch collates and sends the relevant information.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
ebd7b61884 runtime: Don't repeat GetDeviceByID between appendDevices() and append*()
Both appendBlockDevice and appendVhostUserBlkDevice start by using
GetDeviceByID to lookup the api.Device object corresponding to their
ContainerDevice object.  However their common caller, appendDevices() has
already done this.

This changes it so the looked up api.Device is passed to the individual
append*Device() functions.  This slightly reduces duplicated work, but more
importantly it makes it clearer that append*Device() don't need to check
for a nil result from GetDeviceByID, since the caller has already done
that.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
ad45c52fbe runtime/device: Record guest PCI path for VFIO devices
For several device types which correspond to a PCI device in the guest
we record the device's PCI path in the guest.  We don't currently do
that for VFIO devices, but we're going to need to for better handling
of SR-IOV devices.

To accomplish this, we have to determine the guest PCI path from the
information the VMM gives us:

For qemu, we query the slot of the device and its bridge from QMP.

For cloud-hypervisor, the device add interface gives us a guest PCI
address.  In fact this represents a design error in the clh API -
there's no way it can really know the guest PCI address in general.
It works in this case, because clh doesn't use PCI bridges, so the
device will always be on the root bus.  Based on that, the PCI path is
simply the device's slot number.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00