Commit Graph

787 Commits

Author SHA1 Message Date
Julio Montes
5d2a82fbf9 Merge pull request #2323 from dgibson/acpi-pcihp
Replace SHPC with ACPI PCI hotplug for Kata guests
2021-09-23 09:55:31 -05:00
Fabiano Fidêncio
0ececc630f Merge pull request #2666 from cmaf/tracing-newContainer-logger
runtime: tracing: Fix logger passed in newContainer
2021-09-23 13:07:19 +02:00
Fabiano Fidêncio
e33c26ba18 Merge pull request #2622 from YchauWang/wyc-vc-api
virtcontainers: update VC SandboxConfig API add SandboxBindMounts field
2021-09-23 13:05:33 +02:00
Fabiano Fidêncio
47170e302a Merge pull request #2616 from Bevisy/main-2615
sandbox: Allow the device to be accessed,such as /dev/null and /dev/u…
2021-09-23 13:04:18 +02:00
David Gibson
8bbcb06af5 qemu: Disable SHPC hotplug
Under certain circumstances[0] Kata will attempt to use SHPC hotplug
for PCI devices on the guest.  In fact we explicitly enable SHPC on
our PCI to PCI bridges, regardless of the qemu default.

SHPC was designed a long, long time ago for physical hotplugging and
works very poorly for a virtual environment. In particular it has a
mandatory 5s delay to allow a (real, human) operator to back out the
operation if they press a button by mistake. This alone makes it
unusable for a fast start up application like Kata.

Worse, the agent forces a PCI rescan during startup.  That will race
with the SHPC hotplug operation causing the device to go into a bad
state where config space can't be accessed from the guest at all.

The only reason we've sort of gotten away with this is that our
default guest kernel configuration triggers what's arguably a kernel
bug effectively disabling SHPC.  That makes the agent rescan the only
reason we see the new device.

Now that we require a qemu >=6.1, which includes ACPI PCI hotplug on
the q35 machine, we can explicitly disable SHPC in all cases.  It's
nothing but trouble.

fixes #2174

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-23 10:27:26 +10:00
David Gibson
cc4983eeac runtime: Remove unused qemuArchBase.appendBridges definition
qemuArchBase.appendBridges is never actually used, because the bare
qemuArchBase type is itself never used (outside of unit tests).  Instead
*all* the subclasses of qemuArchBase override appendBridges() to call
the very similar, but not identical genericAppendBridges.  So, we can
remove the qemuArchBase.appendBridges implementation.

Furthermore, all those subclasses override appendBridges() in exactly
the same way, and so we can remove *those* definitions and replace the
base class qemuArchBase appendBridges() with that version, calling
genericAppendBridges().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-23 10:15:08 +10:00
David Gibson
e248de4616 vendor: Update govmm
Update to commit 1b60b536f3, in particular to get extensions to
allow IO and memory window reservations to be set on PCI bridges.

https://github.com/kata-containers/govmm/pull/201

Git log:

de039da govmm/qemu: Let IO/memory reservations be specified for bridge devices

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-23 10:14:29 +10:00
Fabiano Fidêncio
32c3fb71f2 Merge pull request #2546 from fengwang666/rootless-qemu-doc
docs: documentation for running non-root VMM
2021-09-21 22:45:33 +02:00
Fabiano Fidêncio
2bee8bc6bd Merge pull request #2432 from fengwang666/qemu-rootless
runtime: run the QEMU VMM process with a non-root user
2021-09-21 21:37:02 +02:00
Feng Wang
305afc8b70 docs: documentation for running non-root VMM
Documentation for running non-root QEMU VMM in Kata runtime

Fixes: #2545

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2021-09-21 11:20:37 -07:00
Samuel Ortiz
3a4aca4d67 Merge pull request #2671 from YchauWang/wyc-runtime-config
runtime: update .gitignore file cleare the vc shim config
2021-09-21 15:15:09 +02:00
Feng Wang
9a6d56f1ab runtime: fix empty cgroup path validation error
An empty cgroup path shouldn't fail cgroup creation

Fixes #2674

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2021-09-20 13:48:09 -07:00
Chelsea Mafrica
077b77c178 runtime: tracing: Fix logger passed in newContainer
Change logger in Trace call in newContainer from sandbox.Logger() to
nil. Passing nil will cause an error to be logged by kataTraceLogger
instead of the sandbox logger, which will avoid having the log message
report it as part of the sandbox subsystem when it is part of the
container subsystem.

The kataTraceLogger will not log it as related to the container
subsystem, but since the container logger has not been created at this
point, and we already use the kataTraceLogger in other instances where a
subsystem's logger has not been created yet, this PR makes the call
consistent with other code.

Fixes #2665

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2021-09-17 11:41:04 -07:00
Feng Wang
1cfe59304d runtime: Run QEMU using a non-root user/group
A random generated user/group is used to start QEMU VMM process.
The /dev/kvm group owner is also added to the QEMU process to grant it access.

Fixes #2444

Signed-off-by: Feng Wang <feng.wang@databricks.com>
2021-09-17 11:28:44 -07:00
wangyongchao.bj
fd98373850 runtime: update .gitignore file cleare the vc shim config
update .gitignore file, remove the follow configurations:
/virtcontainers/shim/mock/cc-shim/cc-shim
/virtcontainers/shim/mock/kata-shim/kata-shim
/virtcontainers/shim/mock/shim

Fixes: #2670

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2021-09-17 15:25:28 +08:00
Hui Zhu
fff82b4ef5 Merge pull request #2628 from bergwolf/runtime-reorg
runtime: refactor commandline code directory
2021-09-17 10:37:22 +08:00
Chelsea Mafrica
6159ef3499 Merge pull request #2626 from YchauWang/wyc-vc-api02
virtcontainers: update VC HypervisorConfig API add three lost fields
2021-09-16 16:46:27 -07:00
Peng Tao
067c44d0b6 runtime: fix UT build failure
storeContainer has been removed.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-09-16 19:42:02 +08:00
Peng Tao
e7c42fbc76 runtime: unify generated config
We don't need to maintain two generated config.go and even have
duplicates between them.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-09-16 17:19:18 +08:00
Peng Tao
4f7cc18622 runtime: refactor commandline code directory
Move all command line code to `cmd` and move containerd-shim-v2 to pkg.

Fixes: #2627
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-09-16 17:19:18 +08:00
Samuel Ortiz
7bf96d2457 Merge pull request #2604 from Amulyam24/container_tests
virtcontainers: add unit tests for container.go
2021-09-16 11:02:16 +02:00
Bo Chen
d00decc97d runtime: clh: Enable hugepages support
This patch adds the configuration option that allows to use hugepages
with Cloud Hypervisor guests.

Fixes: #2648

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-15 10:43:57 -07:00
David Gibson
64bb803fcf runtime/qemu: Move from query-cpus to query-cpus-fast
We recently updated to using qemu-6.1 (from qemu 5.2).  Unfortunately one
breaking change in qemu 6.0 wasn't caught by the CI.

The query-cpus QMP command has been removed, replaced by query-cpus-fast
(which has been available since qemu 2.12).  govmm already had support for
query-cpus-fast, we just weren't using it, so the change is quite easy.

fixes #2643

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-15 16:41:26 +10:00
Samuel Ortiz
4b7e4a4c70 runtime: Vendoring update
Due to the libcontainer dependencies removal.

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-09-14 07:09:34 +02:00
Samuel Ortiz
9bed2ade0f virtcontainers: Convert to the new cgroups package API
The new API is based on containerd's cgroups package.
With that conversion we can simpligy the virtcontainers sandbox code and
also uniformize our cgroups external API dependency. We now only depend
on containerd/cgroups for everything cgroups related.

Depends-on: github.com/kata-containers/tests#3805
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-09-14 07:09:34 +02:00
Samuel Ortiz
b42ed39349 virtcontainers: cgroups: Add a containerd API based cgroups package
Eventually, we will convert the virtcontainers and the whole Kata
runtime code base to only rely on that package.

This will make Kata only depends on the simpler containerd cgroups API.

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-09-14 07:09:34 +02:00
Samuel Ortiz
f17752b0dc virtcontainers: container: Do not create and manage container host cgroups
The only process we are adding there is the container host one, and
there is no such thing anymore.

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-09-14 07:09:33 +02:00
Samuel Ortiz
dc7e9bce73 virtcontainers: sandbox: Host cgroups partitioning
This is a simplification of the host cgroup handling by partitioning the
host cgroups into 2: A sandbox cgroup and an overhead cgroup.

The sandbox cgroup is always created and initialized. The overhead
cgroup is only available when sandbox_cgroup_only is unset, and is
unconstrained on all controllers. The goal of having an overhead cgroup
is to be more flexible on how we manage a pod overhead. Having such
cgroup will allow for setting a fixed overhead per pod, for a subset of
controllers, while at the same time not having the pod being accounted
for those resources.

When sandbox_cgroup_only is not set, we move all non vCPU threads
to the overhead cgroup and let them run unconstrained. When it is set,
all pod related processes and threads will run in the sandbox cgroup.

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-09-14 07:09:29 +02:00
Samuel Ortiz
f811026c77 virtcontainers: Unconditionally create the sandbox cgroup manager
Regardless of the sandbox_cgroup_only setting, we create the sandbox
cgroup manager and set the sandbox cgroup path at the same time.

Without doing this, the hypervisor constraint routine is mostly a NOP as
the sandbox state cgroup path is not initialized.

Fixes #2184

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-09-14 07:05:57 +02:00
wangyongchao.bj
a6066404f7 virtcontainers: update VC HypervisorConfig API add three lost fields
Sync the virtcontainers api.md document, add `ConfidentialGuest` `EntropySourceList` `GuestSwap` three
 fields to the HypervisorConfig API.

Fixes #2625

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2021-09-14 10:42:54 +08:00
wangyongchao.bj
bb18cd475c virtcontainers: update VC SandboxConfig API add SandboxBindMounts field
sync the virtcontainers api.md document, add SandboxBindMounts field to the SandboxConfig API.
And update the order of the SandboxConfig API fields.

Fixes #2621

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2021-09-14 09:56:47 +08:00
Eric Ernst
967db0cbcc Merge pull request #2544 from likebreath/0831/upgrade_clh_v18.0
versions: Upgrade to Cloud Hypervisor v18.0
2021-09-13 11:27:45 -07:00
Fabiano Fidêncio
9381f23ccf Merge pull request #2613 from sameo/topic/runtime-readme
runtime: Fix README link
2021-09-13 17:44:56 +02:00
Binbin Zhang
58e77a3c13 sandbox: Allow the device to be accessed,such as /dev/null and /dev/urandom
If the device has no permission, such as /dev/null, /dev/urandom,
it needs to be added into cgroup.

Fixes: #2615

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2021-09-13 20:47:16 +08:00
Samuel Ortiz
75ef8c243a Merge pull request #2603 from Bevisy/main-2539
sandbox: Add device permissions such as /dev/null to cgroup
2021-09-13 11:04:51 +02:00
Samuel Ortiz
13b8bb0c74 runtime: Fix README link
The LICENSE file lives in the project's root.

Fixes #2612

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2021-09-11 09:44:40 +02:00
Anastassios Nanos
62baa48ef5 virtcontainers: fc: parse vcpuID correctly
In getThreadIDs(), the cpuID variable is derived from a string that
already contains a whitespace. As a result, strings.SplitAfter returns
the cpuID with a leading space. This makes any go variant of string to int
fail (strconv.ParseInt() in our case). This patch makes sure that the
leading space character is removed so the string passed to
strconv.ParseInt() is "CPUID" and not " CPUID".

This has been caused by a change in the naming scheme of vcpu threads
for Firecracker after v0.19.1.

Fixes: #2592

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2021-09-10 09:39:56 +00:00
Bo Chen
f785ff0bf2 virtcontainers: clh: Revert the workaround incorrect default values
Given the fix to the bugs of the openapi spec file is included in the
Cloud Hypervisor v18.0 [1], this patch reverts the workaround we carried
in the CLH driver.

This reverts commit 932ee41b3f.

[1] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3029

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-09 14:52:53 -07:00
Bo Chen
0e0e59dc5f virtcontainers: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v18.0.
Note: The client code of cloud-hypervisor's (CLH) OpenAPI is
automatically generated by openapi-generator [1-2].

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-09-09 14:51:55 -07:00
Amulyam24
d865c80986 virtcontainers: add unit tests for container.go
Fixes: #268

Signed-off-by: Amulyam24 <amulmek1@in.ibm.com>
2021-09-09 13:09:38 +05:30
Binbin Zhang
71f915c63f sandbox: Add device permissions such as /dev/null to cgroup
adds the default devices for unix such as /dev/null, /dev/urandom to
the container's resource cgroup spec

Fixes: #2539

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2021-09-09 15:33:24 +08:00
bin
2abc450a4d test: enable running tests under root user
Add tests that run under root user to test special cases.

Fixes: #2446

Signed-off-by: bin <bin@hyper.sh>
2021-09-09 14:21:34 +08:00
Julio Montes
9bbaa66f39 Merge pull request #2480 from Bevisy/main
makefile: Fix error exit status code
2021-09-06 07:28:15 -05:00
Bin Liu
103fdd3f6c Merge pull request #2564 from Bevisy/main-2296
virtcontainers: Remove NewStoreFeature
2021-09-03 10:41:21 +08:00
James O. D. Hunt
f3a1bf3b45 Merge pull request #2552 from bergwolf/license
license: drop redundent license files
2021-09-02 14:31:18 +01:00
Binbin Zhang
e2a9e78c9e virtcontainers: Remove NewStoreFeature
remove NewStoreFeature

Fixes: #2296

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2021-09-02 21:28:36 +08:00
Peng Tao
256c3b2747 license: drop redundent license files
There is no need to keep multiple copies of the license file in
different directory. We can just use the top level one for the project.

Fixes: #2553
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-09-01 15:10:04 +08:00
Hui Zhu
bcc9fa3b35 hotplugAddBlockDevice: Use ExecuteBlockdevAddWithDriverCache with swap
Use ExecuteBlockdevAddWithDriverCache with swap in
hotplugAddBlockDevice to handle swap file cannot work OK with
ExecuteBlockdevAddWithCache issue.

Fixes: #2548

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-09-01 14:13:11 +08:00
Hui Zhu
bd85da0461 vendor: Update vendor/github.com/kata-containers/govmm
Update vendor/github.com/kata-containers/govmm for
ExecuteBlockdevAddWithDriverCache.

Fixes: #2548

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-09-01 13:59:19 +08:00
Peng Tao
c0daa4ebff Merge pull request #2513 from cmaf/tracing-tracingtags-consistency
tracing: Change runtime tracing tags to vars
2021-08-31 10:25:10 +08:00