486 Commits

Author SHA1 Message Date
Jianyong Wu
0366f6e817 template: disable template unit test on arm
Template is broken on arm. here we disable the template unit test
temporarily.

Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-11-05 11:34:35 +08:00
Jianyong Wu
7cb650abcf runtime: DefaultMaxVCPUs should not greater than defaultMaxQemuVCPUs
DefaultMaxVCPUs may be larger than the defaultMaxQemuVCPUs that should
be checked and avoided.

Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-11-05 11:34:22 +08:00
Bo Chen
b794a39401 virtcontainers: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v19.0.
Note: The client code of cloud-hypervisor's (CLH) OpenAPI is
automatically generated by openapi-generator [1-2].

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 8030b6caf0)
2021-10-22 16:39:03 -07:00
Bo Chen
eea2c0195f virtcontainers: clh: Use 'quiet' as the default kernel parameter
The 'quiet' kernel parameter can avoid guest kernel logs while booting,
which can reduce boot time.

Fix: #2820

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 7b2bfd4eca)
2021-10-14 08:53:50 +02:00
Bo Chen
1e798b96fd virtcontainers: clh: Turn-off serial and virtio-console by default
We will need to have console output from the guest only for debugging
purposes. As a result, we can turn-off both the serial and
virtio-console devices by default for better boot time.

Fixes: #2820

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 3e24e46c70)
2021-10-14 08:53:44 +02:00
Samuel Ortiz
893623dfbc runtime: Pass the route IP family to the agent
When updating the guest routing table, we should forward the IP family
information up to the guest.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
(cherry picked from commit 71ce6cfe9e)
2021-10-14 08:53:06 +02:00
Samuel Ortiz
503ce9c154 agent: protos: Add a Family field to the Route payload
Our check for the IP family is working as long as we have either a
gateway or a destination IP. Some routes are missing both.
The RT netlink messages provide the IP family information for each
route, so we can carry that piece of information up to the guest. That
will allow for a more reliable route IP family determination.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
(cherry picked from commit 99450bd1f7)
2021-10-14 08:53:01 +02:00
Samuel Ortiz
9932e76f27 runtime: vendor: Bump the netlink package dependency
We need to be able to get the IP family from the netlink route meesages,
and the Route.Family field only got recently added to the netlink
package.

The update generates static check warnings about the call for
nethandler.Delete() being deprecated in favor of a Close() call instead.
So we include the s/Delete()/Close()/ change as part of this PR.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
(cherry picked from commit f85fe70231)
2021-10-14 08:52:47 +02:00
Jianyong Wu
1f6b0f651e protection: add confidential compute frame for arm
Even CCA, which is the confidential compute archtecture, has not been
ready, add a empty implementation to avoid static check error.

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Suggested-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-10-13 11:06:42 +02:00
Jianyong Wu
112e0f6381 check: fix typecheck failure in qemu_arm64_test.go
fix typecheck failure in qemu_arm64_test.go

Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-10-13 11:06:42 +02:00
Amulya Meka
18820e31d9 virtcontainers: fix lint failure on ppc64le
Add nolint for arch specific code to exclude
from lint check.

Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
2021-10-13 11:06:42 +02:00
Jakob Naucke
8fafced9ff virtcontainers: nolint guestProtection
Exclude from lint checking for it is ultimately only used in
architecture-specific code.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-10-13 11:06:41 +02:00
Bl1tz23
79e0754a7b fc: fix version parsing for fc >= 0.25
Allows to use firecracker version >=0.25.

Fixes: #2471

Signed-off-by: Bl1tz23 <alex3angle@gmail.com>
(cherry picked from commit 87bbae1bd7)
2021-10-06 17:27:22 +02:00
Fabiano Fidêncio
e58fabfc20 Merge pull request #2598 from c3d/backport/2589-virtiofsd-perms-perms
stable-2.2 | virtiofs: Create shared directory with 0700 mode, not 0750
2021-09-24 09:16:59 +02:00
Peng Tao
feb06dad8a Merge pull request #2623 from Bevisy/stable-2.2-2615-bp
[backport]sandbox: Allow the device to be accessed,such as /dev/null and /dev/u…
2021-09-24 14:04:36 +08:00
Chelsea Mafrica
83f219577d Merge pull request #2668 from cmaf/tracing-newContainer-logger-bp-2.2
stable-2.2 | runtime: tracing: Fix logger passed in newContainer
2021-09-23 09:58:14 -07:00
Chelsea Mafrica
484af1a559 Merge pull request #2678 from nubificus/stable-2.2-fix_fc_vcpu_thread
stable-2.2 | virtcontainers: fc: parse vcpuID correctly
2021-09-20 09:46:07 -07:00
Snir Sheriber
2ca867da7b runtime: Add container field to logs
and unified field naming

Signed-off-by: Snir Sheriber <ssheribe@redhat.com>

Backport from commit 0c7789fad6
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2021-09-20 11:04:09 +02:00
Christophe de Dinechin
25c7e1181a virtiofs: Create shared directory with 0700 mode, not 0750
A discussion on the Linux kernel mailing list [1] exposed that virtiofsd makes a
core assumption that the file systems being shared are not accessible by any
non-privileged user. We currently create the `shared` directory in the sandbox
with the default `0750` permissions, which gives read and directory traversal
access to the group. There is no real good reason for a non-root user to access
the shared directory, and this is potentially dangerous.

Fixes: #2589

[1]: https://lore.kernel.org/linux-fsdevel/YTI+k29AoeGdX13Q@redhat.com/

Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2021-09-20 10:54:18 +02:00
Anastassios Nanos
4c5bf0576b virtcontainers: fc: parse vcpuID correctly
In getThreadIDs(), the cpuID variable is derived from a string that
already contains a whitespace. As a result, strings.SplitAfter returns
the cpuID with a leading space. This makes any go variant of string to int
fail (strconv.ParseInt() in our case). This patch makes sure that the
leading space character is removed so the string passed to
strconv.ParseInt() is "CPUID" and not " CPUID".

This has been caused by a change in the naming scheme of vcpu threads
for Firecracker after v0.19.1.

Fixes: #2592

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2021-09-18 08:10:13 +00:00
Chelsea Mafrica
b3e620dbcf runtime: tracing: Fix logger passed in newContainer
Change logger in Trace call in newContainer from sandbox.Logger() to
nil. Passing nil will cause an error to be logged by kataTraceLogger
instead of the sandbox logger, which will avoid having the log message
report it as part of the sandbox subsystem when it is part of the
container subsystem.

The kataTraceLogger will not log it as related to the container
subsystem, but since the container logger has not been created at this
point, and we already use the kataTraceLogger in other instances where a
subsystem's logger has not been created yet, this PR makes the call
consistent with other code.

Backport of #2666
Fixes #2667

Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2021-09-16 16:30:29 -07:00
Binbin Zhang
56920bc943 sandbox: Allow the device to be accessed,such as /dev/null and /dev/urandom
If the device has no permission, such as /dev/null, /dev/urandom,
it needs to be added into cgroup.

Fixes: #2615
Backport: #2616

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2021-09-14 10:33:49 +08:00
Bo Chen
a1874ccd62 virtcontainers: clh: Revert the workaround incorrect default values
Given the fix to the bugs of the openapi spec file is included in the
Cloud Hypervisor v18.0 [1], this patch reverts the workaround we carried
in the CLH driver.

This reverts commit 932ee41b3f.

[1] https://github.com/cloud-hypervisor/cloud-hypervisor/pull/3029

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit f785ff0bf2)
2021-09-13 14:17:58 -07:00
Bo Chen
c2c650500b virtcontainers: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v18.0.
Note: The client code of cloud-hypervisor's (CLH) OpenAPI is
automatically generated by openapi-generator [1-2].

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 0e0e59dc5f)
2021-09-13 14:17:58 -07:00
Binbin Zhang
807cc8a3a5 sandbox: Add device permissions such as /dev/null to cgroup
adds the default devices for unix such as /dev/null, /dev/urandom to
the container's resource cgroup spec

Fixes: #2539
Backports: #2603

Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
2021-09-10 17:33:26 +08:00
Peng Tao
0bdfdad236 runtime: drop qemu-lite support
As the project is not maintained and we have not been testing against it
for a long time.

Fixes: #2529
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-08-31 10:17:06 +08:00
Bo Chen
938b01aedc virtcontainers: clh: Workaround incorrect default values
Two default values defined in the 'cloud-hypervisor.yaml' have typo, and this
patch manually overwrites them with the correct value as a workaround
before the corresponding fix is landed to Cloud Hypervisor upstream.

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 932ee41b3f)
2021-08-27 13:37:47 -07:00
Bo Chen
abd708e814 virtcontainers: clh: Fix the unit test
This patch fixes the unit tests over clh.go with the updated client code.

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit bff38e4f4d)
2021-08-27 13:37:47 -07:00
Bo Chen
61babd45ed virtcontainers: clh: Use constructors to ensure proper default value
With the updated openapi-generator, the client code now handles optional
attributes correctly, and ensures to assign the right default
values. This patch enables to use those constructors to make sure the
proper default values being used.

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit d967d3cb37)
2021-08-27 13:37:47 -07:00
Bo Chen
59c51f6201 virtcontainers: clh: Migrate to use the updated client APIs
The client code (and APIs) for Cloud Hypervisor has been changed
dramatically due to the upgrade to `openapi-generator` v5.2.1. This
patch migrate the Cloud Hypervisor driver in the kata-runtime to use
those updated APIs.

The main change from the client code is that it now uses "pointer" type
to represent "optional" attributes from the input openapi specification
file.

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit a6a2e525de)
2021-08-27 13:37:47 -07:00
Bo Chen
c1f260cc40 virtcontainers: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor with the
updated `openapi-generator` v5.2.1.

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 46eb07e14f)
2021-08-27 13:37:47 -07:00
Bo Chen
4cd6909f18 virtcontainers: clh: Upgrade to the openapi-generator v5.2.1
To improve the quality and correctness of the auto-generated code, this
patch upgrade the `openapi-generator` to its latest stable release
v5.2.1.

Fixes: #2487

Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 80fba4d637)
2021-08-27 13:37:47 -07:00
wangyongchao.bj
99ab91df3d docs: update the docs project url from kata 1.x to 2.x
changed the document project url in the using-vpp-and-kata.md and
runtime experimental README.md files.

Fixes: #2418

Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
2021-08-10 13:51:54 +08:00
Anastassios Nanos
64dd35ba4f virtcontainers: fc: properly remove jailed block device
When running a firecracker instance jailed, block devices
are not removed correctly, as the jailerRoot path is not
stripped from the PATCH command sent to the FC API.

This patch differentiates the jailed case from the non-jailed
one and allows the firecracker instance to be properly
terminated.

Fixes #2387

Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
2021-08-04 16:31:56 +00:00
David Gibson
3165095669 runtime/qemu: Use explicit "on" for kernel_irqchip parameter
Kata uses the 'kernel_irqchip' machine option to qemu.  By default it
uses it in what qemu calls the "short-form boolean" with no parameter.
That style was deprecated by qemu between 5.2 and 6.0 (commit
ccd3b3b8112b) and effectively removed entirely between 6.0 and 6.1
(commit d8fb7d0969d5).

Update ourselves for newer qemus by using an explicit
"kernel_irqchip=on".

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-08-04 14:34:11 +10:00
Carlos Venegas
27b9a68189 Merge pull request #2365 from sameo/topic/clh-tracing
virtcontainers: clh: Do not use the default HTTP client
2021-08-03 12:54:09 -05:00
Hui Zhu
e6408fe670 Container: Add initConfigResourcesMemory and call it in newContainer
The swappiness is not right if just set
io.katacontainers.container.resource.swappiness:
$ pod_yaml=pod.yaml
$ container_yaml=container.yaml
$ image="quay.io/prometheus/busybox:latest"
$ cat << EOF > "${pod_yaml}"
metadata:
  name: busybox-sandbox1
EOF
$ cat << EOF > "${container_yaml}"
metadata:
  name: busybox-killed-vmm
annotations:
  io.katacontainers.container.resource.swappiness: "100"
image:
  image: "$image"
command:
- top
EOF
$ sudo crictl pull $image
$ podid=$(sudo crictl runp $pod_yaml)
$ cid=$(sudo crictl create $podid $container_yaml $pod_yaml)
$ sudo crictl start $cid
crictl exec $cid cat /sys/fs/cgroup/memory/memory.swappiness
60

The cause of this issue is there are two elements store the resources
infomation.  They are c.config.Resources for calculateSandboxMemory and
c.GetPatchedOCISpec() for agent.
This add initConfigResourcesMemory to Container and call it in
newContainer to handle the issue.

Fixes: #2372

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-08-02 16:02:12 +08:00
Fupan Li
fdc42ca7ff Merge pull request #2324 from jongwu/ro_nv
qemu/arm: remove nvdimm/"ReadOnly" option on arm64
2021-08-02 14:14:06 +08:00
Hui Zhu
ee90affc18 newContainer: Initialize c.config.Resources.Memory if it is nil
container start fail if
io.katacontainers.container.resource.swap_in_bytes and
memory_limit_in_bytes are not set.
$ pod_yaml=pod.yaml
$ container_yaml=container.yaml
$ image="quay.io/prometheus/busybox:latest"
$ cat << EOF > "${pod_yaml}"
metadata:
  name: busybox-sandbox1
EOF
$ cat << EOF > "${container_yaml}"
metadata:
  name: busybox-killed-vmm
annotations:
  io.katacontainers.container.resource.swappiness: "60"
image:
  image: "$image"
command:
- top
EOF
$ sudo crictl pull $image
$ podid=$(sudo crictl runp $pod_yaml)
$ cid=$(sudo crictl create $podid $container_yaml $pod_yaml)
$ sudo crictl start $cid
DEBU[0000] get runtime connection
DEBU[0000] connect using endpoint
'unix:///var/run/containerd/containerd.sock' with '10s' timeout
DEBU[0000] connected successfully using endpoint:
unix:///var/run/containerd/containerd.sock
DEBU[0000] StartContainerRequest:
&StartContainerRequest{ContainerId:4fea91d16f661931fe33acd247efe831ef9e571588ba18b5a16f04c278fd61b8,}
DEBU[0000] StartContainerResponse: nil
FATA[0000] starting the container
"4fea91d16f661931fe33acd247efe831ef9e571588ba18b5a16f04c278fd61b8": rpc
error: code = Unknown desc = failed to create containerd task: failed to
create shim: ttrpc: closed: unknown

The cause of fail if if c.config.Resources.Memory is nil, values of
io.katacontainers.container.resource.swappiness and
io.katacontainers.container.resource.swap_in_bytes will be store in
newContainer.

This commit initialize c.config.Resources.Memory if it is nil in
newContainer.

Fixes: #2367

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-08-01 10:03:27 +08:00
Hui Zhu
767a41ce56 updateResources: Log result after calculateSandboxMemory
Log result after calculateSandboxMemory in updateResources.

Fixes: #2367

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-08-01 09:57:44 +08:00
Samuel Ortiz
760ec4e58a virtcontainers: clh: Do not use the default HTTP client
When enabling tracing with Cloud Hypervisor, we end up establishing 2
connections to 2 different HTTP servers: The Cloud Hypervisor API one
that runs over a UNIX socket and the Jaeger endpoint running over UDP.

Both connections use the default HTTP golang client instance, and thus
share the same transport layer. As the Cloud Hypervisor implementation
sets it up to be over a Unix socket, the jaeger uploader ends up going
through that transport as well, and sending its spans to the Cloud
Hypervisor API server.

We fix that by giving the Cloud Hypervisor implementation its own HTTP
client instance and we avoid sharing it with anything else in the shim.

Fixes #2364

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-07-30 16:51:01 +02:00
James O. D. Hunt
4f0726bc49 docs: Remove table of contents
Removed all TOCs now that GitHub auto-generates them.

Also updated the documentation requirements doc removing the requirement
to add a TOC.

Fixes: #2022.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-07-30 10:58:22 +01:00
Bo Chen
cc0bb9aebc versions: Upgrade to Cloud Hypervisor v17.0
Highlights from the Cloud Hypervisor release v17.0: 1) ARM64 NUMA
support using ACPI; 2) `Seccomp` support for MSHV backend; 3) Hotplug of
macvtap devices; 4) Improved SGX support; 5) Inflight tracking for
`vhost-user` devices; 6) Bug fixes.

Details can be found: https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v17.0

Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by `openapi-generator` [1-2]. As the API changes do not
impact usages in Kata, no additional changes in kata's runtime are
needed to work with the current version of cloud-hypervisor.

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Fixes: #2333

Signed-off-by: Bo Chen <chen.bo@intel.com>
2021-07-27 11:56:29 -07:00
Jianyong Wu
77604de80b qemu/arm: remove nvdimm/"ReadOnly" option on arm64
There is a new "ReadOnly" option added to nvdimm device in qemu
and now added to kata. However, qemu used for arm64 is a little
old and has no this feature. Here we remove this feature for arm.

Fixes: #2320
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
2021-07-27 20:32:55 +08:00
Gabriela Cervantes
4fbae549e4 docs: Update experimental documentation
This PR updates the experimental documentation with the proper reference
to kata 2.x

Fixes #2317

Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
2021-07-26 20:29:21 +00:00
Fabiano Fidêncio
116c29c897 cgroups: manager's Set() now takes Resources as its parameter
Pior our bump to runc 1.0.1 the manager's Set() would take a Config as
its parameter.  Now it takes the Resources directly.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-07-26 11:34:27 +02:00
Fabiano Fidêncio
c0f801c0c4 rootless: RunningInUserNS() is now part of userns namespace
Previously part of the "system" namespace, the RunningInUserNS() has
been moved to the "userns" namespace.

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-07-26 11:34:23 +02:00
Julio Montes
2859600a6f runtime: virtcontainers: make rootfs image read-only
Improve security by making rootfs image read-only, nobody
will be able to modify it from the guest.

fixes #1916

Signed-off-by: Julio Montes <julio.montes@intel.com>
2021-07-23 13:20:42 -05:00
Julio Montes
aec530904b runtime: virtcontainers/utils: fix govet fieldalignment
Fix structures alignment

Signed-off-by: Julio Montes <julio.montes@intel.com>
2021-07-20 11:59:15 -05:00
Julio Montes
1e4f7faa77 runtime: virtcontainers/types: fix govet fieldalignment
Fix structures alignment

Signed-off-by: Julio Montes <julio.montes@intel.com>
2021-07-20 11:59:15 -05:00