Commit Graph

1041 Commits

Author SHA1 Message Date
Fabiano Fidêncio
6f9685fbf5 Merge pull request #3624 from mdlayher/mdl-vsock
runtime: use github.com/mdlayher/vsock@v1.1.0
2022-02-16 23:11:47 +01:00
Fabiano Fidêncio
be2e90469a Merge pull request #3669 from fidencio/wip/virtiofsd-use-announce-submounts
virtiofsd: Use "-o announce_submounts"
2022-02-16 16:43:18 +01:00
bin
81a8baa5e5 runtime: add hugepages support
Add hugepages support, port from:
b486387cba

Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: bin <bin@hyper.sh>
2022-02-16 15:14:53 +08:00
bin
7df677c01e runtime: Update calculateSandboxMemory to include Hugepages Limit
Support hugepages and port from:
96dbb2e8f0

Fixes: #3342

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: bin <bin@hyper.sh>
2022-02-16 15:14:37 +08:00
Fabiano Fidêncio
4bd945b67b virtiofsd: Use "-o announce_submounts"
German Maglione, one of the current virtio-fs developers, has brought to
our attention that using "announce-submounts" could help us to prevent
inode number collisions.

This feature was introduced a year ago or so by Hanna Reitz as part of
the 08dce386e77eb9ab044cb118e5391dc9ae11c5a8, and as we already mandate
QEMU >= 6.1.0, let's take advantage of that.

Fixes: #3507

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-15 08:52:03 +01:00
Fabiano Fidêncio
90fd625d0c versions: Udpate Cloud Hypervisor to 55479a64d237
Let's update cloud-hypervisor to a version that exposes the TDx support
via the OpenAPI's auto-generated code.

Fixes: #3663

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-02-14 17:32:30 +01:00
James O. D. Hunt
8f80dffead Merge pull request #3648 from yaoyinnan/index-in-for
runtime: The index variable is initialized multiple times in for
2022-02-14 12:36:46 +00:00
Bin Liu
cf53ec2c71 Merge pull request #2977 from luodw/support_nydus
feature(nydusd): add nydusd support to introduce lazyload ability
2022-02-14 13:08:50 +08:00
Matt Layher
c1ce67d905 runtime: use github.com/mdlayher/vsock@v1.1.0
Fixes #3625
Signed-off-by: Matt Layher <mdlayher@gmail.com>
2022-02-12 19:57:15 -05:00
yaoyinnan
42a878e6c1 runtime: The index variable is initialized multiple times in for
Change the variables `mountTypeFieldIdx := 8`, `mntDestIdx := 4` and `netNsMountType := "nsfs"` to const.

And unify the variable naming style, modify `mntDestIdx` to `mountDestIdx`.

Fixes: #3646

Signed-off-by: yaoyinnan <yaoyinnan@foxmail.com>
2022-02-12 11:10:10 +08:00
luodaowen.backend
2d9f89aec7 feature(nydusd): add nydusd support to introduse lazyload ability
Pulling image is the most time-consuming step in the container lifecycle. This PR
introduse nydus to kata container, it can lazily pull image when container start. So it
can speed up kata container create and start.

Fixes #2724

Signed-off-by: luodaowen.backend <luodaowen.backend@bytedance.com>
2022-02-11 21:41:17 +08:00
Daniel Höxtermann
b19b6938a8 docs: Fix relative links in Markdown
Relative links within this repository allow for easier navigation to
the corresponding file / directory in the current commit / for the
selected version.

Link text was slightly changed / fixed in
- docs/Unit-Test-Advice.md
- docs/how-to/how-to-run-docker-with-kata.md

Fixes #3045

Signed-off-by: Daniel Höxtermann <daniel@hxtm.dev>
2022-02-11 13:49:42 +01:00
Julio Montes
982f14fa66 runtime: support QEMU SGX
Enable SGX in QEMU when `sgx.intel.com/epc` annotation is defined

fixes #3436

Signed-off-by: Julio Montes <julio.montes@intel.com>
2022-02-10 09:45:48 -06:00
Samuel Ortiz
07b9d93f5f virtcontainer: Simplify the sandbox network creation flow
We don't need to call NewNetwork() twice, and we can have the VM factory
case return immediatly. That makes the code more readable.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
2c7087ff42 virtcontainers: Make all endpoints Linux only
All of the networking endpoints are Linux specific.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
49d2cde1e2 virtcontainers: Split network tests into generic and OS specific parts
Some unit tests are generic while others, mostly because they depend on
netlink, are Linux specific.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
0269077ebf virtcontainers: Remove the netlink package dependency from network.go
Move the netlink dependent code into network_linux.go.
Other OSes will have to provide the same functions.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
7fca5792f7 virtcontainers: Unify Network endpoints management interface
And only have AddEndpoints/RemoveEndpoints for all cases (single
endpoint vs all of them, hotplug or not).

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
c67109a251 virtcontainers: Remove the Network PostAdd method
It's used once by the sandbox code and can be implemented directly
there.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
e0b264430d virtcontainers: Define a Network interface
And move the Linux implementation into a GOOS specific file.

Fixes #3005

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
5e119e90e8 virtcontainers: Rename the Network structure fields and methods
We are converting the Network structure into an interface, so that
different host OSes can have different networking implementations for
Kata.
One step into that direction is to rename all the Network structure
fields and methods to something that is less Linux networking namespace
specific. This will make the Network interface naming consistent.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
b858d0dedf virtcontainers: Make all Network fields private
Prepare for making it a real interface.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
49eee79f5f virtcontainers: Remove the NetworkNamespace structure
It is now replaced with a single Network structure

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
844eb61992 virtcontainers: Have CreateVM use a Network reference
We are replacing the NetworkingNamespace structure with the Network
one, so we should have the hypervisor interface switching to it as well.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
d7b67a7d1a virtcontainers: Network API cleanups and simplifications
Remove unused parameters.
Reduce the number of parameters by deriving some of them (e.g. a
networking config) from their outer structure (e.g. a Sandbox
reference).

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
2edea88369 virtcontainers: Make the Network structure manage endpoints
Endpoints creations, attachement and hotplug are bound to the networking
namespace described through the Network structure.
Making them Network methods is natural and simplifies the code.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Samuel Ortiz
8f48e28325 virtcontainers: Expand the Network structure
For simplicity sake, there should only be one networking structure per
sandbox, as opposed to two (Network and NetworkingNamespace) currently.

This commit start expanding the Network structure in order to eventually
make it the single representation of a virtcontainers sandbox
networking.

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
2022-02-08 22:27:53 +01:00
Pierre Kohler
5ef522f7c3 runtime: check kvm module sev correctly
Runtime now accepts both `1` and `Y` as valid values for
kvm_amd module parameter kvm_amd.sev.

Fixes #3273

Signed-off-by: Pierre Kohler <pierre.kohler@cysec.systems>
2022-02-07 23:48:47 +01:00
Eric Ernst
e8eb5e8295 Merge pull request #3609 from egernst/rootless-linux
virtcontainers: Split the rootless package into OS specific parts
2022-02-03 12:19:31 -08:00
Julio Montes
1f29478b09 runtime: suppport split firmware
firmware can be split into FIRMWARE_VARS.fd (UEFI variables as
configuration) and FIRMWARE_CODE.fd (UEFI program image). UEFI
variables can be customized per each user while UEFI code is kept same.

fixes #3583

Signed-off-by: Julio Montes <julio.montes@intel.com>
2022-02-01 13:40:19 -06:00
Samuel Ortiz
14e7f52a91 virtcontainers: Split the rootless package into OS specific parts
Move the netns specific bits into a Linux specific file.

Fixes: #3607

Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-01-28 16:20:28 -08:00
James O. D. Hunt
7c956e0d27 virtcontainers: Enable initrd for Cloud Hypervisor
Since CH has supported booting with an initramfs since version 0.7.0
[1], allow an `initrd=` to be specified.

Fixes: #3566.

[1] - https://github.com/cloud-hypervisor/cloud-hypervisor/releases/tag/v0.7.0

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-01-28 10:49:10 +00:00
Eric Ernst
a5ebeb96c1 Merge pull request #2941 from egernst/sandbox-sizing-feature
Sandbox sizing feature
2022-01-27 09:37:57 -08:00
Eric Ernst
8cde54131a runtime: introduce static sandbox resource management
There are software and hardware architectures which do not support
dynamically adjusting the CPU and memory resources associated with a
sandbox. For these, today, they rely on "default CPU" and "default
memory" configuration options for the runtime, either set by annotation
or by the configuration toml on disk.

In the case of a single container (launched by ctr, or something like
"docker run"), we could allow for sizing the VM correctly, since all of
the information is already available to us at creation time.

In the sandbox / pod container case, it is possible for the upper layer
container runtime (ie, containerd or crio) could send a specific
annotation indicating the total workload resource requirements
associated with the sandbox creation request.

In the case of sizing information not being provided, we will follow
same behavior as today: start the VM with (just) the default CPU/memory.

If this information is provided, we'll track this as Workload specific
resources, and track default sizing information as Base resources. We
will update the hypervisor configuration to utilize Base+Workload
resources, thus starting the VM with the appropriate amount of CPU and
memory.

In this scenario (we start the VM with the "right" amount of
CPU/Memory), we do not want to update the VM resources when containers
are added, or adjusted in size.

This functionality is introduced behind a configuration flag,
`static_sandbox_resource_mgmt`. This is defaulted to false for all
configurations except Firecracker, which is set to true.

This'll greatly improve UX for folks who are utilizing
Kata with a VMM or hardware architecture that doesn't support hotplug.

Note, users will still be unable to do in place vertical pod autoscaling
or other dynamic container/pod sizing with this enabled.

Fixes: #3264

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-01-26 09:04:38 -08:00
Eric Ernst
c3e97a0a22 config: updates to configuration clh, fc toml template
There's some cruft -- let's update to reflect reality, and ensure that
we match what is expected.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2022-01-26 09:45:50 -08:00
Fabiano Fidêncio
f7c7dc8d33 Merge pull request #3504 from Jakob-Naucke/s390x-govmm-tests
Fix and re-enable s390x GoVMM tests
2022-01-26 12:57:38 +01:00
Archana Shinde
081a235efe Merge pull request #3540 from bradenrayhorn/fix-negative-memory-limit
runtime: fix handling container spec's memory limit
2022-01-25 05:17:05 -08:00
Braden Rayhorn
fc0e095180 runtime: fix handling container spec's memory limit
The OCI container spec specifies a limit of -1 signifies
unlimited memory. Update the sandbox memory calculator
to reflect this part of the spec.

Fixes: #3512

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-01-24 13:30:32 -06:00
Jakob Naucke
016569fd8e Merge pull request #3476 from bergwolf/runtime-dep
runtime: update runc and image-spec dependencies
2022-01-24 15:53:43 +01:00
Peng Tao
5643c6dcae runtime: update runc and image-spec dependencies
To address two depbot security warnings.

Fixes: #3475
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2022-01-24 11:49:05 +08:00
Bo Chen
94b343492d Merge pull request #3520 from likebreath/0120/clh_v21.0
Upgrade to Cloud Hypervisor v21.0
2022-01-21 08:08:13 -08:00
Jakob Naucke
2f37165f46 govmm: Unite VirtioNet tests
no explicit PCI test, just switch path depending on architecture
(CCW for s390x, PCI for others). Also fixes an unknown variable error.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2022-01-21 13:00:05 +01:00
Jakob Naucke
4a428fd1c5 govmm: readonly=on in s390x blkdev test
Forgotten in b17f07395c, also fixes a
test.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2022-01-21 13:00:05 +01:00
Jakob Naucke
79ecebb280 govmm: TestAppendPCIBridgeDevice et al. on !s390x
s390x uses CCW, also fixes a lint failure about undeclared variables on
s390x.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2022-01-21 13:00:05 +01:00
Jakob Naucke
dc285ab1d7 govmm: Remove unnecessary comma in iommu_platform
in FSDevice.QemuParams for VirtioCCW. Forgotten in
ff34d283db, also fixes a test.

Fixes: #3500
Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2022-01-21 13:00:05 +01:00
Jakob Naucke
d23f2eb0f0 govmm: Revert "govmm: s390x: Skip broken tests"
This reverts commit 5ce9011a36.

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2022-01-21 13:00:05 +01:00
Amulya Meka
f52ce302bc runtime: rectify passing empty options to -ldflags
When no options are passed to -ldflags, it passes
incorrect values(in this case, $BUILDFLAGS) to it.
Fix passing empty values by passing $KATA_LDFLAGS
in quotes.

Fixes: #3521

Signed-off-by: Amulya Meka <amulmek1@in.ibm.com>
2022-01-21 06:57:52 +00:00
Bo Chen
2d799cbfa3 virtcontainers: clh: Re-generate the client code
This patch re-generates the client code for Cloud Hypervisor v21.0.
Note: The client code of cloud-hypervisor's (CLH) OpenAPI is
automatically generated by openapi-generator [1-2].

[1] https://github.com/OpenAPITools/openapi-generator
[2] https://github.com/kata-containers/kata-containers/blob/main/src/runtime/virtcontainers/pkg/cloud-hypervisor/README.md

Signed-off-by: Bo Chen <chen.bo@intel.com>
2022-01-20 17:48:10 -08:00
Fabiano Fidêncio
5ce9011a36 govmm: s390x: Skip broken tests
For now a bunch of tests are simply not working.

Let's skip them all, and re-enable them once
kata-containers/kata-containers/issues/3500 gets fixed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-01-20 01:04:35 +01:00
Fabiano Fidêncio
8bcaed0b4f govmm: Adapt license headers to kata-containers
Both projects follow the same license, Apache-2.0, but the header saying
that comes from govmm is different from the one expected for the tests
present on the kata-containers repo.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-01-19 18:02:46 +01:00