Commit Graph

124 Commits

Author SHA1 Message Date
Fabiano Fidêncio
fa9dd46041 ci: k8s: Don't set cpu limit request for k8s-inotofy test
Without setting the cpu limit / request to 1, we can make this test run
in a smaller VM instance without any issue.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 92fff129fd)
2023-09-21 14:10:57 +02:00
Fabiano Fidêncio
6abf513f06 ci: docker: nerdtl: Use io.containerd.kata-${KATA_HYPERVISOR}.io
This will ensure that we're calling the correct binary for the
hypervisor.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 813bfdec01)
2023-09-21 13:50:15 +02:00
Fabiano Fidêncio
9a664ea8bb ci: nerdctl: Create the containerd config
Otherwise we'll fail to configure kata-containers in the `install-kata`
step.

This is mostly needed because the nerdctl-full tarball doesn't provide a
contaienrd configuration, just the binary, as contaienrd does not
actually require a configuration file to run with the default config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 46bc0b1c01)
2023-09-21 13:50:10 +02:00
Fabiano Fidêncio
5734c4cbca ci: nerdctl: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.

This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 13968aa7f6)
2023-09-21 13:50:00 +02:00
Fabiano Fidêncio
55c8a47a40 ci: docker: Switch to tcp port 80 ping
TIL that the Azure VMs we use are created without an explicit outbund
connectivity defined.

This leads us to issues using `ping ...` as part of our tests, and when
consulting Jeremi Piotrowski about the issue he pointed me out to two
interesting links:
* https://learn.microsoft.com/en-us/azure/virtual-network/ip-services/default-outbound-access
* https://learn.microsoft.com/en-us/archive/blogs/mast/use-port-pings-instead-of-icmp-to-test-azure-vm-connectivity

For your own sanity, do not read the comments, after all this is
internet. :-)

Anyways, the suggestion is to use nping instead, which is provided by
the nmap package, so we can explicitly switch to using the tcp port 80
for the ping.  With this in mind, I'm switching the image we use for the
test and using one that provided nping as a possible entry point, and
from now on (this part of) the tests should work.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e0c811678b)
2023-09-21 13:49:54 +02:00
Fabiano Fidêncio
e5e3951398 ci: docker: Also run the smoke test with runc
This will help us to make sure that the failure is actually related to
Kata Containers.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f536ef5ce1)
2023-09-21 13:46:53 +02:00
Fabiano Fidêncio
33430ad60c ci: Add a very basic nerdctl sanity test
Let's add a very basic sanity test to check that we can spawn a
containers using nerdctl + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7911

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 12d833d07d)
2023-09-21 13:46:37 +02:00
Fabiano Fidêncio
69dd11f459 ci: Add a very basic docker sanity test
Let's add a very basic sanity test to check that we can spawn a
containers using docker + Kata Containers.

This will ensure that, at least, we don't regress to the point where
this feature doesn't work at all.

For now we're running this test against Cloud Hypervisor and QEMU only,
due to an already reported issue with dragonball:
https://github.com/kata-containers/kata-containers/issues/7912

In the future, we should also test all the VMMs with devmapper, but
that's for a follow-up PR after this test is working as expected.

Fixes: #7910

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 348b8644d6)
2023-09-21 13:46:28 +02:00
Fabiano Fidêncio
bb5dbfbbce k8s: ci: Skip "Pod quota" test with firecracker
The test is failing, and an issue has been opened to track it.
For now, let's skip it.

Issue:
https://github.com/kata-containers/kata-containers/issues/7873

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 9d74b7ccc9)

 Conflicts:
	tests/integration/kubernetes/k8s-pod-quota.bats
2023-09-21 13:43:16 +02:00
Fabiano Fidêncio
263ed4afd1 ci: k8s: Remove useless skip statement from tests
There's absolutely no need to have the skip check as part of the test
itself when it's already done as part of the setup function.

We're only touching the files here that were touched in the previous
commit.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit f6cd3930c5)
2023-09-21 13:42:24 +02:00
Fabiano Fidêncio
7e135294a7 ci: k8s: Also check for "fc" (for firecracker)
Let's keep both checks for now, but in the future we'll be able to
remove the check for "firecracker", as the hypervisor name used as part
of the GitHub Actions has to match what's used as part of the
kata-deploy stuff, which is `fc` (as in `kata-fc for the runtime class)
instead of `firecracker`.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3cc20b47a6)
2023-09-21 13:42:17 +02:00
Fabiano Fidêncio
8892d9a7b2 ci: k8s: Add clean-up-garm argument for gha-run.sh
The tests are failing to finish as the argument is invalid.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b5bad3cb0f)
2023-09-21 13:42:06 +02:00
Fabiano Fidêncio
aee6f36c86 ci: k8s: Add a kata-deploy-garm target
We've been using the `kata-deploy-tdx` target as that also uses k3s as
base, but it's better to just have a specific garm target.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 27fa7d828d)
2023-09-21 13:41:52 +02:00
Fabiano Fidêncio
5bb77b628d ci: k8s: Export KUBERNETES env var
So we have a better control on which flavour of kubernetes kata-deploy
is expected to be targetting.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit fa62a4c01b)
2023-09-21 13:41:45 +02:00
Fabiano Fidêncio
9fb291d88a ci: k8s: Wait some time after restarting k3s
Let's put a 1 minute sleep, just to make sure everything is back up
again.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 3de23034f8)
2023-09-21 13:41:31 +02:00
Fabiano Fidêncio
89345b6731 ci: k8s: Append, instead of overwrite, the devmapper config
As we were using `tee` without the `-a` (or `--apend`) aptton, the
containerd config would be overwritten, leading to a NotReady state of
the Node.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 2df183fd99)
2023-09-21 13:41:01 +02:00
Fabiano Fidêncio
bb675f8101 ci: k8s: Decrease k3s sleep from 4 to 2 minutes
It should be plenty, and worked well in local tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 369a8af8f7)
2023-09-21 13:40:56 +02:00
Fabiano Fidêncio
695c7162ef ci: k8s: Use vanilla kubectl with k3s
Let's download the vanilla kubectl binary into `/usr/bin/`, as we need
to avoid hitting issues like:
```sh
error: open /etc/rancher/k3s/k3s.yaml.lock: permission denied
```

The issue basically happens because k3s links `/usr/local/bin/kubectl`
to `/usr/local/bin/k3s`, and that does extra stuff that vanilla
`kubectl` doesn't do.

Also, in order to properly use the k3s.yaml config with the vanilla
kubectl, we're copying it to ~/.kube/config.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ada65b988a)
2023-09-21 13:40:51 +02:00
Fabiano Fidêncio
7f865be398 ci: k8s: Ensure k3s is deploy with --write-kubeconfig-mode=644
Otherwise the /etc/rancher/k3s/k3s.yaml is not readable by other users
than root.

As --write-config-mode is being passed, and that's an option that has to
be passed to the `server`, -s is also added to the command line.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ad45ab5d33)
2023-09-21 13:40:45 +02:00
Fabiano Fidêncio
7a96d0a589 ci: k8s: Use the proper command for sleep
`wait` waits for a job to complete, not a number of seconds.  Not sure
how I got that wrong in the first place, but it's what it's.

Fixes: #6542

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 028a97e0d5)
2023-09-21 13:40:38 +02:00
Fabiano Fidêncio
a41a56e326 ci: k8s: Add a function to configure devmapper for containerd
This function right now is completely based on what's part of the tests
repo[0], and that's the reason I'm keeping the `Signed-off-by` of all
the contributors to that file.

This is not perfect, though, as it changes the default snapshotter to
devmapper, instead of only doing so for the Kata Containers specific
runtime handlers.  OTOH, this is exactly what we've always been doing as
part of the tests.

We'll improve it, soon enough, when we get to also add a way for
kata-deploy to set up different snapshotters for different handlers.
But, for now, this is as good (or as bad) as it's always been.

It's important to note that the devmapper setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Signed-off-by: Shiming Zhang <wzshiming@foxmail.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Gabriela Cervantes <gabriela.cervantes.tellez@intel.com>
(cherry picked from commit b28b54df04)
2023-09-21 13:39:05 +02:00
Fabiano Fidêncio
315288a000 ci: k8s: Add a function to deploy k3s
One can use different kubernetes flavours for getting a kubernetes
cluster up and running.

As part of our CI, though, I really would like to avoid contributors
spending time maintaining and updating kubernetes dependencies, as done
with the tests repo, and which has been proven to be really good on
getting things rotten.

With this in mind, I'm taking the bullet and using "k3s" as the way to
deploy kubernetes for the devmapper related tests, and that's the reason
I'm adding a function to do so, and this will be used later on as part
of this series.

It's important to note that the k3s setup doesn't take into
consideration a BM machine, and this is not suitable for that.  We're
really only targetting GHA runners which will be thrown away after the
run is over.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 54f7117212)
2023-09-21 13:38:56 +02:00
Fabiano Fidêncio
2684b267f7 tests: Expand confidential test to support TDX
Let's expand the confidential test to also support TDX.

The main difference on the test, though, is that we're not grepping for
a string in the `dmesg` output, but rather relying on `cpuid` to detect
a TDX guest.

Fixes: #7184

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit e286e842c1)
2023-09-21 13:27:24 +02:00
Unmesh Deodhar
4976629aee tests: Expand confidential test to support SNP
Let's expand the confidential test to also support SNP.

Fixes: #7184

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
(cherry picked from commit e31f099be1)
2023-09-21 13:27:18 +02:00
Unmesh Deodhar
019849071e tests: Add confidential test for SEV
Add a test case for the launch of unencrypted confidential
container, verifying that we are running inside a TEE.

Right now the test only works with SEV, but it'll be expanded in the
coming commits, as part of this very same series.

Fixes: #7184

Signed-Off-By: Unmesh Deodhar <udeodhar@amd.com>
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit c3b9d4945e)
2023-09-21 13:27:11 +02:00
Dan Mihai
17d22cae34 tests: use unique test name
k8s-pid-ns.bats was already using the test name from
k8s-kill-all-process-in-container.bats - probably a copy/paste bug.

Fixes: #7753

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 183f51d6f6)
2023-09-21 13:26:18 +02:00
Dan Mihai
e8c24fa0b9 tests: delete k8s deployment at the test's end
At the end of k8s-kill-all-process-in-container.bats, delete the
deployment it created.

Fixes: #7752

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 6a974679f2)
2023-09-21 13:26:06 +02:00
Dan Mihai
508f1bba15 gha: capture additional kata-deploy output
10 lines can be insufficient for diagnostics.

Fixes: #7707

Signed-off-by: Dan Mihai <dmihai@microsoft.com>
(cherry picked from commit 400eb88743)
2023-09-21 13:23:48 +02:00
Fabiano Fidêncio
b38624e2b3 tests: common: Ensure test_type is used as part of the cluster's name
By doing this we can make sure there won't be any clash on the cluster
name created for either the k8s or the kata-deploy tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 285e616b5e)
2023-09-21 13:22:40 +02:00
Fabiano Fidêncio
d7130f48b0 gha: k8s: Stop running kata-deploy tests as part of the k8s suite
In a follow-up series, we'll add a whole suite for the kata-deploy
tests.  With this in mind, let's already get rid of this one and avoid
more kata-deploy tests to land here.

Fixes: #7642

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit cfc29c11a3)
2023-09-21 13:22:21 +02:00
Aurélien Bombo
810507e8a3 tests: k8s: Call ensure_yq() in setup.sh
It wasn't the `common.bash` import in `run_kubernetes_tests.sh` causing
the yq error so let's try this instead.

Reference: https://github.com/kata-containers/kata-containers/actions/runs/5674941359/job/15379797568#step:10:341

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
(cherry picked from commit f4dd152863)
2023-09-21 13:22:10 +02:00
Aurélien Bombo
915bace795 kata-deploy: Properly create default runtime class
The default `kata` runtime class would get created with the `kata`
handler instead of `kata-$KATA_HYPERVISOR`. This made Kata use the wrong
hypervisor and broke CI.

Fixes: #7663

Signed-off-by: Aurélien Bombo <abombo@microsoft.com>
(cherry picked from commit 339569b69c)
2023-09-21 13:22:00 +02:00
Fabiano Fidêncio
7474e50ae2 gha: cri-containerd: Enable tests
As the cri-containerd tests have been fully migrated to GHA, let's make
sure we get them running.

Fixes: #6543

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b3592ab25c)
2023-09-21 13:19:36 +02:00
Fabiano Fidêncio
20be3d93d5 gha: cri-containerd: Add timeout to the crictl calls on testContainerStop
As part of the runners, we're hitting a timeout that I cannot reproduce,
at all, when allocating the same instance and running the tests
manually.

The default timeout to connect to the server is 2s when using `crictl`.
Let's increase this to 20s.

It's fairly important to mention that in the first tests I used a
timeout of 10s, and that helped but we still hit issues every now and
then.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 84dd02e0f9)
2023-09-21 13:19:28 +02:00
Fabiano Fidêncio
10058f718a gha: cri-containerd: Show pod before deleting it
It'll help us to debug failures with the pod stop / pod delete.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit b29782984a)
2023-09-21 13:19:22 +02:00
Fabiano Fidêncio
585d5fba03 gha: cri-containerd: Print kata logs in case of error
We need this to fully understand what are the issues we're facing.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ae0930824a)
2023-09-21 13:19:17 +02:00
Fabiano Fidêncio
2fea5a5f8b gha: cri-containerd: Group containerd logs
This improves readability in case of failures by a lot.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 6c8b2ffa60)
2023-09-21 13:19:11 +02:00
Fabiano Fidêncio
3c7597f4ba gha: cri-containerd: Ensure RUNTIME takes KATA_HYPERVISOR into account
Short commit log says it all.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 9e898701f5)
2023-09-21 13:19:04 +02:00
Fabiano Fidêncio
c19cebfa80 tests: Add gha-run-k8s-common.sh
Let's split a good portion of `tests/integration/kuberentes/gha-run.sh`
out, and put them in a place where they can be used to the soon-to-come
kata-deploy specific tests.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit af1b46bbf2)
2023-09-21 13:18:07 +02:00
Fabiano Fidêncio
dcc35781f7 ci: unencrypted-image: Don't fail to build on s390x
Let's make sure that we don't fail in case we're building non x86_64.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit eb463b38ec)
2023-09-21 13:15:08 +02:00
Fabiano Fidêncio
218d83bd3f tests: k8s: Ensure the runtime classes are properly created
With these 2 simple checks we can ensure that we do not regress on the
behaviour of allowing the runtime classes / default runtime class to be
created by the kata-deploy payload.

Fixes: #7491

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit 034d7aab87)
2023-09-21 13:13:34 +02:00
Fabiano Fidêncio
6ae591c618 ci: k8s: Add the image used for unencrypted confidential tests
Let's add here the image we'll be using for unencrypted confidential
tests.  Later on, we'll make sure to build and use this image as part of
our CI.

The image can easily be built as a multi-arch image, and has `cpuid`
installed in case of `x86_64` build, so it can be used to detect whether
we're running on a TEE guest without having to rely on `dmesg | grep
...`.

Fixes: #7595

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
(cherry picked from commit ab5f603ffa)
2023-09-21 13:13:16 +02:00
Unmesh Deodhar
8d4f9ef256 tests: upgrade bats version
Instead of using package manager to install bats, building
this from source. This gives us the updated version of bats
which supports functions such as setup_file and
teardown_file.
We can use these functions into our current tests.

Fixes: #7597

Signed-off-by: Unmesh Deodhar <udeodhar@amd.com>
(cherry picked from commit aeaec9dae9)
2023-09-21 13:12:56 +02:00
Fabiano Fidêncio
f910c66d6f ci: k8s: Do not fail when gathering info on AKS nodes
Otherwise the VM deletion may not delete, leaving us with several
machines behind.

Fixes: #7509

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-08-01 12:36:33 +02:00
Aurélien
e8f8641988 Merge pull request #7132 from sprt/aks-volume-tests
tests: Add `k8s-volume` and `k8s-file-volume` tests to GHA CI
2023-07-28 08:58:03 -07:00
Fabiano Fidêncio
8353aae41a ci: k8s: Rework get_nodes_and_pods_info()
The amount of info we've added seemed unnecessary, and ends up making
our lives even harder when trying to find errors.

Let's just rely on the kata-debug container to collect the needed info
for us.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
6ad5d7112e ci: k8s: Do not gather node info before running the tests
It's been proven to not be useful, and ends up making things more
confusing due to the amount of logs printed.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
5261e3a60c ci: k8s: Group messages to improve readability
Right now is getting way too easy to get lost in the logs.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
9cc6b5f461 ci: k8s: Get logs from kata-deploy
Let's make sure we can debug kata-deploy in case something goes wrong
during its execution.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00
Fabiano Fidêncio
9d285c6226 ci: k8s: Let kata-deploy take care of the runtimeclasses
By doing this we can test the change done for the daemonset. :-)

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2023-07-28 10:04:33 +02:00