Firecracker supports noflush semantic via Unsafe cache type.
There is no support for direct i/o, remove it from config file
Fixes: #7823
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Clh suports direct i/o for disks. It doesn't
offer any support for noflush, removed passing
of option to cloud-hypervisor internal config
Fixes: #7798
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Support quoted kernel command line parameters that include space
characters. Example:
dm-mod.create="dm-verity,,,ro,0 736328 verity 1
/dev/vda1 /dev/vda2 4096 4096 92041 0 sha256
f211b9f1921ef726d57a72bf82be23a510076639fa8549ade10f85e214e0ddb4
065c13dfb5b4e0af034685aa5442bddda47b17c182ee44ba55a373835d18a038"
Fixes: #8003
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
This is part of a bigger effort to drop gogoprotobuff from our code
base. IIUC, those options are basically used by *pb_test.go, and since
we are dropping gogoprotobuff and those are auto generated tests, let's
just remove it.
Fixes#7978.
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Users have noticed that this is needed, as CLH does not yet implement a
way to hotplug resources on aarh64.
With this patch, when building for x86_64, I can see the this is the
resulting config:
```
$ ARCH=amd64 make
...
$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=false
```
And when building for aarch64:
```
$ ARCH=arm64 make
...
$ cat config/configuration-clh.toml | grep static_sandbox_resource_mgmt
static_sandbox_resource_mgmt=true
```
Fixes: #7941
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
PR #6146 added the possibility to control QEMU with an extra HMP socket
as an aid for debugging. This is great for development or bug chasing
but this raises some concerns in production.
The HMP monitor allows to temper with the VM state in a variety of ways.
This could be intentionally or mistakenly used to inject subtle bugs in
the VM that would be extremely hard if not even impossible to debug. We
definitely don't want that to be enabled by default.
The feature is currently wired to the `enable_debug` setting in the
`[hypervisor.qemu]` section of the configuration file. This setting has
historically been used to control "debug output" and it is used as such
by some downstream users (e.g. Openshift). Forcing people to have the
extra HMP backdoor at the same time is abusive and dangerous.
A new `extra_monitor_socket` is added to `[hypervisor.qemu]` to give
fine control on whether the HMP socket is wanted or not. This setting
is still gated by `enable_debug = true` to make it clear it is for
debug only. The default is to not have the HMP socket though. This
isn't backward compatible with #6416 but it is for the sake of "better
safe than sorry".
An extra monitor socket makes the QEMU instance untrusted. A warning is
thus logged to the journal when one is requested.
While here, also allow the user to choose between HMP and QMP for the
extra monitor socket. Motivation is that QMP offers way more options to
control or introspect the VM than HMP does. Users can also ask for
pretty json formatting well suited for human reading. This will improve
the debugging experience.
This feature is only made visible in the base and GPU configurations
of QEMU for now.
Fixes#7952
Signed-off-by: Greg Kurz <groug@kaod.org>
cloud hypervisor on arm64 only support arm AMBA UART(pl011) as
tty. So, the console should be set to "ttyAMA0" instead of "ttyS0"
when enable hypervisor debug mode.
Fixes: #5080
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
by enabling IOMMU on the default PCI segment. For hotplug to work we need a
virtualized iommu and clh exposes one if there is some device or PCI segment
that requests it. I would have preferred to add a separate PCI segment for
hotplugging vfio devices but unfortunately kata assumes there is only one
segment all over the place. See create_pci_root_bus_path(),
split_vfio_pci_option() and grep for '0000'.
Enabling the IOMMU on the default PCI segment requires passing enabling IOMMU on
every device that is attached to it, which is why it is sprinkled all over the
place.
CLH does not support IOMMU for VirtioFs, so I've added a non IOMMU segment for
that device.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
There is no way for this branch to be hit, as port is only set when it is
different than config.NoPort.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
These test cases shows which options are valid for CLH/Qemu, and test that we
correctly catch unsupported combinations.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
The only supported options are hot_plug_vfio=root-port or no-port.
cold_plug_vfio not supported yet.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
hot_plug_vfio needs to be set to root-port, otherwise attaching vfio devices to
CLH VMs fails. Either cold_plug_vfio or hot_plug_vfio is required, and we have
not implemented support for cold_plug_vfio in CLH yet.
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
In the RemoveEndpoints(), when the endpoints paramete isn't empty,
using idx may result in wrong endpoint removals. To improve,
directly passing the endpoint parameter helps
locate the correct elements within n.eps.
Fixes: #7732
Signed-off-by: shixuanqing <1356292400@qq.com>
Fixes: #7732
Signed-off-by: shixuanqing <1356292400@qq.com>
Update src/runtime/virtcontainers/network_linux.go
Co-authored-by: Xuewei Niu <justxuewei@apache.org>
When creating a new endpoint, we check existing endpoint names and automatically adjust the naming of the new endpoint to ensure uniqueness.
Fixes: #7876
Signed-off-by: shixuanqing <1356292400@qq.com>
To support the removal of the `initcall_debug` and `earlyprintk=`
options from the default guest kernel cmdline, add `kernel_params` to the list
of enabled annotations to allow those kernel options (or others) to be
set using `kata-deploy` for either runtime.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
This syntax belongs to the legacy C virtiofsd implementation that
we don't support anymore since kata-containers 3.1.3 because
of other API breaking changes.
People have been warned to switch from "none" to "never" since
kata-containers 2.5.2. Let's officially do that.
The compat code that would convert "none" to "never" isn't
needed anymore. Just drop it.
Fixes#7864
Signed-off-by: Greg Kurz <groug@kaod.org>
gogo.nullable is the main gogo.protobuf' feature used here. Since we are
trying to remove gogo.protobuf, the first reasonable step seems to be
remove this feature. This is a core update, and it will change how the
structs are defined. I could spot only a few places using those structs,
based on make check/build.
Fixes#7723.
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Some use cases may just require passing extra arguments to virtiofsd,
and having this disabled by default makes it impossible to set when
using kata-deploy, as changes in the configuration file would be
overwritten by the daemon-set.
With this in mind, let's allow users to pass whatever thet need (and
here I'm specifically looking at `--xattr`) as a virtio_fs_extra_arg.
Fixes: #7853
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Let's make sure we use the TDX image as part of the QEMU TDX
configuration, which will help us to have the policies tested here.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
IoCopy is a tricky function (I don't claim to fully understand its contract),
but here is what I see: The goroutine that runs it spawns 3 goroutines - one
for each stream to handle (stdin/stdout/stderr). The goroutine then waits for
the stream goroutines to exit. The idea is that when the process exits and is
closed, the stdout goroutine will be unblocked and close stdin - this should
unblock the stdin goroutine. The stderr goroutine will exit at the same time as
the stdout goroutine. The iocopy routine then closes all tty.io streams.
The problem is that the stdout goroutine decrements the WaitGroup before
closing the stdin stream, which causes the iocopy goroutine to race to close
the streams. Move the wg.Done() of the stdout routine past the close so that
*this* race becomes impossible. I can't guarantee that this doesn't affect some
unspecified behavior.
Fixes: #5031
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
If we are running FC hypervisor, it is not started when prestart hooks
are executed. So we should just ignore such error and just go ahead and
run the hooks.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
FC does not support network device hotplug. Let's add a check to fail
early when starting containers created by docker.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Add a new hypervisor capability to tell if it supports device hotplug.
If not, we should run prestart hooks before starting new VMs as nerdctl
is using the prestart hooks to set up netns. To make nerdctl + FC
to work, we need to run the prestart hooks before starting new VMs.
Fixes: #6384
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
When running on amd machines, those tests will fail because there is no
vmx flag. Following other tests that checks for cpuType, let's adapt
them to restrict vmx only on Intel machines.
Fixes#7788.
Related #5066
Signed-off-by: Beraldo Leal <bleal@redhat.com>
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.
Make sure annotations are preferred over config options in image and initrd
path handling.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We should make sure annotations are preferred over
config options in image and initrd path handling.
Fixes: #7705
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.
Add a helper function ImageOrInitrdAssetPath to make sure annotations
are preferred over config options in image and initrd path handling.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Fixes: #7573
To enable this feature, build your rootfs using AGENT_POLICY=yes. The
default is AGENT_POLICY=no.
Building rootfs using AGENT_POLICY=yes has the following effects:
1. The kata-opa service gets included in the Guest image.
2. The agent gets built using AGENT_POLICY=yes.
After this patch, the shim calls SetPolicy if and only if a Policy
annotation is attached to the sandbox/pod. When creating a sandbox/pod
that doesn't have an attached Policy annotation:
1. If the agent was built using AGENT_POLICY=yes, the new sandbox uses
the default agent settings, that might include a default Policy too.
2. If the agent was built using AGENT_POLICY=no, the new sandbox is
executed the same way as before this patch.
Any SetPolicy calls from the shim to the agent fail if the agent was
built using AGENT_POLICY=no.
If the agent was built using AGENT_POLICY=yes:
1. The agent reads the contents of a default policy file during sandbox
start-up.
2. The agent then connects to the OPA service on localhost and sends
the default policy to OPA.
3. If the shim calls SetPolicy:
a. The agent checks if SetPolicy is allowed by the current
policy (the current policy is typically the default policy
mentioned above).
b. If SetPolicy is allowed, the agent deletes the current policy
from OPA and replaces it with the new policy it received from
the shim.
A typical new policy from the shim doesn't allow any future SetPolicy
calls.
4. For every agent rpc API call, the agent asks OPA if that call
should be allowed. OPA allows or not a call based on the current
policy, the name of the agent API, and the API call's inputs. The
agent rejects any calls that are rejected by OPA.
When building using AGENT_POLICY_DEBUG=yes, additional Policy logging
gets enabled in the agent. In particular, information about the inputs
for agent rpc API calls is logged in /tmp/policy.txt, on the Guest VM.
These inputs can be useful for investigating API calls that might have
been rejected by the Policy. Examples:
1. Load a failing policy file test1.rego on a different machine:
opa run --server --addr 127.0.0.1:8181 test1.rego
2. Collect the API inputs from Guest's /tmp/policy.txt and test on the
machine where the failing policy has been loaded:
curl -X POST http://localhost:8181/v1/data/agent_policy/CreateContainerRequest \
--data-binary @test1-inputs.json
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Remove the installation step in the virtcontainers doc
because the virtcontainers install/uninstall targets have
been removed by 86723b51ae
and they are not used anymore.
Fixes: #7637
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Remove configuration file shared_fs = none warnings
now that there is a solution to updating configMaps, secrets etc
Fixes: #7210
Signed-off-by: stevenhorsman <steven@uk.ibm.com>
For remote hypervisor, the configmap, secrets, downward-api or project-volumes are
copied from host to guest. This patch watches for changes to the host files
and copies the changes to the guest.
Note that configmap updates takes significantly longer than updates via downward-api.
This is similar across runc and Kata runtimes.
Fixes: #7210
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Julien Ropé <jrope@redhat.com>
(cherry picked from commit 3081cd5f8e)
(cherry picked from commit 68ec673bc4d9cd853eee51b21a0e91fcec149aad)