LaunchCustomQemu() currently starts QEMU with cmd.Run() which is
supposed to block until the child process terminates. This assumes
that QEMU daemonizes itself, otherwise LaunchCustomQemu() would
block forever. The virtcontainers package indeed enables the
Daemonize knob in the configuration but having such an implicit
dependency on a supposedly configurable setting is ugly and fragile.
cmd.Run() is :
func (c *Cmd) Run() error {
if err := c.Start(); err != nil {
return err
}
return c.Wait()
}
Let's open-code this : govmm calls cmd.Start() and returns the
cmd to virtcontainers which calls cmd.Wait().
If QEMU doesn't start, e.g. missing binary, there won't be any
errors to collect from QEMU output. Just drop these lines in govmm.
Similarily there won't be any log file to read from in virtcontainers.
Drop that as well.
Signed-off-by: Greg Kurz <groug@kaod.org>
Running QEMU daemonized ensures that the QMP socket is ready to
accept connections when LaunchQemu() returns. In order to be
able to run QEMU undaemonized, let's handle that part upfront.
Create a listener socket and connect to it. Pass the listener
to QEMU and pass the connected socket to QMP : this ensures
that we cannot fail to establish QMP connection and that we
can detect if QEMU exits before accepting the connection.
This is basically what libvirt does.
Signed-off-by: Greg Kurz <groug@kaod.org>
QEMU's -qmp option can be passed the file descriptor of a socket that
is already in listening mode. This is done with by passing `fd=XXX`
to `-qmp` instead of a path. Note that these two options are mutually
exclusive : QEMU errors out if both are passed, so we check that as
well in the validation function.
While here add the `path=` stanza in the path based case for clarity.
Signed-off-by: Greg Kurz <groug@kaod.org>
When QEMU is launched daemonized, we have the guarantee that the
QMP socket is available. In order to launch a non-daemonized QEMU,
the QMP connection should be created before QEMU is started in order
to avoid a race. Introduce a variant of QMPStart() that can use such
an existing connection.
Signed-off-by: Greg Kurz <groug@kaod.org>
Use pidfd_open and poll on newer versions of Linux to wait
for the process to exit. For older versions use existing wait logic
Fixes: #5617
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Now we are supporting two runtime/shim, the go version,
and the rust version, for debug purposes, we can
add an identification in the version info
to tell us which runtime/shim is used.
Fixes: #5806
Signed-off-by: Bin Liu <bin@hyper.sh>
Pass SELinux policy for containers to the agent if `disable_guest_selinux`
is set to `false` in the runtime configuration. The `container_t` type
is applied to the container process inside the guest by default.
Users can also set a custom SELinux policy to the container process using
`guest_selinux_label` in the runtime configuration. This will be an
alternative configuration of Kubernetes' security context for SELinux
because users cannot specify the policy in Kata through Kubernetes's security
context. To apply SELinux policy to the container, the guest rootfs must
be CentOS that is created and built with `SELINUX=yes`.
Fixes: #4812
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
We have starting to use golang 1.19, some features are
not supported later, so run `go fix` to fix them.
Fixes: #5750
Signed-off-by: Bin Liu <bin@hyper.sh>
Use MkdirAll instead of Mkdir so it doesn't generate an
error when the folder is created by another process
Fixes#5713
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
When the user tried to add new devices to the VM, there is no error info for the invalid
device. This PR adds a log record to the `appendDevices` for the invalid device of the
qemu config.
Fixes: #5719
Signed-off-by: wangyongchao.bj <wangyongchao.bj@inspur.com>
Let's follow the binary bump used in the CI and also bump the vendored
version of containerd to v1.6.8.
Fixes: #5722
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The default vhost-user-fs queue-size of qemu is 128 now. Set it to 1024
by default which is same as clh. Also make this value configurable.
Fixes: #5694
Signed-off-by: liyuxuan.darfux <liyuxuan.darfux@bytedance.com>
This patch re-generates the client code for Cloud Hypervisor v28.0.
Note: The client code of cloud-hypervisor's OpenAPI is automatically
generated by openapi-generator.
Fixes: #5683
Signed-off-by: Bo Chen <chen.bo@intel.com>
It seems that bumping the version of golang and golangci-lint new format
changes are required.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
The package has been deprecated as part of 1.16 and the same
functionality is now provided by either the io or the os package.
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
So that we get the latest language fixes.
There is little use to maitain compiler backward compatibility.
Let's just set the default golang version to the latest 1.19.2.
Fixes: #5494
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Through proactively checking if Cloud Hypervisor process is dead,
this patch provides a faster path for isClhRunning
Fixes: #5623
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Use atomic operations instead of acquiring a mutex in isClhRunning.
This stops isClhRunning from generating a deadlock by trying to
reacquire an already-acquired lock when called via StopVM->terminate.
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Avoid executing StopVM concurrently when virtiofs dies as a result of clh
being stopped in StopVM.
Fixes: #5622
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
1. Implemented a rust module for operating cgroups through systemd with the help of zbus (src/agent/rustjail/src/cgroups/systemd).
2. Add support for optional cgroup configuration through fs and systemd at agent (src/agent/rustjail/src/container.rs).
3. Described the usage and supported properties of the agent systemd cgroup (docs/design/agent-systemd-cgroup.md).
Fixes: #4336
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
An API change, done a long time ago, has been exposed on Cloud
Hypervisor and we should update it on the Kata Containers side to ensure
it doesn't affect Cloud Hypervisor CI and because the change is needed
for an upcoming work to get QAT working with Cloud Hypervisor.
Fixes: #5492
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Currently ACRN hypervisor support in Kata2.x releases is broken.
This commit re-enables ACRN hypervisor support and also refactors
the code so as to remove dependency on Sandbox.
Fixes#3027
Signed-off-by: Vijay Dhanraj <vijay.dhanraj@intel.com>
The containerd stats method and metrics API are broken with Kata 2.5.x, the stats fail to load and the metrics API responds with status code 500
This seems to be down to the conversion from the stats reported by the agent RPC `StatsContainer` where the field `Pagesize` is not
completed by the `setHugetlbStats` method. In the case where multiple sized tables stats are reported, this causes containerd to register two metrics
with the same label set, rather than each being partitioned by the `page` label.
Fixes: #5316
Signed-off-by: Champ-Goblem <cameron@northflank.com>
The new way to boot from TDX firmware (e.g. td-shim) is using the
combination of '--platform tdx=on' with '--firmware tdshim'.
Fixes: #5309
Signed-off-by: Bo Chen <chen.bo@intel.com>