DefaultMaxVCPUs may be larger than the defaultMaxQemuVCPUs that should
be checked and avoided.
Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The physical current vcpu number should not be used directly as the
largest vcpu number is limited to defaultMaxQemuVCPUs.
Here, a new helper is introduced in pkg/katautils/config.go to get
current vcpu number.
Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The current kernel version parse lib can't process suffix '+', as the
modified kernel version will add '+' as suffix, thus panic will occur.
For example, if the current kernel version is "5.14.0-rc4+", test
TestHostNetworkingRequested will panic:
--- FAIL: TestHostNetworkingRequested (0.00s)
panic: &{DistroName:ubuntu DistroVersion:18.04
KernelVersion:5.11.0-rc3+ Issue: Passed:[] Failed:[] Debug:true
ActualEUID:0}: failed to check test constraints: error: Build meta data
is empty
Here, remove the suffix '+' in kernel version fix helper.
Fixes: #2809
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
The 'quiet' kernel parameter can avoid guest kernel logs while booting,
which can reduce boot time.
Fix: #2820
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 7b2bfd4eca)
We will need to have console output from the guest only for debugging
purposes. As a result, we can turn-off both the serial and
virtio-console devices by default for better boot time.
Fixes: #2820
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 3e24e46c70)
When updating the guest routing table, we should forward the IP family
information up to the guest.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
(cherry picked from commit 71ce6cfe9e)
Our check for the IP family is working as long as we have either a
gateway or a destination IP. Some routes are missing both.
The RT netlink messages provide the IP family information for each
route, so we can carry that piece of information up to the guest. That
will allow for a more reliable route IP family determination.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
(cherry picked from commit 99450bd1f7)
We need to be able to get the IP family from the netlink route meesages,
and the Route.Family field only got recently added to the netlink
package.
The update generates static check warnings about the call for
nethandler.Delete() being deprecated in favor of a Close() call instead.
So we include the s/Delete()/Close()/ change as part of this PR.
Signed-off-by: Samuel Ortiz <s.ortiz@apple.com>
(cherry picked from commit f85fe70231)
Even CCA, which is the confidential compute archtecture, has not been
ready, add a empty implementation to avoid static check error.
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Suggested-by: Fabiano Fidêncio <fidencio@redhat.com>
Bump containerd to v1.5.7 in order to bring in a fix for CVE-2021-41103,
"insufficiently restricted permissions on plugins directories
(GHSA-c2h3-6mxw-7mvq)".
dependabot found a potential security vulnerability and raised a PR to
fix it. However, dependabot does not properly follows nor understands
the needed of our CIs (mainly related to formatting the PR and whatnot),
thus I'm re-raising it.
Fixes: #2796
Backports: #2797
Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
Since we now have "unix://" kind of socket returned by the
SocketAddress() function, there is no more need to build the sandbox
storage path dynamically to keep OS compatibility.
Fixes: #2738
Suggested-by: Christophe de Dinechin <dinechin@redhat.com>
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit 2304a59601)
We now support any container engine CRI compliant. Let's bump the
kata-monitor version to 0.2.0.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit 8b0bc1f45e)
This commit stops the container engine polling in favor of
the kata sandbox storage path monitoring.
The pod cache list is now refreshed based on fs events and synced with
the container engine only when needed.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit bfb556d56a)
When the container engine is different than containerd or CRI-O we
lack proper detection of kata workloads and consider all the pods as
kata ones.
Instead of querying the container engine for the lower level runtime
used in each pod, check if a directory matching the pod exists in
the virtualcontainers sandboxes storage path.
This provides a container engine independent way to check for kata pods.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit 0e854f3b80)
Retrieve the absolute sandbox storage path. We will soon need this to
monitor the creation/deletion of new kata sandboxes.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit afad910d0e)
The storage path we use to collect the sandbox files is defined in the
virtcontainers/persist/fs package.
We create the runtime socket in that storage path, by hardcoding the
full path in the SocketAddress() function in the runtime package.
This commit splits the hardcoded path by the socket address path so that
the runtime package will be able to provide the storage path to all the
components that may need it.
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit e38686f74d)
In order to retrieve the list of sandboxes, we poll the container engine
every 15 seconds via the CRI. Once we have the list we have to inspect
each pod to find out the kata ones.
This commit extend the sandbox cache to keep track of all the pods,
marking the kata ones, so that during the next polling only the new
sandboxes should be inspected to figure out which ones are using the
kata runtime.
Fixes: #2563
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit 245a12bbb7)
this is an unexpected event (likely a change in how containerd/cri-o
record the lower level runtime in the pod) and should be more visible:
raise the log level to "warning".
Signed-off-by: Francesco Giudici <fgiudici@redhat.com>
(cherry picked from commit fc067d61d4)
add a comment to explicitly mentioned method is a binary call
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
Backport from commit 72e3538e36
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
A discussion on the Linux kernel mailing list [1] exposed that virtiofsd makes a
core assumption that the file systems being shared are not accessible by any
non-privileged user. We currently create the `shared` directory in the sandbox
with the default `0750` permissions, which gives read and directory traversal
access to the group. There is no real good reason for a non-root user to access
the shared directory, and this is potentially dangerous.
Fixes: #2589
[1]: https://lore.kernel.org/linux-fsdevel/YTI+k29AoeGdX13Q@redhat.com/
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
In getThreadIDs(), the cpuID variable is derived from a string that
already contains a whitespace. As a result, strings.SplitAfter returns
the cpuID with a leading space. This makes any go variant of string to int
fail (strconv.ParseInt() in our case). This patch makes sure that the
leading space character is removed so the string passed to
strconv.ParseInt() is "CPUID" and not " CPUID".
This has been caused by a change in the naming scheme of vcpu threads
for Firecracker after v0.19.1.
Fixes: #2592
Signed-off-by: Anastassios Nanos <ananos@nubificus.co.uk>
Change logger in Trace call in newContainer from sandbox.Logger() to
nil. Passing nil will cause an error to be logged by kataTraceLogger
instead of the sandbox logger, which will avoid having the log message
report it as part of the sandbox subsystem when it is part of the
container subsystem.
The kataTraceLogger will not log it as related to the container
subsystem, but since the container logger has not been created at this
point, and we already use the kataTraceLogger in other instances where a
subsystem's logger has not been created yet, this PR makes the call
consistent with other code.
Backport of #2666Fixes#2667
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
Call StopTracing with s.rootCtx, which is the root context for tracing,
instead of s.ctx, which is parent to a subset of trace spans.
Backport of #2662Fixes#2663
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
If the device has no permission, such as /dev/null, /dev/urandom,
it needs to be added into cgroup.
Fixes: #2615
Backport: #2616
Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
adds the default devices for unix such as /dev/null, /dev/urandom to
the container's resource cgroup spec
Fixes: #2539
Backports: #2603
Signed-off-by: Binbin Zhang <binbin36520@gmail.com>
Two default values defined in the 'cloud-hypervisor.yaml' have typo, and this
patch manually overwrites them with the correct value as a workaround
before the corresponding fix is landed to Cloud Hypervisor upstream.
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit 932ee41b3f)
This patch fixes the unit tests over clh.go with the updated client code.
Signed-off-by: Bo Chen <chen.bo@intel.com>
(cherry picked from commit bff38e4f4d)