This patch allows copying of directories and symlinks when
static file copying is used between host and guest. This change is
necessary to support recursive file copying between shim and agent.
Signed-off-by: Yohei Ueda <yohei@jp.ibm.com>
(cherry picked from commit de232b8030)
For remote hypervisor, the configmap, secrets, downward-api or project-volumes are
copied from host to guest. This patch watches for changes to the host files
and copies the changes to the guest.
Note that configmap updates takes significantly longer than updates via downward-api.
This is similar across runc and Kata runtimes.
Fixes: #7210
Signed-off-by: Pradipta Banerjee <pradipta.banerjee@gmail.com>
Signed-off-by: Julien Ropé <jrope@redhat.com>
(cherry picked from commit 3081cd5f8e)
(cherry picked from commit 68ec673bc4d9cd853eee51b21a0e91fcec149aad)
This patch upgrades Firecracker version from v1.1.0 to v1.4.0.
* Generate swagger models for v1.4.0 (from `firecracker.yaml`)
- The version of go-swagger used is v0.30.0
* The firecracker v1.4.0 includes the following changes.
- Added
* Added support for custom CPU templates allowing users to adjust vCPU features
exposed to the guest via CPUID, MSRs and ARM registers.
* Introduced V1N1 static CPU template for ARM to represent Neoverse V1 CPU
as Neoverse N1.
* Added support for the virtio-rng entropy device. The device is optional. A
single device can be enabled per VM using the /entropy endpoint.
* Added a cpu-template-helper tool for assisting with creating and managing
custom CPU templates.
- Changed
* Set FDP_EXCPTN_ONLY bit (CPUID.7h.0:EBX[6]) and ZERO_FCS_FDS bit
(CPUID.7h.0:EBX[13]) in Intel's CPUID normalization process.
- Fixed
* Fixed feature flags in T2S CPU template on Intel Ice Lake.
* Fixed CPUID leaf 0xb to be exposed to guests running on AMD host.
* Fixed a performance regression in the jailer logic for closing open file
descriptors.
* A race condition that has been identified between the API thread and the VMM
thread due to a misconfiguration of the api_event_fd.
* Fixed CPUID leaf 0x1 to disable perfmon and debug feature on x86 host.
* Fixed passing through cache information from host in CPUID leaf 0x80000006.
* Fixed the T2S CPU template to set the RRSBA bit of the IA32_ARCH_CAPABILITIES
MSR to 1 in accordance with an Intel microcode update.
* Fixed the T2CL CPU template to pass through the RSBA and RRSBA bits of the
IA32_ARCH_CAPABILITIES MSR from the host in accordance with an Intel microcode
update.
* Fixed passing through cache information from host in CPUID leaf 0x80000005.
* Fixed the T2A CPU template to disable SVM (nested virtualization).
* Fixed the T2A CPU template to set EferLmsleUnsupported bit
(CPUID.80000008h:EBX[20]), which indicates that EFER[LMSLE] is not supported.
Fixes: #7610
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Since the passed fd through unix socket would be any
stream fd such as pipe/fifo fd or any other socket
fd, thus we should deal with it as a normal hybrid
stream instead of a unix stream.
Fixes:#7584
Signed-off-by: Fupan Li <fupan.lfp@antgroup.com>
There are many places where the code currently creates new `Vec`
instances when it's not really needed. The result is a perf hit because
it allocates memory, copies all elements, then frees the memory; in some
cases, copying elements also involves extra allocations (e.g., when
elements are strings, or structs containing strings).
This patch addresses a number of these cases.
Fixes: #7203
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Refine implementation of mount by:
- log message with `path.display()` instead of `{:?}`
- add prefix "_" to unused variables
- pass by reference instead of by value to avoid creating redundant
array
- exactly matching prefix "fsgid=" instead of "fsgid"
- avoid redundant clone() operations
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
There's a bug in function update_ephemeral_mounts() which only handles
the first storage object and ignores all other storage objects.
Fixes: #7551
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Simplify function online_cpu_memory() by on calling update_cpuset_path()
for containers with cpuset configured.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Refine style of code related to sandbox by:
- remove unnecessary comments for caller to take lock, we have already taken
`&mut self`.
- change "*count < 1 " to "*count == 0", `count` is type of u32.
- make remove_sandbox_storage() to take `&mut self` instead of `&self`.
- group related function to each others
- avoid search the map twice in function find_process()
- avoid unwrap() in function run_oom_event_monitor()
- avoid unwrap() in online_resources()
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Avoid unwrap() in function do_remove_container(), and also make
implmementation symmetric for both timeout and non-timeout cases.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Optimize agent rpc implementation by:
- avoid clone objects when possible
- avoid unwrap() when possible
- explictly drop object to ensure order
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
This pull request is mainly for updating vm-memory and vmm-sys-util.
The affacted crates include:
- vm-memory: from 0.9.0 to 0.10.0
- vmm-sys-util: from 0.10.0 to 0.11.0
- virtio-queue: from 0.6.0 to 0.7.0
- fuse-backend-rs: from 0.10.4 to 0.10.5
- linux-loader: from 0.6.0 to 0.8.0
- nydus-api: from 0.3.0 to 0.3.1
- nydus-rafs: from 0.3.1 to 0.3.2
- nydus-storage: from 0.6.3 to 0.6.4
Fixes: #0000
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
These calls cause two extra atomic instructions each time they're used,
one to increment and another one to decrement the refcount.
Since we don't need them because the referred value is guaranteed to
outlive the function, remove the calls.
Fixes: #7190
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
When the mounted block device isn't a layer, we want to mount it into
containers, but since it's already mounted with the correct fs (e.g.,
tar, ext4, etc.) in the pod, we just bind-mount it into the container.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
When at least one `io.katacontainers.fs-opt.layer` option is added to
the rootfs, it gets inserted into the VM as a layer, and the file system
is mounted as an overlay of all layers using the overlayfs driver.
Additionally, if the `io.katacontainers.fs-opt.block_device=file` option
is present in a layer, it is mounted as a block device backed by a file
on the host.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This causes the overlay-fs driver to add the `upperdir` and `workdir`
options to an overlay-fs mount so that the mount becomes writable using
a discardable directory under the container id.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
This is so that file systems don't fail when we pass kata-specific
options from the snapshotter to kata.
Fixes: #7536
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Version 0.10.5, which was just released, breaks `nydus-storage`.
This is a workaround to fix the CI which is blocking other PRs.
Fixes: #7541
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Allow `clippy::redundant_clone` in the agent's unit tests
because rustc>=1.70 shows the errors as false-negatives.
These `clone()` are required because the following codes
refer to the variable, but the clippy analyzes them by mistake,
using the conservative and limited approach.
Ref. https://rust-lang.github.io/rust-clippy/master/index.html#/redundant_cloneFixes: #7534
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Kata containers as VM-based containers are allowed to run in the host
netns. That is, the network is able to isolate in the L2. The network
performance will benefit from this architecture, which eliminates as many
hops as possible. We called it a Directly Attachable Network (DAN for
short).
The network devices are placed at the host netns by the CNI plugins. The
configs are saved at {dan_conf}/{sandbox_id}.json in the format of JSON,
including device name, type, and network info. At the very beginning stage,
the DAN only supports host tap devices. More devices, like the DPDK, will
be supported in later versions.
The format of file looks like as below:
```json
{
"netns": "/path/to/netns",
"devices": [{
"name": "eth0",
"guest_mac": "xx:xx:xx:xx:xx",
"device": {
"type": "vhost-user",
"path": "/tmp/test",
"queue_num": 1,
"queue_size": 1
},
"network_info": {
"interface": {
"ip_addresses": ["192.168.0.1/24"],
"mtu": 1500,
"ntype": "tuntap",
"flags": 0
},
"routes": [{
"dest": "172.18.0.0/16",
"source": "172.18.0.1",
"gateway": "172.18.31.1",
"scope": 0,
"flags": 0
}],
"neighbors": [{
"ip_address": "192.168.0.3/16",
"device": "",
"state": 0,
"flags": 0,
"hardware_addr": "xx:xx:xx:xx:xx"
}]
}
}]
}
```
Fixes: #1922
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
If modeVFIO is enabled we need 1st to attach the VFIO control group
device /dev/vfio/vfio an 2nd the actuall device(s) afterwards.Sort the
devices starting with device #1 being the VFIO control group device and
the next the actuall device(s)
/dev/vfio/<group>
Fixes: #7493
Signed-off-by: Zvonko Kaiser <zkaiser@nvidia.com>
Multiple instances of task service may get registered by
ServiceManager::run(), fix it by making operation symmetric.
Fixes: #7479
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
The previous kata-monitor in golang could not communicate with runtime-rs
to gather metrics due to different sandbox addresses.
This PR adds the subcommand monitor in kata-ctl to gather metrics from
runtime-rs and monitor itself.
Fixes: #5017
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
Several functions in kata-ctl need to establish a connection with runtime-rs through MgmtClient.
This PR provides a global TIMEOUT to avoid multiple definitions.
Fixes: #5017
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
1. Implemented metrics collection for runtime-rs shim and dragonball hypervisor.
2. Described the current supported metrics in runtime-rs.(docs/design/kata-metrics-in-runtime-rs.md)
Fixes: #5017
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>