Strings in Rust don't have \0 at the end, but C does, which leads to `umount2`
in the libc can't get the correct path. Besides, calling `nix::mount::umount2`
to avoid using an unsafe block is a robust solution.
Fixes: #5871
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
Standalone share fs should add virtiofs device in setup_device_before_start_vm
and return the storages to mount the directory in guest. And it uses
hypervisor's jailer root directly instead of jail config.
Besides, we tweaked the parameter, so it adapts to rust version virtiofsd
now. And its cache policy which forbids caching is "never" now, instead of
"none". Hence, we change the default cache mode.
Fixes: #5655
Signed-off-by: Yipeng Yin <yinyipeng@bytedance.com>
Cgroup manager for a container will always be created.
Thus, dropping the option for LinuxContainer.cgroup_manager
is feasible and could simplify the code.
Fixes: #5778
Signed-off-by: Yuan-Zhuo <yuanzhuo0118@outlook.com>
Use pidfd_open and poll on newer versions of Linux to wait
for the process to exit. For older versions use existing wait logic
Fixes: #5617
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
Fixed the issue when using nonblocking, the `tokio::io::copy()` needing
to handle EAGAIN, resulting in high CPU usage.
Fixes: #5740
Signed-off-by: Quanwei Zhou <quanweiZhou@linux.alibaba.com>
Removed the `Debug` trait for the `ShareFs` and etc. Renamed
`ShareFsMount::upgrade()` and `ShareFsMount::downgrade()` to
`upgrade_to_rw()` and `downgrade_to_ro()`. Protected `mounted_info_set`
with a mutex to avoid race conditions.
Fixes: #5588
Signed-off-by: Xuewei Niu <justxuewei@apache.org>
This commit implemented umonut controls and permission controls. When a volume
is no longer referenced, it will be umounted immediately. When a volume mounted
with readonly permission and a new coming container needs readwrite permission,
the volume should be upgraded to readwrite permission. On the contrary, if a
volume with readwrite permission and no container needs readwrite, then the
volume should be downgraded.
Fixes: #5588
Signed-off-by: Xuewei Niu <justxuewei@apache.org>
Implemented bind mount related managment on the sandbox side, involving bind
mount a volume if it's not mounted before, upgrade permission to readwrite if
there is a new container needs.
Fixes: #5588
Signed-off-by: Xuewei Niu <justxuewei@apache.org>
Also added crate `runtime-rs/crates/runtimes` as dependency as it's
immediately depended upon by the `direct-volume` feature, see issue
5341 and PR 5467.
Fixes#5810
Signed-off-by: Tingzhou Yuan <tzyuan15@bu.edu>
Now we are supporting two runtime/shim, the go version,
and the rust version, for debug purposes, we can
add an identification in the version info
to tell us which runtime/shim is used.
Fixes: #5806
Signed-off-by: Bin Liu <bin@hyper.sh>
Pass SELinux policy for containers to the agent if `disable_guest_selinux`
is set to `false` in the runtime configuration. The `container_t` type
is applied to the container process inside the guest by default.
Users can also set a custom SELinux policy to the container process using
`guest_selinux_label` in the runtime configuration. This will be an
alternative configuration of Kubernetes' security context for SELinux
because users cannot specify the policy in Kata through Kubernetes's security
context. To apply SELinux policy to the container, the guest rootfs must
be CentOS that is created and built with `SELINUX=yes`.
Fixes: #4812
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
The kata-agent supports SELinux for containers inside the guest
to comply with the OCI runtime specification.
Fixes: #4812
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
This commit re-implements `start` operation by leveraging the agent codes.
Currently, `runk` has own `start` mechanism even if the agent already
has the feature to handle starting a container. This worsen the maintainability
and `runk` cannot keep up with the changes on the agent side easily.
Hence, `runk` replaces own implementations with agent's ones.
Fixes: #5648
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
For now, we can check if host support running kata by check if "/dev/kvm"
exist on aarch64.
Fixes: #5768
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
When using source code to compile runtime-rs,make the
documentation point out the detailed environment build
and compilation methods to avoid errors caused by related
dependent packages.
Fixes:#5757
Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>
The displayed commit message and version message are partially duplicated.
Remove the version number from the commit display message.
Fixes:#5735
Signed-off-by: Chen Taotao <chentt10@chinatelecom.cn>
Some rootfs put iptables-save and iptables-restore
under /usr/sbin instead of /sbin. This pr checks both
and returns the one exist.
Fixes: #5608
Signed-off-by: Ji-Xinyou <jerryji0414@outlook.com>
We have starting to use golang 1.19, some features are
not supported later, so run `go fix` to fix them.
Fixes: #5750
Signed-off-by: Bin Liu <bin@hyper.sh>
Use MkdirAll instead of Mkdir so it doesn't generate an
error when the folder is created by another process
Fixes#5713
Signed-off-by: Alexandru Matei <alexandru.matei@uipath.com>
As the increase of the I/O intensive tasks, two issues could be caused:
1. When the future is blocked, the current thread (which is in the network namespace)
might be take over by other tasks. After the future is finished, the thread take over
the current task might not be in the pod network namespace
2. When finish setting up the network, the current thread will be set back to the host namsapce.
But the task which be taken over would still stay in the pod network namespace
To avoid that, we need to block the future on the current thread.
Fixes:#5728
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>