Commit Graph

727 Commits

Author SHA1 Message Date
Tim Zhang
91c6ba74fa Merge pull request #1225 from Tim-Zhang/update-cgroup-to-0.2.0
agent: upgrade cgroups to 0.2.0
2021-01-05 19:50:05 +08:00
Tim Zhang
d4cd255485 agent: Avoid container stats panic caused by cgroup controller non-exist
Return SingularPtrField::none() instead of panic when getting stats
from cgroup failed caused by cgroup controller missing.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-01-05 11:35:41 +08:00
Tim Zhang
157e055fdd agent: upgrade crate cgroups to 0.2.0
Fixes: #1224

35ecd6f (origin/change-name, change-name) Update readme
eb6577e Change package name to cgroups-rs
8f6a7e0 Merge pull request #19 from Tim-Zhang/0.2.0
9baa065 (origin/0.2.0, 0.2.0) release: v0.2.0
e160df0 Make read_i64_from private and merge read_str_from to its caller
e1e05d3 Make new_with_relative_paths=new and load_with_relative_paths=new in v2
a89f4a0 Support set notify_on_release & release_agent
61a0957 Fix set_swappiness in cgroup v2
0592045 Ignore kmem in cgroup v2
c254fff Update readme
438d774 Fix test
42ee1ba Make Cgroup can be stored in struct
b6bb5ae docs: Hide Re-exports
d2882b1 Print cause when println!("{}")
abcb5ed Add more logs for create_dir error in controller.create
1f188be Detect subsystems and get root from /proc/self/mountinfo
fbd7164 Fix warnings in tests
f342254 Remove Box wrap of Cgroup.hire
cd998f3 Do not place cgroup under relative path read from cgroup by default
1ac76b6 Make function find_v1_mount pub
121f78d Expose deletion error
0f76570 Avoid exception caused by cgroup writeback feature
10650e2 Update tests to adapt new type of fields in resource
567cdb4 Use Option as resource fields, remove the update switch: update_values
0c18b08 Support customized attributes for CpuController and MemController
ca610bb add add_task_by_tgid

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-01-05 11:35:34 +08:00
David Gibson
e3ec1d509e agent: Simplify .or_else() to .or()
get_bool_value() in src/agent/src/config.rs includes a Result::or_else()
call with a trivial closure which can be replaced by a Result::or.  This
removes a clippy warning.

fixes #1201

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-01-05 12:54:21 +11:00
David Gibson
e9e39fd081 Merge pull request #1207 from dgibson/bug1206
Fix error reporting in listInterfaces() and listRoutes()
2021-01-05 12:02:07 +11:00
Liu Jiang
b366af9358 jail: add more test cases for validator
Fixes: #1214

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2020-12-24 20:17:06 +08:00
Liu Jiang
d38a5d3fcf jail/validator: introduce helpers to reduce duplicated code
Fixes: #1214

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2020-12-24 19:02:31 +08:00
Liu Jiang
76ad32136f jail/validator: avoid unwrap() for safety
Explicitly return error codes instead of unwrap().

Fixes: #1214

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2020-12-24 19:02:13 +08:00
Liu Jiang
51fd624f3e rustjail: add more context info for errors
Fixes: #1214

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2020-12-24 17:47:58 +08:00
Peng Tao
f1b3f2e178 Merge pull request #1150 from fidencio/wip/make-install-breaks
Add void "install" targets for both "trace-forwarder" and "agent-ctl"
2020-12-23 18:41:42 +08:00
Peng Tao
109ab54d63 Merge pull request #1212 from jiangliu/typo
oci: fix a typo in "addtionalGids"
2020-12-23 18:03:26 +08:00
Bin Liu
8d6096210e Merge pull request #1186 from maruthgoyal/2.0-dev
Don't update cpusets if no CPUs changed closes #1172
2020-12-23 10:05:59 +08:00
Liu Jiang
9321e1b21b oci: fix two incompatible issues with OCI spec
The first incompatible issue is caused by a typo, "swapiness" should
be "swappiness". The second incompatible issue is caused by a serde
format. The struct LinuxBlockIODevice is introduced for convenience,
but it also changes serialized data, so "#[serde(flatten)]" should
be used for compatibility with OCI spec.

Fixes: #1211

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2020-12-22 11:16:15 +08:00
Liu Jiang
406a91ffdd agent: consume ttrpc crate from crates.io
The ttrpc v0.3.0 has been published to crates.io, so consume from
crates.io.

Fixes: #1213

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2020-12-22 09:46:41 +08:00
Liu Jiang
6181570ccc oci: fix a typo in "addtionalGids"
There's a typo in "addtionalGids", which should be "additionalGids".

Fixes: #1211

Signed-off-by: Liu Jiang <gerry@linux.alibaba.com>
2020-12-22 00:03:27 +08:00
Maruth Goyal
4af5beda35 agent/sandbox: Don't update cpuset when ncpus = 0
When receiving an OnlineCpuMemory RPC, if the number of CPUs to be
made available is 0, then updating the cpusets is a redundant operation.

Fixes: #1172

Signed-off-by: Maruth Goyal <maruthgoyal@gmail.com>
2020-12-18 18:11:16 +05:30
David Gibson
e004616b02 runtime/network: Fix error reporting in listRoutes()
If the upcast from resultingRoutes to *grpc.IRoutes fails, we return
(nil, err), but previous code ensures that err is nil at that point, so we
return no error.

fixes #1206

Forward port of
0ffaeeb5d8

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-18 14:36:09 +11:00
David Gibson
1ae8e81abb runtime/network: Correct error reporting in listInterfaces()
If the upcast from resultingInterfaces to *grpc.Interfaces fails, we
return (nil, err), but previous code ensures that err is nil at that
point, so we return no error.

Forward port of
b86e904c2d

fixes #1206

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-18 14:35:50 +11:00
Bin Liu
caa6965c17 Merge pull request #1183 from wainersm/runtime_destdir
runtime: Allow to overwrite DESTDIR
2020-12-17 14:10:56 +08:00
Bin Liu
3b87d10d79 Merge pull request #1191 from mxpv/fd
Don't leak fd when reseeding rng
2020-12-17 12:50:55 +08:00
David Gibson
a19263e58d agent/protocols: Remove unneeded import from oci.proto
oci.proto imports "google/protobuf/wrappers.proto", but doesn't appear to
use it, which causes a warning from protoc when we compile it.  Remove the
import to fix the warning.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-17 13:06:41 +11:00
David Gibson
a19cf28c26 agent/protocols: Remove some unnecessary include directives from protoc
The generate_go_sources() function in update-generate-proto.sh adds a
number of include directives to the protoc command line.  Some of these
don't appear to be necessary to correctly compile the agent's protocol
files, so remove them.

Amongst other things were directives pointing at the old Kata1 runtime and
agent repositories.  Those ones could be actively harmful by causing odd
dependencies of the Kata2 build on the Kata1 repositories.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-16 12:10:27 +11:00
David Gibson
2b4520904c agent/protocols: Remove some unneeded dependencies for protocol generation
src/agent/protocols/hack/update-generated-proto.sh checks for the presence
of protoc-gen-rust and ttrpc_rust_plugin, but it doesn't actually need
them.  Those tools are needed to generate Rust code from the gRPC proto
files, but that's already handled in src/agent/protocols/build.rs using
Cargo for dependency management.

This script is only needed for the Go code, for which the other tools are
sufficient.

fixes #1198

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2020-12-16 12:10:27 +11:00
Maksym Pavlenko
3db1c8059d agent: Don't leak fd when reseeding rng
This PR wraps fd raw descriptor with File, so it'll be properly closed once exited.

Fixes: #1192

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2020-12-11 16:18:41 -08:00
Wainer dos Santos Moschetta
10e9bfc6f7 runtime: Allow to overwrite DESTDIR
On runtime/Makefile the value of DESTDIR is set to "/", unless one
pass that variable as an argument to `make`. This change will
allow its overwrite if DESTDIR is exported in the environment as
well.

Fixes #1182

Signed-off-by: Wainer dos Santos Moschetta <wainersm@redhat.com>
2020-12-09 09:04:04 -05:00
Peng Tao
e167bf30e3 Merge pull request #1165 from liubin/fix/exec-hang-when-bg-process-running
agent: exit from exec hangs if background process is present
2020-12-08 20:32:23 +08:00
bin liu
1ca415d87e agent: exit from exec hangs if background process is present
This is the Rust porting of https://github.com/kata-containers/agent/pull/371

`read_stdout`/`read_stderr` is blocking rpc calls, if exec process
exited, these calls is on blocking state for reading on process's
term master fd, and can't get a chance to break the wait.

In this PR, `read_stdout`/`read_stderr` will not read directly from
a term master of a process, instead, it will first have to get
an fd to read from newly added `epoller.poll()`. `epoller.poll()` may returns:

- the term master fd of exec process, if the process is running.
- a fd(piped fd) will return EOF when reading to indicate that th process is exited.

Fixes: #1160

Signed-off-by: bin liu <bin@hyper.sh>
2020-12-07 10:52:44 +08:00
Peng Tao
4bca7312c7 Merge pull request #1158 from liubin/fix/1156-fix-cpuset
handle vcpus properly utilized in the guest
2020-12-04 22:32:15 +08:00
Fabiano Fidêncio
f7383ef835 Merge pull request #1166 from cmaf/fix-ctx-port
shimv2: handle ctx passed by containerd
2020-12-03 19:45:52 +01:00
Bin Liu
4e0a7e31f9 Merge pull request #1103 from likebreath/1111/clh_fix_cleanupVM
runtime: clh: Enforce to call 'cleanupVM' for 'stopSandbox'
2020-12-03 17:34:26 +08:00
Chelsea Mafrica
0155fe1260 shimv2: handle ctx passed by containerd
Sometimes shim process cannot be shutdown because of container list
is not empty. This container list is written in shim service, while
creating container. We find that if containerd cancel its Create
Container Request due to timeout, but runtime didn't handle it properly
and continue creating action, then this container cannot be deleted at
all. So we should make sure the ctx passed to Create Service rpc call
is effective.

Fixes #1088

Signed-off-by: Yves Chan <shanks.cyp@gmail.com>
Signed-off-by: Chelsea Mafrica <chelsea.e.mafrica@intel.com>
2020-12-02 14:28:31 -08:00
Archana Shinde
f96cdc1a67 Merge pull request #1114 from c3d/bug/1111-agent-oom-killer
agent: Adjust OOM Score to avoid agent being killed.
2020-12-02 11:40:35 -08:00
bin liu
a793b8d90d agent: update cpuset of container path
After cpu hot-plugged is available, cpuset for containers will be written into
cgroup files recursively, the paths should include container's cgroup path, and up
to root path of cgroup filesystem.

Fixes: #1156, #1159

Signed-off-by: bin liu <bin@hyper.sh>
2020-12-02 10:38:26 +08:00
bin liu
705182d04e agent: ignore updating cpuset error when update cgroups
The result of `cpuset_controller.set_cpus(&cpu.cpus)` is unwrapped,
this will lead creating container to fail if cpuset is set.

The sandbox's `CreateContainer` sequence is:

c, err := newContainer(s, &contConfig)
err = c.create()
  c.sandbox.agent.createContainer(c.sandbox, c) (1)
err = s.updateResources()
  oldCPUs, newCPUs, err := s.hypervisor.resizeVCPUs(sandboxVCPUs) (2)

cpuset only avaiable after `s.hypervisor.resizeVCPUs` has been called at (2),
and then cpuset is written to cgourps file.

Fixes: #1159

Signed-off-by: bin liu <bin@hyper.sh>
2020-12-02 10:38:16 +08:00
Bo Chen
647331ace6 runtime: clh: Enforce to call 'cleanupVM' for 'stopSandbox'
We should always cleanup the vm directory when doing `stopSandbox`,
while we are skipping the cleanup process on some error code paths when
using cloud-hypervisor driver.

Fixes: #1098

Signed-off-by: Bo Chen <chen.bo@intel.com>
2020-12-01 17:27:44 -08:00
Fabiano Fidêncio
5e407758f6 trace-forwarder: Add void "install" target
Otherwise `make install` run from the top directory would just fail as
the target is not defined.

Fixes: #1149

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2020-11-27 15:26:23 +01:00
Julio Montes
70f198d78e cli: check modules and permissions before loading a module
Before loading a module, the check subcommand should check if the
current user can load it.

fixes #3085

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-11-26 11:55:42 -06:00
Julio Montes
cb684cf8ea cli: don't fail if rate limit is exceeded
Don't fail if rate limit is exceeded since this is a
limitation/restriction of Github not a problem in the host.
Print a warning when the rate limit is exceeded.

For more information about Github's rate limit, see
https://developer.github.com/v3/#rate-limiting

Signed-off-by: Julio Montes <julio.montes@intel.com>
2020-11-26 11:50:14 -06:00
Bin Liu
b8716d8eec Merge pull request #1141 from lifupan/fix_thread_spwan
rustjail: fork a new child process to change the pid ns
2020-11-25 15:20:36 +08:00
fupan.lfp
9216f2ad63 rustjail: fork a new child process to change the pid ns
The main process do unshare pid namespace, the process
couldn't spawn new thread, in order to avoid this issue,
fork a new child process and do the pid namespace unshare
in the new temporary process.

Fixes: #1140

Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
2020-11-23 17:57:33 +08:00
fupan.lfp
3b08376c4e rustjail: remove the network ns validation against container
Since kata containers shared the network ns with
the guest system, thus there's no need to do the
network ns check.

Fixes: #1047

Signed-off-by: fupan.lfp <fupan.lfp@antfin.com>
2020-11-23 14:41:22 +08:00
James O. D. Hunt
7c12c5481e Merge pull request #1128 from liubin/fix/1127-delete-wait
runtime: don't wait the second shim process in shim start
2020-11-18 14:19:11 +00:00
Julio Montes
f00655a40f Merge pull request #1060 from jongwu/rootbus
agent: create pci root Bus Path for arm64
2020-11-18 08:13:30 -06:00
Julio Montes
e411ebc779 Merge pull request #1126 from liubin/fix/1125-enable-lto
agent: enable lto flag for Cargo to get better optimized code
2020-11-18 08:07:58 -06:00
bin liu
c388ec5bef runtime: don't wait the second shim process in shim start
In first shim v2 startup(with `start` command-line option), it will start
the second shim v2 process running as ttrpc server, there is no needs to
wait the second process, because the current shim v2 process will exit immediately.

Fixes: #1127

Signed-off-by: bin liu <bin@hyper.sh>
2020-11-18 17:18:59 +08:00
bin liu
d6acc4c09c agent: enable lto flag for Cargo to get better optimized code
The lto setting controls the -C lto flag which controls LLVM's link time optimizations.
LTO can produce better optimized code, using whole-program analysis,
at the cost of longer linking time.

https://doc.rust-lang.org/cargo/reference/profiles.html#lto

Fixes: #1125

Signed-off-by: bin liu <bin@hyper.sh>
2020-11-18 15:50:27 +08:00
Julio Montes
1dd77e204f Merge pull request #1120 from liubin/fix/1119-revert-cleanupcontainer-api
virtcontainers: revert CleanupContainer from PR 1079
2020-11-17 09:11:29 -06:00
bin liu
fdbf7d3222 virtcontainers: revert CleanupContainer from PR 1079
In PR 1079, CleanupContainer's parameter of sandboxID is changed to VCSandbox, but at cleanup,
there is no VCSandbox is constructed, we should load it from disk by loadSandboxConfig() in
persist.go. This commit reverts parts of #1079

Fixes: #1119

Signed-off-by: bin liu <bin@hyper.sh>
2020-11-17 10:31:33 +08:00
James O. D. Hunt
91a390f072 docs: Create hypervisor summary document
Split some of the core hypervisor details out of the virtualisation
document and present in a simpler fashion for new users.

Fixes: #1063.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2020-11-16 11:52:40 +00:00
Christophe de Dinechin
53b5d063e9 agent: Adjust OOM Score to avoid agent being killed.
Under stress, the agent can be OOM-killed, which exists the sandbox.
One possible hard-to-diagnose manifestation is a virtiofsd crash.

Fixes: #1111

Reported-by: Qian Cai <caiqian@redhat.com>
Signed-off-by: Christophe de Dinechin <dinechin@redhat.com>
2020-11-13 11:10:19 +01:00