Commit Graph

569 Commits

Author SHA1 Message Date
Eric Ernst
71f304ce17 agent: watcher: cleanup mount if needed when container is removed
If a bind mount was created for watchable storage, make sure we remove
when removing a container.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-08-11 08:53:28 -07:00
Samuel Ortiz
f1a505dbfe agent: Temporarily allow unknown linters
Bump thiserror to 1.0.26 for vsock-exporter and work around
a bug in Clippy nonstandard_macro_braces lint.
(See https://github-redirect.dependabot.com/rust-lang/rust-clippy/issues/7422)

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-08-11 08:53:28 -07:00
Eric Ernst
961aaff004 agent: watcher: fixes to make more robust
inotify/watchable-mount changes...

- Allow up to 16 files. It isn't that uncommon to have 3 files in a secret.
In Kubernetes, this results in 9 files in the mount (the presented files,
which are symlinks to the latest files, which are symlinks to actual files
which are in a seperate hidden directoy on the mount). Bumping from eight to 16 will
help ensure we can support "most" secret/tokens, and is still a pretty
small number to scan...

- Now we will only replace the watched storage with a bindmount if we observe
that there are too many files or if its too large. Since the scanning/updating is racy,
we should expect that we'll occassionally run into errors (ie, a file
deleted between scan / update). Rather than stopping and making a bind
mount, continue updating, as the changes will be updated the next time
check is called for that entry (every 2 seconds today).

To facilitate the 'oversized' handling, we create specific errors for too large
or too many files, and handle these specific errors when scanning the storage entry.

- When handling an oversided mount, do not remove the prior files -- we'll just
overwrite them with the bindmount. This'll help avoid the files
disappearing from the user, avoid racy cleanup and simplifies the flow.
Similarly, only mark it as a non-watched storage device after the
bindmount is created successfully.

- When creating bind mount, make sure destination exists. If we hadn't
had a successful scan before, this wouldn't exist and the mount would
fail. Update logic and unit test to cover this.

- In several spots, we were returning when there was an error (both in
scan and update). For update case, let's just log an warning and continue;
since the scan/update is racy, we should expect that we'll have
transient errors which should resolve the next time the watcher runs.

Fixes: #2402

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-08-11 08:52:51 -07:00
Samuel Ortiz
233b53c048 agent: Fix cargo 1.54 clippy warning
Mostly the needless borrow one, plus a few others that are now enforced.

Fixes #2408

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-08-05 18:41:55 +02:00
Samuel Ortiz
49083bfa31 agent: Create the process CWD when it does not exist
Although the OCI specification does not explictly requires that, we
should create the process CWD if it does not exist, before chdir'ing
to it. Without that fizx, the kata-agent fails to create a container
and returns a grpc error when it's trying to change the containerd
working directory to an non existing folder.

runc, the OCI runtime reference implementation, also creates the process
CWD when it's not part of the container rootfs.

Fixes #2374

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-08-01 04:27:03 +02:00
Fupan Li
838e169b9c Merge pull request #2248 from lifupan/check_file_exist
mount: fix the issue of missing check file exists
2021-07-27 23:29:26 +08:00
Peng Tao
fd2607cc43 Merge pull request #2202 from teawater/swap7
Add swap support
2021-07-20 21:12:30 +08:00
Fabiano Fidêncio
75c5edd66a Merge pull request #2263 from eryugey/eryugey/for-main
agent: clear MsFlags if the option has clear flag set
2021-07-20 12:50:45 +02:00
Bin Liu
6b00806bb8 Merge pull request #2243 from egernst/bump-tokio
agent/agent-ctl: update tokio to 1.8.1
2021-07-20 13:56:32 +08:00
Hui Zhu
4f066db8da agent: agent.proto: Add AddSwap
Add new fuction AddSwap.  When agent get AddSwap, it will get the device
name from PCIPath and set the device as the swap device.

Fixes: #2201

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-07-19 23:20:34 +08:00
Bin Liu
b94ebc30b4 Merge pull request #2235 from Tim-Zhang/vsock-exporter-async
vsock-exporter: switch to tokio runtime
2021-07-19 17:06:14 +08:00
Eryu Guan
35cbc93dee agent: clear MsFlags if the option has clear flag set
'FLAGS' hash map has bool to indicate if the flag should be cleared or
not. But in parse_mount_flags_and_options() we set the flag even 'clear'
is true. This results in a 'rw' mount being mounted as 'MS_RDONLY'.

Fixes: #2262
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2021-07-19 11:50:10 +08:00
fupan.lfp
5371b9214f mount: fix the issue of missing check file exists
It's better to check whether the destination file exists
before creating them, if it had been existed, then return
directly.

Fixes: #2247

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
2021-07-15 18:09:33 +08:00
Eric Ernst
acf6932863 agent: update tokio to 1.8.1
Update to latest tokio to address RUSTSEC-2021-0072:
 Task dropped in wrong thread when aborting `LocalSet` task

Update the toml to specify just 1.x for the tokio version.

Fixes: #2165

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-07-14 17:18:21 -07:00
Tim Zhang
73d3798cb1 vsock-exporter: switch to tokio runtime
Make the vsock-exporter async totally using tokio runtime.
And delay the timing of the connection to trace-forwarder so that
it is easy to reconnect when the connection was broken.

Fixes: #2234

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-14 20:16:05 +08:00
Fabiano Fidêncio
a104f13230 agent: Add make vendor
This has a similar intent as the go code, but not totally equal.  For
the go code we want to ensure that the vendored code is up-to-date,
while here we want to ensure that `cargo vendor` actually works.

We happened to release a few tarballs where `cargo vendor` didn't work
and it causes some pain for downstream maintainers.

Related: #2159

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-07-14 13:59:41 +02:00
Tim Zhang
7960689ef7 tracing: replace SimpleSpanProcessor with BatchSpanProcessor
This change make tokio could be use in vsock-exporter.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-14 15:59:52 +08:00
Tim Zhang
6999dccaa8 trace-forwarder: Add option rustflags, target, build-type for the make
Support rust-flags, target and build-type.

Fixes: #2215

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-13 11:35:46 +08:00
David Gibson
1ab72518b3 agent: Fix to parsing of /proc/self/mountinfo
get_mounts() parses /proc/self/mountinfo in order to get the mountpoints
for various cgroup filesystems.  One of the entries in mountinfo is the
"device" for each filesystem, but for virtual filesystems like /proc, /sys
and cgroups, the device entry is arbitrary.  Depending on the exact rootfs
setup, it can end up being "-".

This breaks get_mounts() because it uses " - " as a separator.  There
really is a " - " separator in mountinfo, but in this case the device entry
shows up as a second one.  Fix this, by changing a split to a splitn, which
will effectively only consider the first " - " in the line.

While we're there, make the warning message more useful, by having it
actually show which line it wasn't able to parse.

fixes #2182

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-10 19:30:27 +10:00
Jakob Naucke
8758ce26b7 agent: Enable virtio-blk-ccw
Forward-port of https://github.com/kata-containers/agent/pull/600.
Enable virtio-blk-ccw devices in agent (virtio-blk for s390x, already
enabled in runtime).

Fixes: #2026

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-07-08 14:59:47 +02:00
Bin Liu
26985bbfff Merge pull request #2173 from Tim-Zhang/enhance-test-execute-hook
agent: enhance tests of execute_hook
2021-07-05 14:36:45 +08:00
Fabiano Fidêncio
015b3baf06 Merge pull request #2178 from mxpv/config
agent: Cleanup config
2021-07-03 09:51:16 +02:00
Fupan Li
2de9c5b41d Merge pull request #1969 from liubin/feature/1968-pass-span-context-to-agent
Pass span context from runtime to agent to get a full trace #1968
2021-07-03 09:31:02 +08:00
Maksym Pavlenko
e6b1766f6b agent: Cleanup config
This commit clean up config parsing and testing code to make it a bit more easy to maintain.
- Adds `with_context` from anyhow to include the underlying error. This helps to understand what exactly went wrong.
- Uses ensure and bail as a shorter alternative for `if` checks.
- TestData in test_parse_cmdline is now implements Default to reduce boilerplate code
- Remove `make_err` as it doesn’t make any sense.

Fixes: #2177

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-07-02 14:28:43 -07:00
Tim Zhang
55c5c871d2 agent: enhance tests of execute_hook
Use which to find the full path of exe before run execute_hook
to avoid error: 'No such file or directory'

Fixes: #2172

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-02 14:30:56 +08:00
bin
65d2fb5d11 agent: remove instrument attribute for some simple functions
For some simple functions that only process memory data(list/hashmap),
they don't need to be instrumented.

And sometime they may generate non-parent spans, if they are called from
daemon-style "threads".

Fixes: #1968

Signed-off-by: bin <bin@hyper.sh>
2021-07-02 10:07:28 +08:00
bin
cfb8139f36 agent: add more instruments for RPC calls
All RPC calls can get parent span context,
and create new sub-spans for the full trace.

Fixes: #1968

Signed-off-by: bin <bin@hyper.sh>
2021-07-02 10:07:28 +08:00
Fabiano Fidêncio
3fe0af6a9b Merge pull request #2152 from liubin/fix/2111-update-netlink-libs
agent: update netlink libraries
2021-07-01 12:01:35 +02:00
Fabiano Fidêncio
550029c473 Merge pull request #2060 from liubin/2059/delete-some-lint-attributes
agent: delete some lint attributes
2021-06-30 16:51:07 +02:00
bin
aa264f915f agent: update netlink libraries
Update rtnetlink to use crate.io to make cargo vendor work.
Add vendor/ to .gitignore.

Fixes: #2111

Signed-off-by: bin <bin@hyper.sh>
2021-06-30 22:39:50 +08:00
Fabiano Fidêncio
7d37fbfdfb Merge pull request #2115 from sameo/topic/rust-nix
cargo: Use latest nix crate for all Rust code bases
2021-06-28 08:18:53 +02:00
Samuel Ortiz
f6294226e8 cargo: Use latest nix crate for all Rust code bases
Our dependencies already bring several versions of nix, we should avoid
adding even more fragementation.

Fixes #2114

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-06-25 03:38:37 +02:00
Maksym Pavlenko
6a93e5d593 agent: Initial watchable-bind implementation
Add support for watchable-bind storage driver. When watchable-bind storage
is present, the agent will create a watchable path in a tmpfs, and poll the
watchable-bind source to keep this new mount-point up to date.

This poll will allow the agent to present the mount-point to the
container, allowing for inotify usage by the container workload.

If a mount becomes too large, either in file count or in overall size,
we want to stop treating it as watchable, and instead just treat as a
bindmount. This'll help avoid DoS by growing tmpfs too large, as well
as limiting time spent scanning files. If a watchable-bind grows beyond
8 files (arbitrary sane number for certs/secrets) or 1MB (limit on ConfigMap size),
we treat it as a normal bind.

Fixes: #1879

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>

agent: watcher: SandboxStorages check loop cleanup
2021-06-24 10:07:06 -07:00
bin
000049b69e agent: delete some lint attributes
Thes lint attributes can be deleted to keep clean code.

Fixes: #2059

Signed-off-by: bin <bin@hyper.sh>
2021-06-18 16:08:25 +08:00
snir911
1faaf5f35d Merge pull request #2000 from ManaSugi/update-mount-flags
agent: Add some mount options and sort the options alphabetically
2021-06-17 11:53:11 +03:00
Manabu Sugimoto
bd27f7bab5 agent: Sort PROPAGATION and OPTIONS alphabetically to scan easily
It's hard to visually scan over the list currently.
Therefore, we should sort the list alphabetically to scan easily.

Fixes: #1999

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2021-06-16 17:23:05 +09:00
Tim Zhang
799cb27234 agent: Upgrade mio to v0.7.13 to fix epoll_fd leak problem
Fixes: #2035
Fixes: tokio-rs/tokio/#3809

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-06-15 11:35:49 +08:00
Manabu Sugimoto
e544779c61 agent: Add some mount options
Add the following mount options to catch up with the runtime spec
- silent
- loud
- (no)acl
- (no)iversion
- (no)lazytime

Fixes: #1999

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2021-06-11 15:08:46 +09:00
Manabu Sugimoto
a1247bc0bb agent: Conform to the latest nix version (0.21.0)
We need to fix some agent's code to conform to the latest nix crate
to be able to use new features of the nix.

Fixes: #1987

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2021-06-10 16:58:51 +09:00
Tim Zhang
9e3349c18e agent: Fix fd leak caused by netlink
See also: little-dude/netlink#165

Fixes: #1952

Because the author of netlink has no time to maintain the crate
(https://github.com/little-dude/netlink/issues/161), so we
need to switch the dependency to github temporarily.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-06-03 17:23:37 +08:00
Chelsea Mafrica
33c12b6d08 Merge pull request #1929 from jodh-intel/add-agent-tracing
tracing: Add basic VSOCK tracing
2021-06-02 11:45:41 -07:00
James O. D. Hunt
a9a0eccf33 tracing: Add basic VSOCK tracing
Implement an openTelemetry custom exporter that sends trace spans to a
VSOCK socket. A VSOCK-to-span converter (such as the Kata trace
forwarder) needs to be running on the host to allow systems like Jaeger
to capture the trace spans.

By default, tracing is not enabled (meaning a NOP tracer is used). To
activate tracing, set the `agent.kata.enable_tracing=true` in the
configuration file.

The type of tracing this change introduces is "static isolated"
tracing. See [1] for further details.

> **Note:**
>
> This change only provides the foundational changes for agent
> tracing work. The feature is _not_ yet complete since it does
> not yet show the correct trace hierarchy.

Fixes: #60.

[1] - https://github.com/kata-containers/agent/blob/master/TRACING.md

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-06-02 18:00:05 +01:00
Tim Zhang
9bf781d704 agent: Upgrade tokio-vsock to fix fd leak of vsock socket
Fixes: #1950

The further information: rust-vsock/vsock-rs#15

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-06-02 16:03:09 +08:00
Peng Tao
c455d84571 Merge pull request #1918 from lifupan/main
cgroup: fix the issue of set mem.limit and mem.swap
2021-05-29 10:05:44 +08:00
fupan.lfp
30f4834c5b cgroup: fix the issue of set mem.limit and mem.swap
When update memory limit, we should adapt the write sequence
for memory and swap memory, so it won't fail because
the new value and the old value don't fit kernel's
validation.

Fixes: #1917

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
2021-05-28 15:44:14 +08:00
fupan.lfp
0ae364c8eb agent: re-enable the standard SIGPIPE behavior
The Rust standard library had suppressed the default SIGPIPE
behavior, see https://github.com/rust-lang/rust/pull/13158.
Since the parent's signal handler would be inherited by it's child
process, thus we should re-enable the standard SIGPIPE behavior as a
workaround.

Fixes: #1887

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
2021-05-28 15:25:05 +08:00
James O. D. Hunt
45f02227b2 tracing: Add trace points
Use the tracing crate to create automatic trace spans for the _majority_
of top-level modules.

Note that not all functions in the top-level modules can be traced:

- Some functions cannot be traced due to the requirement that all
  function parameters implement the `Debug` trait. In some cases (such
  as `netlink.rs`), objects are being passed that are defined in
  different crates and which do not implement `Debug`.
- Some functions may never return (`signal.rs`).
- Some functions are inlined.
- Some functions are very simple getter/setter functions.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2021-05-27 10:42:58 +01:00
Fabiano Fidêncio
c3f6c88668 Merge pull request #1915 from quanweiZhou/fix_start_container_failed_when_drop_all_caps
agent: fix start container failed when dropping all capabilities
2021-05-24 14:13:52 +02:00
quanweiZhou
3e4ebe10ac agent: fix start container failed when dropping all capabilities
When starting a container and dropping all capabilities,
the init child process has no permission to read the exec.fifo
file because the parent set the file mode 0o622. So change the exec.fifo file mode to 0o644.

fixes #1913

Signed-off-by: quanweiZhou <quanweiZhou@linux.alibaba.com>
2021-05-22 17:33:49 +08:00
Manabu Sugimoto
20a382c158 agent: Remove unnecessary underscore(_) variables
We should remove underscore(_) prefixed variables when ? operator is
used.

Fixes: #1903

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2021-05-21 17:45:34 +09:00