Commit Graph

583 Commits

Author SHA1 Message Date
David Gibson
907459c1c1 agent/device: Don't force PCI rescans
The agent initiates a PCI rescan from two places.  One is triggered
for each virtio-blk PCI device, and one is triggered unconditionally
when we start a new container.

The PCI bus rescan code was added long time ago in Clear Containers due to
lack of ACPI support in QEMU 2.9 + q35.  Since Kata routinely plugs devices
under a PCIe-to-PCI bridge, that left SHPC as the only available hotplug
mechanism.

However, while Kata was using SHPC on the qemu side, it wasn't actually
using it on the guest side.  Due to a quirk of our guest kernel
configuration, the SHPC driver never bound to the bridge, and *no* hotplug
was working at all.  To work around that, Kata was forcing the rescan
manually, which would discover the new device.  That was very fragile (we
were arguably relying on a kernel bug).  Even if we were using SHPC
propertly, it includes a mandatory 5s delay during plug operations
(designed for physical cards and human operators), which makes it
unsuitable quick start up.

Worse, the forced PCI rescans could race with either SHPC or PCIe native
hotplug sequences, causing several problems.  In some cases this could put
the device into an entirely broken state where it wouldn't respond to
config space accesses at all.

Since pull request #2323 was merged, we have instead used ACPI hotplug
which is both fast, and more solid in terms of semantics and races.  So,
the forced PCI rescans are no longer necessary.  Remove them all.

fixes #683

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
75f426dd1e agent: Simplify do_add_swap()
do_add_swap() has some mildly complex code to translate the PCI path of
a virtio-blk device (where the swap will reside) into a /dev path. However,
the device module already has get_virtio_blk_pci_device_name() which does
exactly that.  The existing code has some further advantages: it uses
more precise matching of the sysfs paths, and if necessary it will wait for
the device to be added to the guest.

While we're there, remove an unnecessary 'as u8' from the PCI path
construction: pci::Path::new() already accepts anything which implements
TryInfo<u8>, which u32 certainly does.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
8bc71105f4 agent/device: Add device type for VFIO devices
Currently, VFIO devices attached to a Kata container aren't described to
the agent at all.  We essentially just hope they're ready by the time
we've entered the container proper, which is usually the case because of
the PCI rescan - but that causes other problems.

This adds a new device type to the agent representing VFIO devices.  The
agent will use its existing uevent watching mechanisms to wait for the
associated guest PCI device to appear before proceeding.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
f7a2707505 agent: Move driver type constants into device.rs
Currently the constants giving the names for each device/driver type in
the protocol are in mount.rs, and used in device.rs.  Since these constants
are inherently related to, well, devices, it makes more sense to put them
in device.rs and use them from mount.rs.

This will become even more so with planned extensions which will add some
device types that will not be used in mount.rs at all.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
David Gibson
5b1eb08bde agent/uevent: Improve logging of wait_for_uevent()
These messages will help when debugging matchers not matching properly.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-27 12:46:33 +10:00
Eric Ernst
272771dcf9 watcher: ensure we create target mount point for storage
We would only create the target when updating files. We need to make
sure that we create the target if the source is a directory. Without
this, we'll fail to start a container that utilizes an empty configmap,
for example.

Add unit tests for this.

Fixes: #2638

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-09-23 08:29:28 -07:00
David Gibson
9d3cd9841f agent/mount: Remove unused ensure_destination_exists()
The only remaining callers of ensure_destination_exists() are in its own
unit tests.  So, just remove it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-16 12:24:47 +10:00
David Gibson
64aa562355 agent: Correct mount point creation
mount_storage() first makes sure the mount point for the storage volume
exists.  It uses fs::create_dir_all() in the case of 9p or virtiofs volumes
otherwise ensure_destination_exists().  But.. ensure_destination_exists()
boils down to an fs::create_dir_all() in most cases anyway.  The only case
it doesn't is for a bind fstype, where it creates a file instead of a
directory.  But, that's not correct anyway because we need to create either
a file or a directory depending on the source of the bind mount, which
ensure_destination_exists() doesn't know.

The 9p/virtiofs paths also check if the mountpoint exists before calling
fs::create_dir_all(), which is unnecessary (fs::create_dir_all already
handles that case).

mount_storage() does have the information to know what we need to create,
so have it explicitly call ensure_destination_file_exists() for the bind
mount to a non-directory case, and fs::create_dir_all() in all other cases.

fixes #2390

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-16 12:24:47 +10:00
David Gibson
08d7aebc28 agent/mount: Split out regular file case from ensure_destination_exists()
ensure_destination_exists() can create either a directory or a regular file
depending on the arguments.  This patch extracts the regular file specific
option into its own helper: ensure_destination_file_exists().  This:
 - Avoids doing some steps in the directory case (they're already handled
   by create_dir_all())
 - Enables some further future cleanups

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-16 12:24:47 +10:00
David Gibson
9fa3beff4f agent: Remove unnecessary BareMount structure
struct Baremount contains the information necessary to make a new mount.
As a datastructure, however, it's pointless, since every user just
constructs it, immediately calls the BareMount::mount() method then
discards the structure.

Simplify the code by making this a direct function call baremount().

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-16 12:24:47 +10:00
David Gibson
49282854f1 agent: Simplify BareMount::mount by using nix::mount::mount
BareMount::mount does some complicated marshalling and uses unsafe code to
call into the mount(2) system call.  However, we're already using the nix
crate which provides a more Rust-like wrapper for mount(2).  We're even
already using nix::mount::umount and nix::mount::MsFlags from the same
module.

In the same way, we can replace the direct usage of libc::umount() with
nix::mount::umount() in one of the tests.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-16 12:24:47 +10:00
David Gibson
25ac3524c9 versions: Allow newer Rust versions
Rust 1.47.0 which is the latest we note as tested in versions.yaml is now
getting fairly old - many current distros have newer versions (e.g.
Rust 1.54.0 in Fedora 34).  Bring this more up to date.

Note that this is only updating the 'newest-version', not the minimum
required version.

The new version changes the name of the 'clippy::unknown_clipp_lints'
option to simply 'unknown_lints' so we need to change that as well to avoid
warnings.

fixes #2633

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-09-15 08:58:28 +10:00
bin
2abc450a4d test: enable running tests under root user
Add tests that run under root user to test special cases.

Fixes: #2446

Signed-off-by: bin <bin@hyper.sh>
2021-09-09 14:21:34 +08:00
Peng Tao
256c3b2747 license: drop redundent license files
There is no need to keep multiple copies of the license file in
different directory. We can just use the top level one for the project.

Fixes: #2553
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2021-09-01 15:10:04 +08:00
Eric Ernst
71f304ce17 agent: watcher: cleanup mount if needed when container is removed
If a bind mount was created for watchable storage, make sure we remove
when removing a container.

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-08-11 08:53:28 -07:00
Samuel Ortiz
f1a505dbfe agent: Temporarily allow unknown linters
Bump thiserror to 1.0.26 for vsock-exporter and work around
a bug in Clippy nonstandard_macro_braces lint.
(See https://github-redirect.dependabot.com/rust-lang/rust-clippy/issues/7422)

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-08-11 08:53:28 -07:00
Eric Ernst
961aaff004 agent: watcher: fixes to make more robust
inotify/watchable-mount changes...

- Allow up to 16 files. It isn't that uncommon to have 3 files in a secret.
In Kubernetes, this results in 9 files in the mount (the presented files,
which are symlinks to the latest files, which are symlinks to actual files
which are in a seperate hidden directoy on the mount). Bumping from eight to 16 will
help ensure we can support "most" secret/tokens, and is still a pretty
small number to scan...

- Now we will only replace the watched storage with a bindmount if we observe
that there are too many files or if its too large. Since the scanning/updating is racy,
we should expect that we'll occassionally run into errors (ie, a file
deleted between scan / update). Rather than stopping and making a bind
mount, continue updating, as the changes will be updated the next time
check is called for that entry (every 2 seconds today).

To facilitate the 'oversized' handling, we create specific errors for too large
or too many files, and handle these specific errors when scanning the storage entry.

- When handling an oversided mount, do not remove the prior files -- we'll just
overwrite them with the bindmount. This'll help avoid the files
disappearing from the user, avoid racy cleanup and simplifies the flow.
Similarly, only mark it as a non-watched storage device after the
bindmount is created successfully.

- When creating bind mount, make sure destination exists. If we hadn't
had a successful scan before, this wouldn't exist and the mount would
fail. Update logic and unit test to cover this.

- In several spots, we were returning when there was an error (both in
scan and update). For update case, let's just log an warning and continue;
since the scan/update is racy, we should expect that we'll have
transient errors which should resolve the next time the watcher runs.

Fixes: #2402

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-08-11 08:52:51 -07:00
Samuel Ortiz
233b53c048 agent: Fix cargo 1.54 clippy warning
Mostly the needless borrow one, plus a few others that are now enforced.

Fixes #2408

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-08-05 18:41:55 +02:00
Samuel Ortiz
49083bfa31 agent: Create the process CWD when it does not exist
Although the OCI specification does not explictly requires that, we
should create the process CWD if it does not exist, before chdir'ing
to it. Without that fizx, the kata-agent fails to create a container
and returns a grpc error when it's trying to change the containerd
working directory to an non existing folder.

runc, the OCI runtime reference implementation, also creates the process
CWD when it's not part of the container rootfs.

Fixes #2374

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-08-01 04:27:03 +02:00
Fupan Li
838e169b9c Merge pull request #2248 from lifupan/check_file_exist
mount: fix the issue of missing check file exists
2021-07-27 23:29:26 +08:00
Peng Tao
fd2607cc43 Merge pull request #2202 from teawater/swap7
Add swap support
2021-07-20 21:12:30 +08:00
Fabiano Fidêncio
75c5edd66a Merge pull request #2263 from eryugey/eryugey/for-main
agent: clear MsFlags if the option has clear flag set
2021-07-20 12:50:45 +02:00
Bin Liu
6b00806bb8 Merge pull request #2243 from egernst/bump-tokio
agent/agent-ctl: update tokio to 1.8.1
2021-07-20 13:56:32 +08:00
Hui Zhu
4f066db8da agent: agent.proto: Add AddSwap
Add new fuction AddSwap.  When agent get AddSwap, it will get the device
name from PCIPath and set the device as the swap device.

Fixes: #2201

Signed-off-by: Hui Zhu <teawater@antfin.com>
2021-07-19 23:20:34 +08:00
Bin Liu
b94ebc30b4 Merge pull request #2235 from Tim-Zhang/vsock-exporter-async
vsock-exporter: switch to tokio runtime
2021-07-19 17:06:14 +08:00
Eryu Guan
35cbc93dee agent: clear MsFlags if the option has clear flag set
'FLAGS' hash map has bool to indicate if the flag should be cleared or
not. But in parse_mount_flags_and_options() we set the flag even 'clear'
is true. This results in a 'rw' mount being mounted as 'MS_RDONLY'.

Fixes: #2262
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2021-07-19 11:50:10 +08:00
fupan.lfp
5371b9214f mount: fix the issue of missing check file exists
It's better to check whether the destination file exists
before creating them, if it had been existed, then return
directly.

Fixes: #2247

Signed-off-by: fupan.lfp <fupan.lfp@antgroup.com>
2021-07-15 18:09:33 +08:00
Eric Ernst
acf6932863 agent: update tokio to 1.8.1
Update to latest tokio to address RUSTSEC-2021-0072:
 Task dropped in wrong thread when aborting `LocalSet` task

Update the toml to specify just 1.x for the tokio version.

Fixes: #2165

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
2021-07-14 17:18:21 -07:00
Tim Zhang
73d3798cb1 vsock-exporter: switch to tokio runtime
Make the vsock-exporter async totally using tokio runtime.
And delay the timing of the connection to trace-forwarder so that
it is easy to reconnect when the connection was broken.

Fixes: #2234

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-14 20:16:05 +08:00
Fabiano Fidêncio
a104f13230 agent: Add make vendor
This has a similar intent as the go code, but not totally equal.  For
the go code we want to ensure that the vendored code is up-to-date,
while here we want to ensure that `cargo vendor` actually works.

We happened to release a few tarballs where `cargo vendor` didn't work
and it causes some pain for downstream maintainers.

Related: #2159

Signed-off-by: Fabiano Fidêncio <fidencio@redhat.com>
2021-07-14 13:59:41 +02:00
Tim Zhang
7960689ef7 tracing: replace SimpleSpanProcessor with BatchSpanProcessor
This change make tokio could be use in vsock-exporter.

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-14 15:59:52 +08:00
Tim Zhang
6999dccaa8 trace-forwarder: Add option rustflags, target, build-type for the make
Support rust-flags, target and build-type.

Fixes: #2215

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-13 11:35:46 +08:00
David Gibson
1ab72518b3 agent: Fix to parsing of /proc/self/mountinfo
get_mounts() parses /proc/self/mountinfo in order to get the mountpoints
for various cgroup filesystems.  One of the entries in mountinfo is the
"device" for each filesystem, but for virtual filesystems like /proc, /sys
and cgroups, the device entry is arbitrary.  Depending on the exact rootfs
setup, it can end up being "-".

This breaks get_mounts() because it uses " - " as a separator.  There
really is a " - " separator in mountinfo, but in this case the device entry
shows up as a second one.  Fix this, by changing a split to a splitn, which
will effectively only consider the first " - " in the line.

While we're there, make the warning message more useful, by having it
actually show which line it wasn't able to parse.

fixes #2182

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2021-07-10 19:30:27 +10:00
Jakob Naucke
8758ce26b7 agent: Enable virtio-blk-ccw
Forward-port of https://github.com/kata-containers/agent/pull/600.
Enable virtio-blk-ccw devices in agent (virtio-blk for s390x, already
enabled in runtime).

Fixes: #2026

Signed-off-by: Jakob Naucke <jakob.naucke@ibm.com>
2021-07-08 14:59:47 +02:00
Bin Liu
26985bbfff Merge pull request #2173 from Tim-Zhang/enhance-test-execute-hook
agent: enhance tests of execute_hook
2021-07-05 14:36:45 +08:00
Fabiano Fidêncio
015b3baf06 Merge pull request #2178 from mxpv/config
agent: Cleanup config
2021-07-03 09:51:16 +02:00
Fupan Li
2de9c5b41d Merge pull request #1969 from liubin/feature/1968-pass-span-context-to-agent
Pass span context from runtime to agent to get a full trace #1968
2021-07-03 09:31:02 +08:00
Maksym Pavlenko
e6b1766f6b agent: Cleanup config
This commit clean up config parsing and testing code to make it a bit more easy to maintain.
- Adds `with_context` from anyhow to include the underlying error. This helps to understand what exactly went wrong.
- Uses ensure and bail as a shorter alternative for `if` checks.
- TestData in test_parse_cmdline is now implements Default to reduce boilerplate code
- Remove `make_err` as it doesn’t make any sense.

Fixes: #2177

Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
2021-07-02 14:28:43 -07:00
Tim Zhang
55c5c871d2 agent: enhance tests of execute_hook
Use which to find the full path of exe before run execute_hook
to avoid error: 'No such file or directory'

Fixes: #2172

Signed-off-by: Tim Zhang <tim@hyper.sh>
2021-07-02 14:30:56 +08:00
bin
65d2fb5d11 agent: remove instrument attribute for some simple functions
For some simple functions that only process memory data(list/hashmap),
they don't need to be instrumented.

And sometime they may generate non-parent spans, if they are called from
daemon-style "threads".

Fixes: #1968

Signed-off-by: bin <bin@hyper.sh>
2021-07-02 10:07:28 +08:00
bin
cfb8139f36 agent: add more instruments for RPC calls
All RPC calls can get parent span context,
and create new sub-spans for the full trace.

Fixes: #1968

Signed-off-by: bin <bin@hyper.sh>
2021-07-02 10:07:28 +08:00
Fabiano Fidêncio
3fe0af6a9b Merge pull request #2152 from liubin/fix/2111-update-netlink-libs
agent: update netlink libraries
2021-07-01 12:01:35 +02:00
Fabiano Fidêncio
550029c473 Merge pull request #2060 from liubin/2059/delete-some-lint-attributes
agent: delete some lint attributes
2021-06-30 16:51:07 +02:00
bin
aa264f915f agent: update netlink libraries
Update rtnetlink to use crate.io to make cargo vendor work.
Add vendor/ to .gitignore.

Fixes: #2111

Signed-off-by: bin <bin@hyper.sh>
2021-06-30 22:39:50 +08:00
Fabiano Fidêncio
7d37fbfdfb Merge pull request #2115 from sameo/topic/rust-nix
cargo: Use latest nix crate for all Rust code bases
2021-06-28 08:18:53 +02:00
Samuel Ortiz
f6294226e8 cargo: Use latest nix crate for all Rust code bases
Our dependencies already bring several versions of nix, we should avoid
adding even more fragementation.

Fixes #2114

Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>
2021-06-25 03:38:37 +02:00
Maksym Pavlenko
6a93e5d593 agent: Initial watchable-bind implementation
Add support for watchable-bind storage driver. When watchable-bind storage
is present, the agent will create a watchable path in a tmpfs, and poll the
watchable-bind source to keep this new mount-point up to date.

This poll will allow the agent to present the mount-point to the
container, allowing for inotify usage by the container workload.

If a mount becomes too large, either in file count or in overall size,
we want to stop treating it as watchable, and instead just treat as a
bindmount. This'll help avoid DoS by growing tmpfs too large, as well
as limiting time spent scanning files. If a watchable-bind grows beyond
8 files (arbitrary sane number for certs/secrets) or 1MB (limit on ConfigMap size),
we treat it as a normal bind.

Fixes: #1879

Signed-off-by: Eric Ernst <eric_ernst@apple.com>
Signed-off-by: Maksym Pavlenko <pavlenko.maksym@gmail.com>
Signed-off-by: Samuel Ortiz <samuel.e.ortiz@protonmail.com>

agent: watcher: SandboxStorages check loop cleanup
2021-06-24 10:07:06 -07:00
bin
000049b69e agent: delete some lint attributes
Thes lint attributes can be deleted to keep clean code.

Fixes: #2059

Signed-off-by: bin <bin@hyper.sh>
2021-06-18 16:08:25 +08:00
snir911
1faaf5f35d Merge pull request #2000 from ManaSugi/update-mount-flags
agent: Add some mount options and sort the options alphabetically
2021-06-17 11:53:11 +03:00
Manabu Sugimoto
bd27f7bab5 agent: Sort PROPAGATION and OPTIONS alphabetically to scan easily
It's hard to visually scan over the list currently.
Therefore, we should sort the list alphabetically to scan easily.

Fixes: #1999

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2021-06-16 17:23:05 +09:00