Commit Graph

2069 Commits

Author SHA1 Message Date
GabyCT
61a167139c Merge pull request #4186 from liubin/fix/4185-skip-loop-by-user
agent: Add a macro to skip a loop easier
2022-05-09 16:58:29 -05:00
Fupan Li
8aad2c59c5 Merge pull request #4184 from liubin/fix/4182-runk-kill-all
runk: use custom Kill command to support --all option
2022-05-09 17:56:10 +08:00
James O. D. Hunt
79d93f1fe7 Merge pull request #4137 from Shensd/sandbox-tests-online_resources
agent: add test coverage for functions find_process and online_resources
2022-05-06 09:20:57 +01:00
Chelsea Mafrica
e2f68c6093 Merge pull request #4187 from fidencio/test-hook-grpc-to-oci
rustjail: Add tests for hook_grpc_to_oci
2022-05-04 09:25:45 -07:00
Fabiano Fidêncio
bd5da4a7d9 Merge pull request #4189 from yibozhuang/watchable-mount-permission
agent watchers: ensure uid/gid is preserved on copy/mkdir
2022-05-04 12:29:24 +02:00
Fabiano Fidêncio
33a8b70558 clh: Rely on Cloud Hypervisor for generating the device ID
We're currently hitting a race condition on the Cloud Hypervisor's
driver code when quickly removing and adding a block device.

This happens because the device removal is an asynchronous operation,
and we currently do *not* monitor events coming from Cloud Hypervisor to
know when the device was actually removed.  Together with this, the
sandbox code doesn't know about that and when a new device is attached
it'll quickly assign what may be the very same ID to the new device,
leading to the Cloud Hypervisor's driver trying to hotplug a device with
the very same ID of the device that was not yet removed.

This is, in a nutshell, why the tests with Cloud Hypervisor and
devmapper have been failing every now and then.

The workaround taken to solve the issue is basically *not* passing down
the device ID to Cloud Hypervisor and simply letting Cloud Hypervisor
itself generate those, as Cloud Hypervisor does it in a manner that
avoids such conflicts.  With this addition we have then to keep a map of
the device ID and the Cloud Hypervisor's generated ID, so we can
properly remove the device.

This workaround will probably stay for a while, at least till someone
has enough cycles to implement a way to watch the device removal event
and then properly act on that.  Spoiler alert, this will be a complex
change that may not even be worth it considering the race can be avoided
with this commit.

Fixes: #4176

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-05-04 09:04:03 +02:00
Jack Hance
475e3bf38f agent: add test coverage for functions find_process and online_resources
Add test coverage for the functions find_process and online_resources in src/sandbox.rs.

Fixes #4085
Fixes #4136

Signed-off-by: Jack Hance <jack.hance@ndsu.edu>
2022-05-03 16:00:24 -05:00
Yibo Zhuang
70eda2fa6c agent: watchers: ensure uid/gid is preserved on copy/mkdir
Today in agent watchers, when we copy files/symlinks
or create directories, the ownership of the source path
is not preserved which can lead to permission issues.

In copy, ensure that we do a chown of the source path
uid/gid to the destination file/symlink after copy to
ensure that ownership matches the source ownership.
fs::copy() takes care of setting the permissions.

For directory creation, ensure that we set the
permissions of the created directory to the source
directory permissions and also perform a chown of the
source path uid/gid to ensure directory ownership
and permissions matches to the source.

Fixes: #4188

Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
2022-05-03 09:57:31 -07:00
Garrett Mahin
4a1e13bd8f rustjail: Add tests for hook_grpc_to_oci
Add test coverage for hook_grpc_to_oci in rustjail/src/lib.rs

Fixes: #4125

Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>
2022-05-02 23:59:33 +02:00
Bin Liu
383be2203a agent: Add a macro to skip a loop easier
Add a macro to skip a loop easier without using a
if {} else {} condition check.

Fixes: #4185

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-04-30 20:45:41 +08:00
Bin Liu
c633780ba7 Merge pull request #4119 from bradenrayhorn/test-create-logger-task
agent: add tests for create_logger_task function
2022-04-30 19:48:07 +08:00
Bin Liu
97d7b1845b runk: use custom Kill command to support --all option
runk uses liboci-cli crate to parse command line options,
but liboci-cli does not support --all option for kill command,
though this is the runtime spec behavior.

But crictl will issue kill --all command when stopping containers,
as a workaround, we use a custom kill command instead of the one
provided by liboci-cli.

Fixes: #4182

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-04-30 19:34:18 +08:00
Bin Liu
7772f7dd99 runk: set BinaryName for runk for containerd
The default runtime for io.containerd.runc.v2 is runc,
to use runk, the containerd configuration should set the
default runtime to runk or add BinaryName options for the
runtime.

Fixes: #4177

Signed-off-by: Bin Liu <bin@hyper.sh>
2022-04-29 22:26:32 +08:00
James O. D. Hunt
cc839772d3 Merge pull request #2785 from ManaSugi/standard-container-runtime
tools: Add a Rust-based standard OCI container runtime based on Kata agent
2022-04-29 13:20:59 +01:00
James O. D. Hunt
2d5f11501c Merge pull request #4083 from bradenrayhorn/test-parse-mount-table
rustjail: add tests for parse_mount_table
2022-04-29 11:34:22 +01:00
Jianyong Wu
982c32358a Merge pull request #4031 from Jaylyn-Ren/kata-spdk
Virtcontainers: Enable hot plugging vhost-user-blk device on ARM
2022-04-29 12:16:38 +08:00
Chelsea Mafrica
3f069c7acb Merge pull request #4166 from jodh-intel/agent-ctl-fix-abstract
agent-ctl: Fix abstract socket connections
2022-04-28 10:17:28 -07:00
James O. D. Hunt
666aee54d2 docs: Add VSOCK localhost example for agent-ctl
Update the `agent-ctl` docs to show how to use a VSOCK local address
when running the agent and the tool in the same environment. This is an
alternative to using a Unix socket.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-04-28 13:33:23 +01:00
James O. D. Hunt
86d348e065 docs: Use VM term in agent-ctl doc
Use the standard "VM" acronym to mean Virtual Machine.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-04-28 13:33:19 +01:00
James O. D. Hunt
4b9b62bb3e agent-ctl: Fix abstract socket connections
Unbreak the `agent-ctl` tool connecting to the agent with a Unix domain
socket.

It appears that [1] changed the behaviour of connecting to the agent
using a local Unix socket (which is not used by Kata under normal
operation).

The change can be seen by reverting to commit
72b8144b56 (the one before [1]) and
running the agent manually as:

```bash
$ sudo KATA_AGENT_SERVER_ADDR=unix:///tmp/foo.socket target/x86_64-unknown-linux-musl/release/kata-agent
```

Before [1], in another terminal we see this:

```bash
$ sudo lsof -U 2>/dev/null |grep foo|awk '{print $9}'
@/tmp/foo.socket@
```

But now, we see the following:

```bash
$ sudo lsof -U 2>/dev/null |grep foo|awk '{print $9}'
@/tmp/foo.socket
```

Note the last byte which represents a nul (`\0`) value.

The `agent-ctl` tool used to add that trailing nul but now it seems to not
be needed, so this change removes it, restoring functionality. No
external changes are necessary so the `agent-ctl` tool can connect to
the agent as below like this:

```bash
$ cargo run -- -l debug connect --server-address "unix://@/tmp/foo.socket" --bundle-dir "$bundle_dir" -c Check -c GetGuestDetails
```

[1] - https://github.com/kata-containers/kata-containers/issues/3124

Fixes: #4164.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2022-04-28 13:33:09 +01:00
Fabiano Fidêncio
b6467ddd73 clh: Expose disk rate limiter config
With everything implemented, let's now expose the disk rate limiter
configuration options in the Cloud Hypervisor configuration file.

Fixes: #4139

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:28:29 +02:00
Fabiano Fidêncio
7580bb5a78 clh: Expose net rate limiter config
With everything implemented, let's now expose the net rate limiter
configuration options in the Cloud Hypervisor configuration file.

Fixes: #4017

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:28:13 +02:00
Fabiano Fidêncio
a88adabaae clh: Cloud Hypervisor has a built-in Rate Limiter
The notion of "built-in rate limiter" was added as part of
bd8658e362, and that commit considered
that only Firecracker had a built-in rate limiter, which I think was the
case when that was introduced (mid 2020).

Nowadays, however, Cloud Hypervisor takes advantage of the very same crate
used by Firecraker to do I/O throttling.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:27:56 +02:00
Fabiano Fidêncio
63c4da03a9 clh: Implement the Disk RateLimiter logic
Let's take advantage of the newly added DiskRateLimiter* options and
apply those to the network device configuration.

The logic here is identical to the one already present in the Network
part of Cloud Hypervisor's driver.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:27:53 +02:00
Fabiano Fidêncio
511f7f822d config: Add DiskRateLimiter* to Cloud Hypervisor
Let's add the newly added disk rate limiter configurations to the Cloud
Hypervisor's hypervisor configuration.

Right now those are not used anywhere, and there's absolutely no way the
users can set those up.  That's coming later in this very same series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:27:15 +02:00
Fabiano Fidêncio
5b18575dfe hypervisor: Add disk bandwidth and operations rate limiters
This is the disk counterpart of the what was introduced for the network
as part of the previous commits in this series.

The newly added fields are:
* DiskRateLimiterBwMaxRate, defined in bits per second, which is used to
  control the network I/O bandwidth at the VM level.
* DiskRateLimiterBwOneTimeBurst, also defined in bits per second, which
  is used to define an *initial* max rate, which doesn't replenish.
* DiskRateLimiterOpsMaxRate, the operations per second equivalent of the
  DiskRateLimiterBwMaxRate.
* DiskRateLimiterOpsOneTimeBurst, the operations per second equivalent of
  the DiskRateLimiterBwOneTimeBurst.

For now those extra fields have only been added to the hypervisor's
configuration and they'll be used in the coming patches of this very
same series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:27:11 +02:00
Fabiano Fidêncio
1cf9469297 clh: Implement the Network RateLimiter logic
Let's take advantage of the newly added NetRateLimiter* options and
apply those to the network device configuration.

The logic here is quite similar to the one already present in the
Firecracker's driver, with the main difference being the single Inbound
/ Outbound MaxRate and the presence of both Bandwidth and Operations
rate limiter.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:26:38 +02:00
Fabiano Fidêncio
00a5b1bda9 utils: Define DefaultRateLimiterRefillTimeMilliSecs
Firecracker's driver doesn't expose the RefillTime option of the rate
limiter to the user.  Instead, it uses a contant value of 1000
miliseconds (1 second).

As we're following Firecracker's driver implementation, let's expose
create a new constant, use it as part of the Firecracker's driver, and
later on re-use it as part of the Cloud Hypervisor's driver.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:22:42 +02:00
Fabiano Fidêncio
be1bb7e39f utils: Move FC's function to revert bytes to utils
Firecracker's revertBytes function, now called "RevertBytes", can be
exposed as part of the virtcontainers' utils file, as this function will
be reused by Cloud Hypervisor, when adding the rate limiter logic there.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:22:42 +02:00
Fabiano Fidêncio
c9f6496d6d config: Add NetRateLimiter* to Cloud Hypervisor
Let's add the newly added network rate limiter configurations to the
Cloud Hypervisor's hypervisor configuration.

Right now those are not used anywhere, and there's absolutely no way the
users can set those up.  That's coming later in this very same series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:22:42 +02:00
Fabiano Fidêncio
2d35e6066d hypervisor: Add network bandwidth and operations rate limiters
In a similar way to what's already exposed as RxRateLimiterMaxRate and
TxRateLimiterMaxRate, let's add four new fields to the Hypervisor's
configuration.

The values added are related to bandwidth and operations rate limiters,
which have to be added so we can expose I/O throttling configurations to
users using Cloud Hypervisor as their preferred VMM.

The reason we cannot simply re-use {Rx,Tx}RateLimiterMaxRate is because
Cloud Hypervisor exposes a single MaxRate to be used for both inbound
and outbound queues.

The newly added fields are:
* NetRateLimiterBwMaxRate, defined in bits per second, which is used to
  control the network I/O bandwidth at the VM level.
* NetRateLimiterBwOneTimeBurst, also defined in bits per second, which
  is used to define an *initial* max rate, which doesn't replenish.
* NetRateLimiterOpsMaxRate, the operations per second equivalent of the
  NetRateLimiterBwMaxRate.
* NetRateLimiterOpsOneTimeBurst, the operations per second equivalent of
  the NetRateLimiterBwOneTimeBurst.

For now those extra fields have only been added to the hypervisor's
configuration and they'll be used in the coming patches of this very
same series.

Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
2022-04-28 10:22:42 +02:00
Braden Rayhorn
b0e439cb66 rustjail: add tests for parse_mount_table
Add tests for parse_mount_table function in rustjail/src/mount.rs.
Includes some minor refactoring improve the testability of the
function and improve its error values.

Fixes: #4082

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-04-27 20:06:01 -05:00
Manabu Sugimoto
b221a2590f tools: Add runk
Add a Rust-based standard OCI container runtime based on
Kata agent.

You can build and install runk as follows:

```sh
$ cd src/tools/runk
$ make
$ sudo make install
$ runk --help
```

Fixes: #2784

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-04-28 00:48:57 +09:00
Manabu Sugimoto
2c218a07b9 agent: Modify Kata agent for runk
Generate an oci-kata-agent which is a customized agent to be
called from runk which is a Rust-based standard OCI container
runtime based on Kata agent.

Fixes: #2784

Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
2022-04-28 00:48:57 +09:00
James O. D. Hunt
0a6e7d443e Merge pull request #3910 from etrunko/agent_random
Agent: Unit tests for random.rs
2022-04-27 09:41:02 +01:00
James O. D. Hunt
7b20707197 Merge pull request #4107 from garrettmahin/test-mount-grpc-to-oci
rustjail: Add tests for mount_grpc_to_oci
2022-04-27 08:50:24 +01:00
Peng Tao
5b6e45ed6c Merge pull request #4141 from dgibson/cleanup-tmp
Fix Go unit tests to clean up /tmp after themselves
2022-04-26 15:43:34 +08:00
Garrett Mahin
4b9e78b837 rustjail: Add tests for mount_grpc_to_oci
Add test coverage for mount_grpc_to_oci in rustjail/src/lib.rs

Fixes: #4106

Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>
2022-04-25 08:37:17 -05:00
James O. D. Hunt
bc919cc54c Merge pull request #4122 from bradenrayhorn/test-mount-from
rustjail: add tests for mount_from function
2022-04-25 11:55:21 +01:00
James O. D. Hunt
cb8dd0f4fc Merge pull request #4143 from garrettmahin/test-hooks-grpc-to-oci
rustjail: Add tests for hooks_grpc_to_oci
2022-04-25 10:50:52 +01:00
Braden Rayhorn
81f6b48626 agent: add tests for create_logger_task function
Add tests for create_logger_task function in src/main.rs.

Fixes: #4113

Signed-off-by: Braden Rayhorn <bradenrayhorn@fastmail.com>
2022-04-24 21:38:32 -05:00
Garrett Mahin
96bc3ec2e9 rustjail: Add tests for hooks_grpc_to_oci
Add test coverage for hooks_grpc_to_oci in rustjail/src/lib.rs

Fixes: #4142

Signed-off-by: Garrett Mahin <garrett.mahin@gmail.com>
2022-04-22 19:20:04 -05:00
holyfei
0239502781 agent: modify the type of swappiness to u64
The type of MemorySwappiness in runtime is uint64, and the type of swappiness in agent is int64,
if we set max uint64 in runtime and pass it to agent, the value will be equal to -1. We should
modify the type of swappiness to u64

Fixes: #4123

Signed-off-by: holyfei <yangfeiyu20092010@163.com>
2022-04-22 16:55:37 +08:00
David Gibson
1b931f4203 runtime: Allock mockfs storage to be placed in any directory
Currently EnableMockTesting() takes no arguments and will always place the
mock storage in the fixed location /tmp/vc/mockfs.  This means that one
test run can interfere with the next one if anything isn't cleaned up
(and there are other bugs which means that happens).  If if those were
fixed this would allow developers testing on the same machine to interfere
with each other.

So, allow the mockfs to be placed at an arbitrary place given as a
parameter to EnableMockTesting().  In TestMain() we place it under our
existing temporary directory, so we don't need any additional cleanup just
for the mockfs.

fixes #4140

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-04-22 14:47:59 +10:00
David Gibson
ef6d54a781 runtime: Let MockFSInit create a mock fs driver at any path
Currently MockFSInit always creates the mockfs at the fixed path
/tmp/vc/mockfs.  This change allows it to be initialized at any path
given as a parameter.  This allows the tests in fs_test.go to be
simplified, because the by using a temporary directory from
t.TempDir(), which is automatically cleaned up, we don't need to
manually trigger initTestDir() (which is misnamed, it's actually a
cleanup function).

For now we still use the fixed path when auto-creating the mockfs in
MockAutoInit(), but we'll change that later.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-04-22 14:23:36 +10:00
David Gibson
5d8438e939 runtime: Move mockfs control global into mockfs.go
virtcontainers/persist/fs/mockfs.go defines a mock filesystem type for
testing.  A global variable in virtcontainers/persist/manager.go is used to
force use of the mock fs rather than a normal one.

This patch moves the global, and the EnableMockTesting() function which
sets it into mockfs.go.  This is slightly cleaner to begin with, and will
allow some further enhancements.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-04-22 14:23:36 +10:00
David Gibson
963d03ea8a runtime: Export StoragePathSuffix
storagePathSuffix defines the file path suffix - "vc" - used for
Kata's persistent storage information, as a private constant.  We
duplicate this information in fc.go which also needs it.

Export it from fs.go instead, so it can be used in fc.go.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-04-22 14:23:36 +10:00
David Gibson
1719a8b491 runtime: Don't abuse MockStorageRootPath() for factory tests
A number of unit tests under virtcontainers/factory use
MockStorageRootPath() as a general purpose temporary directory.  This
doesn't make sense: the mockfs driver isn't even in use here since we only
call EnableMockTesting for the pase virtcontainers package, not the
subpackages.

Instead use t.TempDir() which is for exactly this purpose.  As a bonus it
also handles the cleanup, so we don't need MockStorageDestroy any more.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-04-22 14:23:36 +10:00
David Gibson
bec59f9e39 runtime: Make bind mount tests better clean up after themselves
There are several tests in mount_test.go which perform a sample bind
mount.  These need a corresponding unmount to clean up afterwards or
attempting to delete the temporary files will fail due to the existing
mountpoint.  Most of them had such an unmount, but
TestBindMountInvalidPgtypes was missing one.

In addition, the existing unmounts where done inconsistently - one was
simply inline (so wouldn't be executed if the test fails too early) and one
is a defer.  Change them all to use the t.Cleanup mechanism.

For the dummy mountpoint files, rather than cleaning them up after the
test, the tests were removing them at the beginning of the test.  That
stops the test being messed up by a previous run, but messily.  Since
these are created in a private temporary directory anyway, if there's
something already there, that indicates a problem we shouldn't ignore.
In fact we don't need to explicitly remove these at all - they'll be
removed along with the rest of the private temporary directory.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-04-22 14:20:35 +10:00
David Gibson
f7ba21c86f runtime: Clean up mock hook logs in tests
The tests in hook_test.go run a mock hook binary, which does some debug
logging to /tmp/mock_hook.log.  Currently we don't clean up those logs
when the tests are done.  Use a test cleanup function to do this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2022-04-22 14:14:52 +10:00