The current implementation of walking the
disks to match with the requested volume path
in agent doesn't work because the volume path
provided by the shim to the agent is the mount
path within the guest and not the device name.
The current logic is trying to match the
device name to the volume path which will never
match.
This change will simplify the
get_volume_capacity_stats and
get_volume_inode_stats to just call statfs and
get the bytes and inodes usage of the volume
path directly.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
Today the shim does a translation when doing
direct-volume stats where it takes the source and
returns the mount path within the guest.
The source for a direct-assigned volume is actually
the device path on the host and not the publish
volume path.
This change will perform a lookup of the mount info
during direct-volume stats to ensure that the
device path is provided to the shim for querying
the volume stats.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
The go default http mux AFAIK doesn’t support pattern
routing so right now client is padding the url
for direct-volume stats with a subpath of the volume
path and this will always result in 404 not found returned
by the shim.
This change will update the shim to take the volume
path as a GET query parameter instead of a subpath.
If the parameter is missing or empty, then return
400 BadRequest to the client.
Fixes: #4297
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
The action function expects a function that returns error
but the current direct-volume stats Action returns
(string, error) which is invalid.
This change fixes the format and print out the stats from
the command instead.
Fixes: #4293
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
The documentation of the bufio package explicitly says
"Err returns the first non-EOF error that was encountered by the
Scanner."
When io.EOF happens, `Err()` will return `nil` and `Scan()` will return
`false`.
Fixes#4079
Signed-off-by: Rafael Fonseca <r4f4rfs@gmail.com>
In the is_signal_handled function, when parsing the hex string returned
from `/proc/<pid>/status` the space/tab character after the colon
is not removed.
This patch trims the result of SigCgt so that
all whitespace characters are removed. It also extends the existing
test cases to check for this scenario.
Fixes: #4250
Signed-off-by: Champ-Goblem <cameron@northflank.com>
As now we build and ship the rust version of virtiofsd, which is not
tied to QEMU, we need to update its default location to match with where
we're installing this binary.
Fixes: #4249
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
go-test.sh by default adds the -v option to 'go test' meaning that output
will be printed from all the passing tests as well as any failing ones.
This results in a lot of output in which it's often difficult to locate the
failing tests you're interested in.
So, remove -v from the default flags.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
One of the responsibilities of the go-test.sh script is setting up the
default flags for 'go test'. This is constructed across several different
places in the script using several unneeded intermediate variables though.
Consolidate all the flag construction into one place.
fixes#4190
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
go-test.sh changes behaviour based on both the $CI and $KATA_DEV_MODE
variables, but not in a way that makes a lot of sense.
If either one is set it uses the test_coverage path, instead of the
test_local path. That collects coverage information, as the name
suggests, but it also means it runs the tests twice as root and
non-root, which is very non-obvious.
It's not clear what use case the test_local path is for at all.
Developer local builds will typically have $KATA_DEV_MODE set and CI
builds will have $CI set. There's essentially no downside to running
coverage all the time - it has little impact on the test runtime.
In addition, if *both* $CI and $KATA_DEV_MODE are set, the script
refuses to run things as root, considering it "unsafe". While having
both set might be unwise in a general sense, there's not really any
way running sudo can be any more unsafe than it is with either one
set.
So, simplify everything by just always running the test_coverage path.
This leaves the test_local path unused, so we can remove it entirely.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
go-test.sh accepts subcommands, however invoking it in the usual way via
the Makefile doesn't use them. In fact the only remaining subcommand is
"help" and we already have another way of getting the usage information
(-h or --help). We don't need a second way, so just drop subcommand
handling.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
go-test.sh defaults to testing all the packages listed by go list, except
for a number filtered out. It turns out that none of those filters are
necessary any more:
* We've long required a Go newer than 1.9 which means the vendor filter
isn't needed
* The agent filter doesn't do anything now that we've moved to the Kata
2.x unified repo
* The tests filters don't hit anything on the list of modules in
src/runtime (which is the only user of the script)
But since we don't need to filter anything out any more, we don't even need
to iterate through a list ourselves. We can simply pass "./..." directly
to go test and it will iterate through all the sub-packages itself.
Interestingly this more than doubles the speed of "make test" for me - I
suspect because go test's internal paralellism works better over a larger
pool of tests.
This also lets us remove handling of non-existent coverage files from
test_go_package(), since with default options we will no longer test packages without tests
by default. If the user explicitly requests testing of a package with no
tests, then failing makes sense.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The go-test.sh script has an explicit chmod command, run as root, to
set the mode of the temporary coverage files to 0644. AFAICT the
point of this is specifically the 004 bit allowing world read access,
so that we can then merge the temporary coverage file into the main
coverage file.
That's a convoluted way of doing things. Instead we can just run the tail
command which reads the temporary file as the same user that generated it.
In addition, go-test.sh became root to remove that temporary coverage
file. This is not necessary, since deleting a regular file just requires
write access to the directory, not the file itself.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The html-coverage option to this script doesn't really alter behaviour
it just does the same thing as normal coverage, then converts the
report to HTML. That conversion is a single command, plus a chmod to
make the final output mode 0644. That overrides any umask the user
has set, which doesn't seem like a policy decision this script should
be making.
Nothing in the kata-containers or tests repository uses this, so it doesn't
really make sense to keep this logic inside this script.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In addition to coverage.txt, the go-test.sh script creates
coverage.txt.tmp files while running. These are temporary and
certainly shouldn't be committed, so add them to the gitignore file.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The go unit tests for the runtime are invoked by the helper script
ci/go-test.sh. Which calls the run_go_test() function in ci/lib.sh. Which
calls into .ci/go-test.sh from the tests repository.
But.. the runtime is the only user of this script, and generally stuff for
unit tests (rather than functional or integration tests) lives in the main
repository, not the tests repository.
So, just move the actual script into src/runtime. A change to remove it
from the tests repo will follow.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
to the kata-containers repo under the src/tools/log-parser folder
and vendor the modules
Fixes: #4100
Signed-off-by: Snir Sheriber <ssheribe@redhat.com>
We're currently hitting a race condition on the Cloud Hypervisor's
driver code when quickly removing and adding a block device.
This happens because the device removal is an asynchronous operation,
and we currently do *not* monitor events coming from Cloud Hypervisor to
know when the device was actually removed. Together with this, the
sandbox code doesn't know about that and when a new device is attached
it'll quickly assign what may be the very same ID to the new device,
leading to the Cloud Hypervisor's driver trying to hotplug a device with
the very same ID of the device that was not yet removed.
This is, in a nutshell, why the tests with Cloud Hypervisor and
devmapper have been failing every now and then.
The workaround taken to solve the issue is basically *not* passing down
the device ID to Cloud Hypervisor and simply letting Cloud Hypervisor
itself generate those, as Cloud Hypervisor does it in a manner that
avoids such conflicts. With this addition we have then to keep a map of
the device ID and the Cloud Hypervisor's generated ID, so we can
properly remove the device.
This workaround will probably stay for a while, at least till someone
has enough cycles to implement a way to watch the device removal event
and then properly act on that. Spoiler alert, this will be a complex
change that may not even be worth it considering the race can be avoided
with this commit.
Fixes: #4176
Signed-off-by: Fabiano Fidêncio <fabiano.fidencio@intel.com>
Add test coverage for the functions find_process and online_resources in src/sandbox.rs.
Fixes#4085Fixes#4136
Signed-off-by: Jack Hance <jack.hance@ndsu.edu>
Today in agent watchers, when we copy files/symlinks
or create directories, the ownership of the source path
is not preserved which can lead to permission issues.
In copy, ensure that we do a chown of the source path
uid/gid to the destination file/symlink after copy to
ensure that ownership matches the source ownership.
fs::copy() takes care of setting the permissions.
For directory creation, ensure that we set the
permissions of the created directory to the source
directory permissions and also perform a chown of the
source path uid/gid to ensure directory ownership
and permissions matches to the source.
Fixes: #4188
Signed-off-by: Yibo Zhuang <yibzhuang@gmail.com>
runk uses liboci-cli crate to parse command line options,
but liboci-cli does not support --all option for kill command,
though this is the runtime spec behavior.
But crictl will issue kill --all command when stopping containers,
as a workaround, we use a custom kill command instead of the one
provided by liboci-cli.
Fixes: #4182
Signed-off-by: Bin Liu <bin@hyper.sh>
The default runtime for io.containerd.runc.v2 is runc,
to use runk, the containerd configuration should set the
default runtime to runk or add BinaryName options for the
runtime.
Fixes: #4177
Signed-off-by: Bin Liu <bin@hyper.sh>