Update the protection checking code to detect newer versions of Intel
TDX (whose userland interface has now stabilised).
> **Note:** that we don't need to retain the existing behaviour since:
>
> - We haven't yet landed the TDX feature (#6448).
> - Systems wishing to use TDX will need to use the latest available
> system components (such as firmware and host kernel).
Also added an explicit TDX unit test.
Fixes: #7384.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
IoCopy is a tricky function (I don't claim to fully understand its contract),
but here is what I see: The goroutine that runs it spawns 3 goroutines - one
for each stream to handle (stdin/stdout/stderr). The goroutine then waits for
the stream goroutines to exit. The idea is that when the process exits and is
closed, the stdout goroutine will be unblocked and close stdin - this should
unblock the stdin goroutine. The stderr goroutine will exit at the same time as
the stdout goroutine. The iocopy routine then closes all tty.io streams.
The problem is that the stdout goroutine decrements the WaitGroup before
closing the stdin stream, which causes the iocopy goroutine to race to close
the streams. Move the wg.Done() of the stdout routine past the close so that
*this* race becomes impossible. I can't guarantee that this doesn't affect some
unspecified behavior.
Fixes: #5031
Signed-off-by: Jeremi Piotrowski <jpiotrowski@microsoft.com>
If we are running FC hypervisor, it is not started when prestart hooks
are executed. So we should just ignore such error and just go ahead and
run the hooks.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
FC does not support network device hotplug. Let's add a check to fail
early when starting containers created by docker.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Add a new hypervisor capability to tell if it supports device hotplug.
If not, we should run prestart hooks before starting new VMs as nerdctl
is using the prestart hooks to set up netns. To make nerdctl + FC
to work, we need to run the prestart hooks before starting new VMs.
Fixes: #6384
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
When running on amd machines, those tests will fail because there is no
vmx flag. Following other tests that checks for cpuType, let's adapt
them to restrict vmx only on Intel machines.
Fixes#7788.
Related #5066
Signed-off-by: Beraldo Leal <bleal@redhat.com>
The log_forwarder task does not check if the peer has closed, causing a
meaningless loop during the period of “kata vm exit”, when the peer
closed, and “ShutdownContainer RPC received” that aborts the log forwarder.
This patch fixes the problem.
Fixes: #7741
Signed-off-by: Zixuan Tan <tanzixuan.me@gmail.com>
There are several processes for container exit:
- Non-detach mode: `Wait` request is sent by containerd, then
`wait_process()` will be called eventually.
- Detach mode: `Wait` request is not sent, the `wait_process()` won’t be
called.
- Killed by ctr: For example, a container runs `tail -f /dev/null`, and
is killed by `sudo ctr t kill -a -s SIGTERM <CID>`. Kill request is
sent, then `kill_process()` will be called. User executes `sudo ctr c
rm <CID>`, `Delete` request is sent, then `delete_process()` will be
called.
- Exited on its own: For example, a container runs `sleep 1s`. The
container’s state goes to `Stopped` after 1 second. User executes
the delete command as below.
Where do we do container cleanup things?
- `wait_process()`: No, because it won’t be called in detach mode.
- `delete_process()`: No, because it depends on when the user executes the
delete command.
- `run_io_wait()`: Yes. A container is considered exited once its IO ended.
And this always be called once a container is launched.
Fixes: #7713
Signed-off-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Xuewei Niu <niuxuewei.nxw@antgroup.com>
Refine storage related code by:
- remove the STORAGE_HANDLER_LIST
- define type alias
- move code near to its caller
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Introduce StorageDevice and StorageHandlerManager, which will be used
to refine storage device management for kata-agent.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Simplify the way to manage storage objects, and introduce
StorageStateCommon structures for coming extensions.
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.
Make sure annotations are preferred over config options in image and initrd
path handling.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
We should make sure annotations are preferred over
config options in image and initrd path handling.
Fixes: #7705
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
Right now if we configure an image annotation and have a config file
setting initrd, the initrd config would override the image annotation.
Add a helper function ImageOrInitrdAssetPath to make sure annotations
are preferred over config options in image and initrd path handling.
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
When the FileMode field for the device is unset (0), use a default value instead
to allow the use of the device from the container.
This behaviour is seen from cri-o typically.
Note: this is what runc is doing, which is why regular containers don't have an
issue. This change makes sure kata behaves the same as runc.
Fixes: #7717
Signed-off-by: Julien Ropé <jrope@redhat.com>
Introduce structure KataVirtualVolume to to encapsulate information
for extra mount options and direct volumes, so we could build a common
infrastructure to handle these cases.
Fixes: #7699
Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
When building with AGENT_POLICY=yes and AGENT_INIT=yes:
1. Include OPA and the Policy settings in rootfs.
2. Start OPA from the kata agent.
Before these changes, building with both AGENT_POLICY=yes and
AGENT_INIT=yes was unsupported.
Starting OPA from systemd (when AGENT_INIT=no) was already supported.
Fixes: #7615
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
The error message when the kill command is executed with the container's
state == Stopped should be "container not running" because the containerd
tests expect that OCI runtimes return the error message and compare it.
If the error message is different from the expected one, the tests fail.
Fixes: #7650
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
We extend the `Result` and `Option` types with associated types that
allows converting a `Result<T, E>` and `Option<T>` into
`ttrpc::Result<T>`.
This allows the elimination of many `match` statements in favor of
calling the map function plus the `?` operator. This transformation
simplifies the code.
Fixes: #7624
Signed-off-by: Wedson Almeida Filho <walmeida@microsoft.com>
Fixes: #7573
To enable this feature, build your rootfs using AGENT_POLICY=yes. The
default is AGENT_POLICY=no.
Building rootfs using AGENT_POLICY=yes has the following effects:
1. The kata-opa service gets included in the Guest image.
2. The agent gets built using AGENT_POLICY=yes.
After this patch, the shim calls SetPolicy if and only if a Policy
annotation is attached to the sandbox/pod. When creating a sandbox/pod
that doesn't have an attached Policy annotation:
1. If the agent was built using AGENT_POLICY=yes, the new sandbox uses
the default agent settings, that might include a default Policy too.
2. If the agent was built using AGENT_POLICY=no, the new sandbox is
executed the same way as before this patch.
Any SetPolicy calls from the shim to the agent fail if the agent was
built using AGENT_POLICY=no.
If the agent was built using AGENT_POLICY=yes:
1. The agent reads the contents of a default policy file during sandbox
start-up.
2. The agent then connects to the OPA service on localhost and sends
the default policy to OPA.
3. If the shim calls SetPolicy:
a. The agent checks if SetPolicy is allowed by the current
policy (the current policy is typically the default policy
mentioned above).
b. If SetPolicy is allowed, the agent deletes the current policy
from OPA and replaces it with the new policy it received from
the shim.
A typical new policy from the shim doesn't allow any future SetPolicy
calls.
4. For every agent rpc API call, the agent asks OPA if that call
should be allowed. OPA allows or not a call based on the current
policy, the name of the agent API, and the API call's inputs. The
agent rejects any calls that are rejected by OPA.
When building using AGENT_POLICY_DEBUG=yes, additional Policy logging
gets enabled in the agent. In particular, information about the inputs
for agent rpc API calls is logged in /tmp/policy.txt, on the Guest VM.
These inputs can be useful for investigating API calls that might have
been rejected by the Policy. Examples:
1. Load a failing policy file test1.rego on a different machine:
opa run --server --addr 127.0.0.1:8181 test1.rego
2. Collect the API inputs from Guest's /tmp/policy.txt and test on the
machine where the failing policy has been loaded:
curl -X POST http://localhost:8181/v1/data/agent_policy/CreateContainerRequest \
--data-binary @test1-inputs.json
Signed-off-by: Dan Mihai <dmihai@microsoft.com>
Remove the installation step in the virtcontainers doc
because the virtcontainers install/uninstall targets have
been removed by 86723b51ae
and they are not used anymore.
Fixes: #7637
Signed-off-by: Manabu Sugimoto <Manabu.Sugimoto@sony.com>
Remove configuration file shared_fs = none warnings
now that there is a solution to updating configMaps, secrets etc
Fixes: #7210
Signed-off-by: stevenhorsman <steven@uk.ibm.com>