In order to fix#1059, we want to create a hypervisor package. Some of
the hypervisor implementations (qemu) depend on the network and endpoint
interfaces. We can not have a virtcontainers -> hypervisor -> network,
endpoint -> virtcontainers cyclic dependency.
So before creating the hypervisor package, we need to decouple the
network API from the virtcontainers one.
Fixes: #1180
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
There's only one real implementer of the network interface and no real
need to implement anything else. We can just go ahead and remove this
abstraction.
Fixes: #1179
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
The agent code creates a directory at
`/run/kata-containers/shared/sandboxes/sbid/` to hold shared data
between host and guest. We need to clean it up when removing a sandbox.
Fixes: #1138
Signed-off-by: Peng Tao <bergwolf@gmail.com>
In order to move the hypervisor implementations into their own package,
we need to put the asset type into the types package and break the
hypervisor->asset->virtcontainers->hypervisor cyclic dependency.
Fixes: #1119
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We always call waitSandbox after we start the VM (startSandbox), so
let's simplify the hypervisor interface and integrate waiting for the VM
into startSandbox.
This makes startSandbox a blocking call, but that is practically the
case today.
Fixes: #1009
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We always combine the hypervisor init and createSandbox, because what
we're trying to do is simply that: Set the hypervisor and have it create
a sandbox.
Instead of keeping a method with vague semantics, remove init and
integrate the actual hypervisor setup phase into the createSandbox one.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We can now remove all the sandbox shared types and convert the rest of
the code to using the new internal types package.
This commit includes virtcontainers, cli and containerd-shim changes in
one atomic change in order to not break bisect'ibility.
Fixes: #1095
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Since we're going to have both external and internal types packages, we
alias the external one as vcTypes. And the internal one will be usable
through the types namespace.
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Pass Seccomp profile to the agent only if
the configuration.toml allows it to be passed
and the agent/image is seccomp capable.
Fixes: #688
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
cri containerd calls kill on stopped sandbox and if we
fail the call, it can cause `cri stopp` command to fail
too.
Fixes: #1084
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Now that stopVM() also calls agent.stopSandbox(), we can have the
sandbox Stop() call using stopVM() directly and avoid code duplication.
Fixes: #1011
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
We always ask the agent to start the sandbox when we start the VM, so we
should simply call agent.startSandbox from startVM instead of open
coding those.
This slightly simplifies the complex createSandboxFromConfig routine.
Fixes: #1011
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
This includes cleaning up the sandbox on disk resources,
and closing open fds when preparing the hypervisor.
Fixes: #1057
Signed-off-by: Peng Tao <bergwolf@gmail.com>
- Container only is responsable of namespaces and cgroups
inside the VM.
- Sandbox will manage VM resources.
The resouces has to be re-calculated and updated:
- Create new Container: If a new container is created the cpus and memory
may be updated.
- Container update: The update call will change the cgroups of a container.
the sandbox would need to resize the cpus and VM depending the update.
To manage the resources from sandbox the hypervisor interaface adds two methods.
- resizeMemory().
This function will be used by the sandbox to request
increase or decrease the VM memory.
- resizeCPUs()
vcpus are requested to the hypervisor based
on the sum of all the containers in the sandbox.
The CPUs calculations use the container cgroup information all the time.
This should allow do better calculations.
For example.
2 containers in a pod.
container 1 cpus = .5
container 2 cpus = .5
Now:
Sandbox requested vcpus 1
Before:
Sandbox requested vcpus 2
When a update request is done only some atributes have
information. If cpu and quota are nil or 0 we dont update them.
If we would updated them the sandbox calculations would remove already
removed vcpus.
This commit also moves the sandbox resource update call at container.update()
just before the container cgroups information is updated.
Fixes: #833
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
In order to support use cases such as containerd-shim-v2 where
we would have a long running process holding the sandbox pointer,
there would be no reason to call into the stateless functions
PauseContainer() and ResumeContainer(), which would recreate a
new sandbox pointer and the corresponding ones for containers.
Fixes#903
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support use cases such as containerd-shim-v2 where
we would have a long running process holding the sandbox pointer,
there would be no reason to call into the stateless function
ProcessListContainer(), which would recreate a new sandbox pointer
and the corresponding ones for containers.
Fixes#903
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function KillContainer(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.
Fixes#903
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StopContainer(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.
Fixes#903
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StopSandbox(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.
Fixes#903
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StartSandbox(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.
Fixes#903
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Now that Interface structure includes the useful information about
the type of interface, Kata does not need to do any assumption about
the type of interface that needs to be added.
Fixes#866
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
This commit replaces every place where the "types" package from the
Kata agent was used, with the new "types" package from virtcontainers.
In order to do so, it introduces a few translation functions between
the agent and virtcontainers types, since this is needed by the kata
agent implementation.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
After we scan the netns, we should hotplug the network interface to
the guest after it is kicked off running.
Fixes: #871
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Fixes#344
Add host cgroup support for kata.
This commits only adds cpu.cfs_period and cpu.cfs_quota support.
It will create 3-level hierarchy, take "cpu" cgroup as an example:
```
/sys/fs/cgroup
|---cpu
|---kata
|---<sandbox-id>
|--vcpu
|---<sandbox-id>
```
* `vc` cgroup is common parent for all kata-container sandbox, it won't be removed
after sandbox removed. This cgroup has no limitation.
* `<sandbox-id>` cgroup is the layer for each sandbox, it contains all other qemu
threads except for vcpu threads. In future, we can consider putting all shim
processes and proxy process here. This cgroup has no limitation yet.
* `vcpu` cgroup contains vcpu threads from qemu. Currently cpu quota and period
constraint applies to this cgroup.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Signed-off-by: Jingxiao Lu <lujingxiao@huawei.com>
Some agent types definition that were generic enough to be reused
everywhere, have been split from the initial grpc package.
This prevents from importing the entire protobuf package through
the grpc one, and prevents binaries such as kata-netmon to stay
in sync with the types definitions.
Fixes#856
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Refactor these functions so differernt types of endpoints can use a unified
function to hotplug nics.
Fixes#731
Signed-off-by: Ruidong Cao <caoruidong@huawei.com>
This reverts commit 8a6d383715.
Don't remove all directories in the shared directory because
`docker cp` re-mounts all the mount points specified in the
config.json causing serious problems in the host.
fixes#777
Signed-off-by: Julio Montes <julio.montes@intel.com>
Add configuration to decide the amount of slots that will be used in a VM
- This will limit the amount of times that memory can be hotplugged.
- Use memory slots provided by user.
- tests: aling struct
cli: kata-env: Add memory slots info.
- Show the slots to be added to the VM.
```diff
[Hypervisor]
MachineType = "pc"
Version = "QEMU ..."
Path = "/opt/kata/bin/qemu-system-x86_64"
BlockDeviceDriver = "virtio-scsi"
Msize9p = 8192
+ MemorySlots = 10
Debug = false
UseVSock = false
```
Fixes: #751
Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
Add support for cgroup driver systemd.
systemd cgroup is not applied in the VM since in some cases like initrd images
there is no systemd running and nobody can update a systemd cgroup using
systemctl.
fixes#596
Signed-off-by: Julio Montes <julio.montes@intel.com>
Get and store guest details after sandbox is completely created.
And get memory block size from sandbox state file when check
hotplug memory valid.
Signed-off-by: Clare Chen <clare.chenhui@huawei.com>
Signed-off-by: Zichang Lin <linzichang@huawei.com>
Because the network monitor will be listening to every event received
through the netlink socket, it will be notified everytime a new link
will be added/updated/modified in the network namespace it's running
into. The goal being to detect new interface added by Docker such as
a veth pair.
The problem is that kata-runtime will add other internal interfaces
when the network monitor will ask for the addition of the new veth
pair. And we need a way to ignore those new interfaces being created
as they relate to the veth pair that is being added. That's why, in
order to prevent from running into an infinite loop, virtcontainers
needs to tag the internal interfaces with the "kata" suffix so that
the network monitor will be able to ignore them.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The PR moves ahead the start of proxy process for vm factory so that
it waits for both vm and proxy to be up at the same time. This saves
about 300ms for new container creation in my local test machine.
Fixes: #683
Signed-off-by: Peng Tao <bergwolf@gmail.com>
We can just use hyprvisor config to specify the memory size
of a guest. There is no need to maintain the extra place just
for memory size.
Fixes: #692
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Fixes#635
When container rootfs is block based in devicemapper use case, we can re-use
sandbox device manager to manage rootfs block plug/unplug, we don't detailed
description of block in container state file, instead we only need a Block index
referencing sandbox device.
Remove `HotpluggedDrive` and `RootfsPCIAddr` from state file because it's not
necessary any more.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
For vm factory, we also need netns to be set otherwise we fail to
create new VMs in `s.network.run`.
Fixes: #681
Signed-off-by: Peng Tao <bergwolf@gmail.com>
If the sandbox has been initialized with a factory, this means the
caller should be in charge of adding any network to the VM, and
virtcontainers library cannot make any assumptions about adding
the default underlying network.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
The CLI being the implementation of the OCI specification, and the
hooks being OCI specific, it makes sense to move the handling of any
OCI hooks to the CLI level. This changes allows the Kata API to
become OCI agnostic.
Fixes#599
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>