We only need one qmp channel and it is qemu internal detail thus
sandbox.go does not need to be aware of it.
Fixes: #428
Signed-off-by: Peng Tao <bergwolf@gmail.com>
Before this patch shared dir will reamin when sandox
has already removed, espacilly for kata-agent mod.
Do clean up shared dirs after all mounts are umounted.
Fixes: #291
Signed-off-by: Haomin <caihaomin@huawei.com>
Out CI is failing because of a recent change introduced in the
CNI plugins repo(github.com/containernetworking/plugins) that vendors in
CNI v0.7.0-alpha0. Refer to commit #e4fdb6cd1883b7b.
However, it looks like the the plugins themselves have not been
updated yet, causing failures in CI. This was verified by vendoring
in the latest CNI and CNI plugins in our repo.
Till the plugin binaries our fixed, use older version of CNI plugins
for testing virtcontainers. See this:
https://github.com/containernetworking/plugins/commit/68b4efb4056c
In any case we should keep this version
in sync with what we vendor in, in our runtime and not use the
latest commit.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
All calls to deleteNetNS were passing the "mounted" parameter as
true. So drop this parameter.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
In case of physical network interfaces, we explicitly
pass through them to the VM. We need to bind them back to
the host driver when the sandbox is stopped, irrespective if
the network namespace has been created by virtcontainers or not.
Fixes#384
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
This commit checks the size of "/dev/shm" for the sandbox container
which is then used to create the shared memory inside the guest.
kata agent then uses this size to set up a sandbox level ephemeral
storage for shm. The containers then simply bind mount this sandbox level
shm.
With this, we will now be able to support docker --shm-size option
as well have a shared shm within containers in a pod, since they are
supposed to be in the same IPC namespace.
Fixes#356
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Correct the document URLs which have gone stale.
The virtcontainers build status links have been moved to the top-level
README.
Fixes#376.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
in old specs.Spec, Capabilities is [] string, but we don't use CompatOCISpec
for compatibility in kataAgent/createContainer.
fixes#333
Signed-off-by: y00316549 <yangshukui@huawei.com>
Pause and resume container functions allow us to just pause/resume a
specific container not all the sanbox, in that way different containers
can be paused or running in the same sanbox.
Signed-off-by: Julio Montes <julio.montes@intel.com>
When a container is updated, those modifications are stored, to
avoid race conditions with other operations, a RW lock should be used.
fixes#346
Signed-off-by: Julio Montes <julio.montes@intel.com>
Since the vendoring included changes introducing PauseContainer
and ResumeContainer changes, fix the tests to satisfy the grpc api.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
k8s provides a configuration for sharing PID namespace
among containers. In case of crio and cri plugin, an infra
container is started first. All following containers are
supposed to share the pid namespace of this container.
In case a non-empty pid namespace path is provided for a container,
we check for the above condition while creating a container
and pass this out to the kata agent in the CreatContainer
request as SandboxPidNs flag. We clear out the PID namespaces
in the configuration passed to the kata agent.
Fixes#343
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Still there are some codes left which
will cause some misunderstanding
Change `p` in short of `pod` into `s` or `sandbox`
Fixes: #325
Signed-off-by: Haomin <caihaomin@huawei.com>
Events cli display container events such as cpu,
memory, and IO usage statistics.
By now OOM notifications and intel RDT are not fully supproted.
Fixes: #186
Signed-off-by: Haomin <caihaomin@huawei.com>
Don't fail if a new container with a CPU constraint was added to
a POD and no more vCPUs are available, instead apply the constraint
and let kernel balance the resources.
Signed-off-by: Julio Montes <julio.montes@intel.com>
There is a relation between the maximum number of vCPUs and the
memory footprint, if QEMU maxcpus option and kernel nr_cpus
cmdline argument are big, then memory footprint is big, this
issue only occurs if CPU hotplug support is enabled in the kernel,
might be because of kernel needs to allocate resources to watch all
sockets waiting for a CPU to be connected (ACPI event).
For example
```
+---------------+-------------------------+
| | Memory Footprint (KB) |
+---------------+-------------------------+
| NR_CPUS=240 | 186501 |
+---------------+-------------------------+
| NR_CPUS=8 | 110684 |
+---------------+-------------------------+
```
In order to do not affect CPU hotplug and allow to users to have containers
with the same number of physical CPUs, this patch tries to mitigate the
big memory footprint by using the actual number of physical CPUs as the
maximum number of vCPUs for each container if `default_maxvcpus` is <= 0 in
the runtime configuration file, otherwise `default_maxvcpus` is used as the
maximum number of vCPUs.
Before this patch a container with 256MB of RAM
```
total used free shared buff/cache available
Mem: 195M 40M 113M 26M 41M 112M
Swap: 0B 0B 0B
```
With this patch
```
total used free shared buff/cache available
Mem: 236M 11M 188M 26M 36M 186M
Swap: 0B 0B 0B
```
fixes#295
Signed-off-by: Julio Montes <julio.montes@intel.com>
Reduce the virtcontainers prefix path to avoid hitting the 107 byte
Unix domain socket path limit.
Related #268.
Fixes#290.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
A Unix domain socket is limited to 107 usable bytes on Linux. However,
not all code creating socket paths was checking for this limits.
Created a new `utils.BuildSocketPath()` function (with tests) to
encapsulate the logic and updated all code creating sockets to use it.
Fixes#268.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
An empty string for an environment variable simply means that the
variable is unset. Do not error out if the env value is empty.
Fixes#288
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Update command is used to update container's resources at run time.
All constraints are applied inside the VM to each container cgroup.
By now only CPU constraints are fully supported, vCPU are hot added
or removed depending of the new constraint.
fixes#189
Signed-off-by: Julio Montes <julio.montes@intel.com>
* Move makeNameID() func to virtcontainers/utils file as it's a generic
function for making name and ID.
* Move bindDevicetoVFIO() and bindDevicetoHost() to vfio driver package.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
CreateDevice() is only used by `NewDevices()` so we can make it private and
there's no need to export it.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Fixes#50
This is done for decoupling device management part from other parts.
It seperate device.go to several dirs and files:
```
virtcontainers/device
├── api
│ └── interface.go
├── config
│ └── config.go
├── drivers
│ ├── block.go
│ ├── generic.go
│ ├── utils.go
│ ├── vfio.go
│ ├── vhost_user_blk.go
│ ├── vhost_user.go
│ ├── vhost_user_net.go
│ └── vhost_user_scsi.go
└── manager
├── manager.go
└── utils.go
```
* `api` contains interface definition of device management, so upper level caller
should import and use the interface, and lower level should implement the interface.
it's bridge to device drivers and callers.
* `config` contains structed exported data.
* `drivers` contains specific device drivers including block, vfio and vhost user
devices.
* `manager` exposes an external management package with a `DeviceManager`.
Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
Store the PCI address of rootfs in case the rootfs is block
based and passed using virtio-block.
This helps up get rid of prdicting the device name inside the
container for the block device. The agent will determine the device
node name using the PCI address.
Fixes#266
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Store PCI address for a block device on hotplugging it via
virtio-blk. This address will be passed by kata agent in the
device "Id" field. The agent within the guest can then use this
to identify the PCI slot in the guest and create the device node
based on it.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
We need to store the bridge address to state to use it
for assigning addresses to devices attached to teh bridge.
So we need to make sure that the bridge pointer is assigned
the address.
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>