Commit Graph

119 Commits

Author SHA1 Message Date
Sebastien Boeuf
ec0fd1b67a virtcontainers: sandbox: Add new getter to retrieve netns
As we want to call the OCI hook from the CLI, we need a way for the
CLI to figure out what is the network namespace used by the sandbox.
This is needed particularly because virtcontainers creates the netns
if none was provided.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-08-24 14:19:25 -07:00
Sebastien Boeuf
cb351dca10 network: Create network namespace from the CLI
This commit moves the network namespace creation out of virtcontainers
in order to anticipate the move of the OCI hooks to the CLI through a
follow up commit.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-08-24 14:19:23 -07:00
James O. D. Hunt
d0679a6fd1 tracing: Add tracing support to virtcontainers
Add additional `context.Context` parameters and `struct` fields to allow
trace spans to be created by the `virtcontainers` internal functions,
objects and sub-packages.

Note that not every function is traced; we can add more traces as
desired.

Fixes #566.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-08-22 08:24:58 +01:00
Ruidong Cao
1a17200cc8 virtcontainers: add sandbox hotplug network API
Add sandbox hotplug network API to meet design

Signed-off-by: Ruidong Cao <caoruidong@huawei.com>
2018-08-16 16:10:10 +08:00
Wei Zhang
6e6be98b15 devices: add interface "sandbox.AddDevice"
Fixes #50 .

Add new interface sandbox.AddDevice, then for Frakti use case, a device
can be attached to sandbox before container is created.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-08-15 15:24:12 +08:00
Sebastien Boeuf
16600efc1d Merge pull request #531 from WeiZhang555/bugfix
re-add: refactor device manager
2018-08-02 07:32:02 -07:00
Wei Zhang
b7464899ec devices: address some comments
Address some review comments:
* remove unnecessary rollback logics
* add vfio hot unplug handling.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-31 10:05:56 +08:00
Zhang Wei
44c37bf774 devices: rename VFIODrive to VFIODev
Rename VFIODrive to VFIODev, also rename device interface "GetDeviceDrive()" to
"GetDeviceInfo()".

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-07-31 10:05:56 +08:00
Wei Zhang
f905c16f21 device-manager: refactor device manger
Fixes #50

This commit imports a big logic change:
* host device to be attached or appended now is sandbox level resources,
one device should bind to sandbox/hypervisor first, then container could
reference it via device's unique ID.
* attach or detach device should go through the device manager interface
instead of the device interface.
* allocate device ID in global device mapper to guarantee every device
has a uniq device ID and there won't be any ID collision.

With this change, there will some changes on data format on disk for sandbox
and container, these changes also make a breakage of backward compatibility.

New persist data format:
* every sandbox will get a new "devices.json" file under "/run/vc/sbs/<sid>/"
which saves detailed device information, this also conforms to the concept that
device should be sandbox level resource.
* every container uses a "devices.json" file but with new data format:
```
[
  {
    "ID": "b80d4736e70a471f",
    "ContainerPath": "/dev/zero"
  },
  {
    "ID": "6765a06e0aa0897d",
    "ContainerPath": "/dev/null"
  }
]
```
`ID` should reference to a device in a sandbox, `ContainerPath` indicates device
path inside a container.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-07-31 10:03:57 +08:00
Wei Zhang
eec7fa394f devices: don't use drivers package directly.
Instead of using drivers.XXXDevice directly, we should use exported
struct from device structure. package drivers should be internal struct
and other package should avoid read it's struct content directly.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-31 09:59:29 +08:00
Wei Zhang
5db5f42b71 devices: remove interface VhostUserDevice
The interface "VhostUserDevice" has duplicate functions and fields with
Device, so we can merge them into one interface and manage them with one
group of interfaces.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-31 09:59:29 +08:00
Wei Zhang
1194154309 devices: use device manager to manage all devices
Fixes #50

Previously the devices are created with device manager and laterly
attached to hypervisor with "device.Attach()", this could work, but
there's no way to remember the reference count for every device, which
means if we plug one device to hypervisor twice, it's truly inserted
twice, but actually we only need to insert once but use it in many
places.

Use device manager as a consolidated entrypoint of device management can
give us a way to handle many "references" to single device, because it
can save all devices and remember it's use count.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-31 09:59:29 +08:00
James O. D. Hunt
763a1b6265 logging: Remove unnecessary fields and use standard names
Ensure the entire codebase uses `"sandbox"` and `"container"` log
fields for the sandboxID and containerID respectively.

Simplify code where fields can be dropped.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-07-30 15:32:41 +01:00
Sebastien Boeuf
927487c142 revert: "virtcontainers: support pre-add storage for frakti"
This PR got merged while it had some issues with some shim processes
being left behind after k8s testing. And because those issues were
real issues introduced by this PR (not some random failures), now
the master branch is broken and new pull requests cannot get the
CI passing. That's the reason why this commit revert the changes
introduced by this PR so that we can fix the master branch.

Fixes #529

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-07-27 09:39:56 -07:00
Wei Zhang
8391b20805 devices: address some comments
Address some review comments:
* remove unnecessary rollback logics
* add vfio hot unplug handling.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 14:15:52 +08:00
Zhang Wei
04f4f528f7 devices: rename VFIODrive to VFIODev
Rename VFIODrive to VFIODev, also rename device interface "GetDeviceDrive()" to
"GetDeviceInfo()".

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-07-26 14:15:52 +08:00
Wei Zhang
7f5989f06c device-manager: refactor device manger
Fixes #50

This commit imports a big logic change:
* host device to be attached or appended now is sandbox level resources,
one device should bind to sandbox/hypervisor first, then container could
reference it via device's unique ID.
* attach or detach device should go through the device manager interface
instead of the device interface.
* allocate device ID in global device mapper to guarantee every device
has a uniq device ID and there won't be any ID collision.

With this change, there will some changes on data format on disk for sandbox
and container, these changes also make a breakage of backward compatibility.

New persist data format:
* every sandbox will get a new "devices.json" file under "/run/vc/sbs/<sid>/"
which saves detailed device information, this also conforms to the concept that
device should be sandbox level resource.
* every container uses a "devices.json" file but with new data format:
```
[
  {
    "ID": "b80d4736e70a471f",
    "ContainerPath": "/dev/zero"
  },
  {
    "ID": "6765a06e0aa0897d",
    "ContainerPath": "/dev/null"
  }
]
```
`ID` should reference to a device in a sandbox, `ContainerPath` indicates device
path inside a container.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-07-26 14:09:53 +08:00
Wei Zhang
c08a26397e devices: don't use drivers package directly.
Instead of using drivers.XXXDevice directly, we should use exported
struct from device structure. package drivers should be internal struct
and other package should avoid read it's struct content directly.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 14:09:53 +08:00
Wei Zhang
b54df7e127 devices: remove interface VhostUserDevice
The interface "VhostUserDevice" has duplicate functions and fields with
Device, so we can merge them into one interface and manage them with one
group of interfaces.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 11:33:28 +08:00
Wei Zhang
2885eb0532 devices: use device manager to manage all devices
Fixes #50

Previously the devices are created with device manager and laterly
attached to hypervisor with "device.Attach()", this could work, but
there's no way to remember the reference count for every device, which
means if we plug one device to hypervisor twice, it's truly inserted
twice, but actually we only need to insert once but use it in many
places.

Use device manager as a consolidated entrypoint of device management can
give us a way to handle many "references" to single device, because it
can save all devices and remember it's use count.

Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
2018-07-26 11:33:28 +08:00
Eric Ernst
cd133dc9cb Merge pull request #509 from lifupan/kata-integration
virtconainers: rollback the NetNs when createNetwork failed
2018-07-24 08:13:16 -07:00
Eric Ernst
20066270b9 Merge pull request #503 from bergwolf/container
sandbox: change container slice to a map
2018-07-24 08:10:39 -07:00
Haomin Tsai
8939fd802f Merge pull request #351 from woshijpf/fix-no-kata-agent
virtcontainers: process the case that kata-agent doesn't start in VM
2018-07-24 19:47:08 +08:00
flyflypeng
7103c4f14a virtcontainers: add qemu process rollback
If some errors occur after qemu process start, then we need to
rollback to kill qemu process

Fixes: #297

Signed-off-by: flyflypeng <jiangpengfei9@huawei.com>
2018-07-24 21:36:57 +08:00
Peng Tao
f9d50723b9 sandbox: change container slice to a map
ContainerID is supposed to be unique within a sandbox. It is better to use
a map to describe containers of a sandbox.

Fixes: #502

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-24 12:17:02 +08:00
fupan
c6fda444b7 virtconainers: rollback the NetNs when createNetwork failed
When createNetwork failed, cleanup the NetNs if it created.

Fixes: #508

Signed-off-by: fupan <lifupan@gmail.com>
2018-07-24 12:09:13 +08:00
Peng Tao
d69fbcf17f sandbox: add stateful sandbox config
When enabled, do not release in memory sandbox resources in VC APIs,
and callers are expected to call sandbox.Release() to release the in
memory resources.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-23 09:54:02 +08:00
Peng Tao
7a6f205970 virtcontainers: keep qmp connection when possible
For each time a sandbox structure is created, we ensure s.Release()
is called. Then we can keep the qmp connection as long as Sandbox
pointer is alive.

All VC interfaces are still stateless as s.Release() is called before
each API returns.

OTOH, for VCSandbox APIs, FetchSandbox() must be paired with s.Release,
the same as before.

Fixes: #500

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-23 08:37:55 +08:00
Peng Tao
a7d888febc virtconainers: add SetFactory API
Add SetFactory to allow virtcontainers consumers to set a vm factory.
And use it to create new VMs whenever the factory is set.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-19 12:44:58 +08:00
Peng Tao
057214f0fe agent: prepare for vm factory
There are a few changes we need on kata agent to introduce vm factory
support:
1. decouple agent creation from sandbox config
2. setup agent without creating a sandbox
3. expose vm storage path and share mount point

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-19 12:44:55 +08:00
Peng Tao
18e6a6effc hypervisor: decouple hypervisor from sandbox
A hypervisor implementation does not need to depend on a sandbox
structure. Decouple them in preparation for vm factory.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-07-19 10:49:25 +08:00
fupan
114482ed99 api: To stop its monitor after a sandbox paused
After the sandbox is paused, it's needed to stop its monitor,
Otherwise, its monitors will receive timeout errors if it is
paused for a long time, thus its monitor will not tell it's a
crash caused timeout or just a paused timeout.

Fixes: #472

Signed-off-by: fupan <lifupan@gmail.com>
2018-07-06 19:40:43 +08:00
fupan
9155412b24 api: To watch the vm console in FetchSandbox api
When do sandbox release, the kataBuiltInProxy will
be closed, and it will stop the watch of vm's console;
Thus it needs to restart the proxy to monitor the vm
console once to restore the sandbox.

Fixes: #441

Signed-off-by: fupan <lifupan@gmail.com>
2018-06-26 08:04:33 +08:00
Sebastien Boeuf
fca7eb822d Merge pull request #429 from bergwolf/qmp
qemu: clean up qmp channel
2018-06-20 10:36:43 -07:00
Peng Tao
2b942524a2 sandbox: expose share sandbox pidns setting
So that we let callers decide if kata-agent should let all containers in
a sandbox share the same pid namespace.

This will be first used only by frakti. And kata cli can possibly use it
as well when cri-o and containerd-cri stop creating pause containers
and just pass the CreateSandbox CRI requests to kata.

Fixes: #426

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-06-20 20:51:22 +08:00
Peng Tao
8f329dbf48 qemu: clean up qmp channel
We only need one qmp channel and it is qemu internal detail thus
sandbox.go does not need to be aware of it.

Fixes: #428

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-06-20 17:58:54 +08:00
c00416947
8a6d383715 virtcontainers : fix shared dir resource remaining
Before this patch shared dir will reamin when sandox
has already removed, espacilly for kata-agent mod.

Do clean up shared dirs after all mounts are umounted.

Fixes: #291

Signed-off-by: Haomin <caihaomin@huawei.com>
2018-06-19 20:32:07 +08:00
Sebastien Boeuf
593bd44f20 Merge pull request #385 from amshinde/always-bind-back-physical-interfaces
network: Always bind back physical interfaces
2018-06-18 09:24:58 -07:00
Archana Shinde
f2d9632bc0 network: Always bind back physical interfaces
In case of physical network interfaces, we explicitly
pass through them to the VM. We need to bind them back to
the host driver when the sandbox is stopped, irrespective if
the network namespace has been created by virtcontainers or not.

Fixes #384

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2018-06-16 22:55:15 -07:00
Archana Shinde
4d470e513b shm: Create shared /dev/shm
This commit checks the size of "/dev/shm" for the sandbox container
which is then used to create the shared memory inside the guest.
kata agent then uses this size to set up a sandbox level ephemeral
storage for shm. The containers then simply bind mount this sandbox level
shm.

With this, we will now be able to support docker --shm-size option
as well have a shared shm within containers in a pod, since they are
supposed to be in the same IPC namespace.

Fixes #356

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2018-06-10 01:54:51 -07:00
Archana Shinde
d885782df1 namespace: Check if pid namespaces need to be shared
k8s provides a configuration for sharing PID namespace
among containers. In case of crio and cri plugin, an infra
container is started first. All following containers are
supposed to share the pid namespace of this container.

In case a non-empty pid namespace path is provided for a container,
we check for the above condition while creating a container
and pass this out to the kata agent in the CreatContainer
request as SandboxPidNs flag. We clear out the PID namespaces
in the configuration passed to the kata agent.

Fixes #343

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2018-05-30 13:34:24 -07:00
c00416947
7abb8fe326 virtcontainers: fix codes misunderstanding in virtcontainers
Still there are some codes left which
will cause some misunderstanding

Change `p` in short of `pod` into `s` or `sandbox`

Fixes: #325

Signed-off-by: Haomin <caihaomin@huawei.com>
2018-05-21 11:11:27 +08:00
Peng Tao
be82c7fc6f Merge pull request #299 from jshachm/implement-events-command
cli :Implement events command
2018-05-18 15:35:52 +08:00
c00416947
1205e347f2 cli: implement events command
Events cli display container events such as cpu,
memory, and IO usage statistics.

By now OOM notifications and intel RDT are not fully supproted.

Fixes: #186

Signed-off-by: Haomin <caihaomin@huawei.com>
2018-05-18 09:17:49 +08:00
Julio Montes
4527a8066a virtcontainers/qemu: honour CPU constrains
Don't fail if a new container with a CPU constraint was added to
a POD and no more vCPUs are available, instead apply the constraint
and let kernel balance the resources.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-05-14 17:33:31 -05:00
Julio Montes
81f376920e cli: implement update command
Update command is used to update container's resources at run time.
All constraints are applied inside the VM to each container cgroup.
By now only CPU constraints are fully supported, vCPU are hot added
or removed depending of the new constraint.

fixes #189

Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-05-08 07:26:38 -05:00
Zhang Wei
f4a453b86c virtcontainers: address some comments
* Move makeNameID() func to virtcontainers/utils file as it's a generic
function for making name and ID.
* Move bindDevicetoVFIO() and bindDevicetoHost() to vfio driver package.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-05-08 10:24:26 +08:00
Zhang Wei
366558ad5b virtcontainers: refactor device.go to device manager
Fixes #50

This is done for decoupling device management part from other parts.
It seperate device.go to several dirs and files:

```
virtcontainers/device
├── api
│   └── interface.go
├── config
│   └── config.go
├── drivers
│   ├── block.go
│   ├── generic.go
│   ├── utils.go
│   ├── vfio.go
│   ├── vhost_user_blk.go
│   ├── vhost_user.go
│   ├── vhost_user_net.go
│   └── vhost_user_scsi.go
└── manager
    ├── manager.go
    └── utils.go
```

* `api` contains interface definition of device management, so upper level caller
should import and use the interface, and lower level should implement the interface.
it's bridge to device drivers and callers.
* `config` contains structed exported data.
* `drivers` contains specific device drivers including block, vfio and vhost user
devices.
* `manager` exposes an external management package with a `DeviceManager`.

Signed-off-by: Zhang Wei <zhangwei555@huawei.com>
2018-05-08 10:24:26 +08:00
Peng Tao
1bb6ab9e22 api: add sandbox iostream API
It returns stdin, stdout and stderr stream of the specified process in
the container.

Fixes: #258

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-05-04 15:38:32 +08:00
Peng Tao
bf4ef4324e API: add sandbox winsizeprocess api
It sends tty resize request to the agent to resize a process's tty
window.

Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-05-04 15:38:32 +08:00