Commit Graph

471 Commits

Author SHA1 Message Date
Sebastien Boeuf
b4c3a2ffbd virtcontainers: fc: Stop the VM by killing the process
Because firecracker currently does not support a proper stop from
the caller, and because we don't want the agent to initiate a reboot
to shutdown the VM, the simplest and most efficient solution at the
moement is to signal the VM process with SIGTERM first, followed by
a SIGKILL if the process is still around.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-20 11:54:59 -08:00
Manohar Castelino
fba23796d6 firecracker: Add support for pseudo hotplug
Use the firecracker rescan logic to update the pre-attached drive.
This allows us to emulate hotplug.

Initially the drive backing stores are set to empty files on the
host. Once the actual block based device or file is available
swap the backing store.

The rescan needs to be issued iff the VM is running.

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2018-12-20 11:54:59 -08:00
Manohar Castelino
22ebc09f00 firecracker: Close the vsock vhostfd
Unlike QEMU firecracker cannot accept a fd as part of the REST API.
Close the vsock vhostfd close to the point where we launch the VM.

Note: This is still racy.

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2018-12-20 11:54:59 -08:00
Manohar Castelino
e65bafa793 virtcontainers: Add firecracker as a supported hypervisor
Add firecracker as a supported hypervisor. This connects the
newly defined firecracker implementation as a supported
hypervisor.

Move operation definition to the common hypervisor code.

Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2018-12-20 11:54:59 -08:00
Manohar Castelino
c1d3f1a98b firecracker: VMM API support
Initial Support for the firecracker VMM

Note:
- 9p is unsupported by firecracker
- Enable pseudo hotplug block device hotplug capability

Initially, this will be a pseudo capability for Firecracker hypervisor,
but we will utilize a pool of block devices and block device rescan as a
temporary workaround.

Fixes: #1064

Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
2018-12-20 11:54:49 -08:00
Sebastien Boeuf
a21d1e693f virtcontainers: cgroups: Don't error if no thread ID
In case the hypervisor implementation does not return any thread
ID, this should not issue any error since there is simply nothing
to constrain.

Fixes #1062

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-19 14:19:40 -08:00
Julio Montes
378d8157a6 virtcontainers: copy or bind mount shared file
Copy files to contaier's rootfs if hypervisor doesn't supports filesystem
sharing, otherwise bind mount them in the shared directory.

see #1031

Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-12-19 09:58:44 -06:00
Julio Montes
bc31844106 virtcontainers: Check file sharing support
If the hypervisor does not support filesystem sharing (for example, 9p),
files will be copied over gRPC using the copyFile request function.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-12-19 09:58:21 -06:00
Julio Montes
62917621c2 virtcontainers: copy files form host to guest
Files are copied over gRPC and there is no limit in size of the files that
can be copied. Small files are copied using just one gRPC call while big files
are copied by parts.

Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-12-19 09:55:25 -06:00
Eric Ernst
dcd48a9ca1 vc: capabilities: add capability flags for filesystem sharing
Not all hypervisors support filesystem sharing. Add capability flags to track
this. Since most hypervisor implementations in Kata *do* support this, the set
semantices are reversed (ie, set the flag if you do not support the feature).

Fixes: #1022

Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Signed-off-by: Julio Montes <julio.montes@intel.com>
2018-12-19 09:54:00 -06:00
Frank Cao
07a0b163f9 Merge pull request #1049 from sameo/topic/ctx-unset
virtcontainers: Add context when creating tests sandboxes
2018-12-19 14:43:16 +08:00
Sebastien Boeuf
0f1fde498d virtcontainers: network: Use multiqueue flag only when appropriate
The multiqueue flag associated with the TUNTAP network device cannot
be used if the number of queues indicates 0. When 0, this means the
multiqueue is not supported, and we cannot use the according flag.

Fixes #1051

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-18 11:06:06 -08:00
Samuel Ortiz
f63a18deea virtcontainers: Add context when creating tests sandboxes
We can use the background context when creating test sandboxes from the
sanbox unit tests. This shuts the "trace called before context set"
erros down.

Fixes: #1048

Signed-off-by: Samuel Ortiz <sameo@linux.intel.com>
2018-12-18 13:22:08 +01:00
Sebastien Boeuf
658bd82490 Merge pull request #1034 from Pennyzct/hvc
qemu-arm64: refactor 'console=hvc0,hvc1' for kata-agent debugging
2018-12-17 06:50:55 -08:00
Penny Zheng
c8c564bdd6 qemu-arm64: refactor 'console=hvc0,hvc1' for kata-agent debugging
Since kata-agent is using virtio-console to output debugging info
and the console ports are available in the guest as /dev/hvc0 and
/dev/hvc1, we should swap origin console type 'console=ttyAMA0'
with 'console=hvc0,hvc1'.

Fixes: #1033

Signed-off-by: Penny Zheng <penny.zheng@arm.com>
Signed-off-by: Wei Chen <Wei.Chen@arm.com>
2018-12-17 11:34:11 +08:00
Sebastien Boeuf
a1af1cb099 virtcontainers: network: Rely on hypervisor capabilities for multi queues
In order to properly setup the network, hence allocate or not multiple
queues, this commit makes sure that the hypervisor capabilities are
checked for this.

Fixes #1027

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-14 15:39:25 -08:00
Sebastien Boeuf
a227ab852a virtcontainers: hypervisor: Add capability regarding multiqueue support
Each hypervisor is different and supports different options regarding
the network interface it creates. In particular, the multiqueue option
is not supported by Firecracker and should not be assumed by default.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-14 15:11:44 -08:00
Sebastien Boeuf
0bcd221fad virtcontainers: network: Rename numCPUs to queues
The point of knowing the number of CPUs from the network perspective
is to determine the number of queues that can be allocated to the
network interface of the our virtual machine.

Therefore, it's more logical to name it queues from a network.go
perspective.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-14 15:08:55 -08:00
Sebastien Boeuf
2cb4bb9db7 virtcontainers: network: Reorganize endpoints interconnection
In order to prevent from future duplication of calls into the
hypervisor interface, the hypervisor is directly passed as part
of the xConnectVMNetwork() function. Because this does not apply
the disconnection case, this commit splits the former function
into two separate ones.

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-14 14:50:11 -08:00
James O. D. Hunt
bcf995bfe1 Merge pull request #887 from jcvenegas/sandbox-manage-resources
virtcontainers: make sandbox manage VM resources
2018-12-14 09:21:36 +00:00
Jose Carlos Venegas Munoz
d4586d4bcc test: remove TestHotplugRemoveMemory
HotplugRemoveMemory require to do a qmp call, but
unit test does not start a Qemu instance.

Depends-on: github.com/kata-containers/tests#1007

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-12-13 16:33:35 -06:00
Jose Carlos Venegas Munoz
0d80202573 vc:sandbox: rename newcontainer to fetchcontainer.
The containers is not new but fech from an existing one.

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-12-13 16:33:24 -06:00
Jose Carlos Venegas Munoz
618cfbf1db vc: sandbox: Let sandbox manage VM resources.
- Container only is responsable of namespaces and cgroups
inside the VM.

- Sandbox will manage VM resources.

The resouces has to be re-calculated and updated:

- Create new Container: If a new container is created the cpus and memory
may be updated.

- Container update: The update call will change the cgroups of a container.
the sandbox would need to resize the cpus and VM depending the update.

To manage the resources from sandbox the hypervisor interaface adds two methods.

- resizeMemory().

This function will be used by the sandbox to request
increase or decrease the VM memory.

- resizeCPUs()

vcpus are requested to the hypervisor based
on the sum of all the containers in the sandbox.

The CPUs calculations use the container cgroup information all the time.

This should allow do better calculations.

For example.

2 containers in a pod.

container 1 cpus = .5
container 2 cpus = .5

Now:
Sandbox requested vcpus 1

Before:
Sandbox requested vcpus 2

When a update request is done only some atributes have
information. If cpu and quota are nil or 0 we dont update them.

If we would updated them the sandbox calculations would remove already
removed vcpus.

This commit also moves the sandbox resource update call at container.update()
just before the container cgroups information is updated.

Fixes: #833

Signed-off-by: Jose Carlos Venegas Munoz <jose.carlos.venegas.munoz@intel.com>
2018-12-13 16:33:14 -06:00
Peng Tao
8444a7a99e factory: set guest time after resuming
We might have paused a guest for a long time so we need to sync
its time.

Fixes:#951
Signed-off-by: Peng Tao <bergwolf@gmail.com>
2018-12-12 12:54:16 +08:00
Julio Montes
976f5b2a6e Merge pull request #990 from alicefr/s390x
s390x: add support for s390x
2018-12-11 10:57:27 -06:00
Alice Frosi
6f83061139 s390x: add support for s390x
The PR adds the support for s390x.

In the case of CCW devices, the vhost-user devices are not supported.
See #659. An error message is thrown if they tried to be used.

Memory hotplug is not supported on s390 yet and an error message is thrown.

The VirtioNetPCI has been changed to VirtioNet. The generalization
allows to set the VirtioNet to the correct CCW device for s390x.

Fixes: #666

Co-authored-by: Yash D Jain ydjainopensource@gmail.com
Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-12-11 12:32:17 +01:00
fupan
20f2d30ab8 virtcontainers: share the agent's client between factory's VM and sandbox
When agent is configured as longLive, the VM's agent created
by factory will not close it's client once it connected, thus
the sandbox's agent cannot re-connect successfully.

Sharing the agent's client between VM's agent and sandbox
can fix this issue.

Fixes: #995

Signed-off-by: fupan <lifupan@gmail.com>
2018-12-10 18:28:08 +08:00
Xu Wang
408428edf4 Merge pull request #957 from teawater/cache
Block: Add cache-related options for block devices
2018-12-07 11:01:40 +08:00
Sebastien Boeuf
31b0db0892 Merge pull request #960 from alicefr/update_cid_vsock
Update cid vsock
2018-12-06 22:11:31 +00:00
James O. D. Hunt
ed6f7eb56a Merge pull request #938 from jodh-intel/trace-shim
shim: Add trace config option
2018-12-06 11:03:44 +00:00
Alice Frosi
deb6f16d82 virtcontainers: update context id of vsock to uint64
The CID of VSock needs to be change to uint64. Otherwise that leads to
an endianess issue. For more details see
https://github.com/kata-containers/runtime/issues/947

Remove the uint64 introduced by #984

Fixes: #958

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-12-06 10:13:30 +00:00
Hui Zhu
f6511471d4 block: Add cache-related options for block devices
Add block_device_cache_set, block_device_cache_direct and
block_device_cache_noflush.
They are cache-related options for block devices that are described in
https://github.com/qemu/qemu/blob/master/qapi/block-core.json.
block_device_cache_direct denotes whether use of O_DIRECT (bypass the host
page cache) is enabled.  block_device_cache_noflush denotes whether flush
requests for the device are ignored.
The json said they are supported since 2.9.
So add block_device_cache_set to control the cache options set to block
devices or not.  It will help to support the old version qemu.

Fixes: #956

Signed-off-by: Hui Zhu <teawater@hyper.sh>
2018-12-06 18:07:44 +08:00
Sebastien Boeuf
018c8c1468 vendor: Update govmm vendoring
Shortlog:

f9b31c0 qemu: Allow disable-modern option from QMP
d617307 Run tests for the s390x build
b36b5a8 Contributors: Add Clare Chen to CONTRIBUTORS.md
b41939c Contributors: Add my name
dab4cf1 qmp: Add tests
5ea6da1 Verify govmm builds on s390x
ee75813 contributors: add my name
c80fc3b qemu: Add s390x support
ca477a1 Update source file headers
e68e005 Update the CONTRIBUTING.md
2b7db54 Add the CONTRIBUTORS.md file
b3b765c qemu: test Valid for Vsock for Context ID
3becff5 qemu: change of ContextID from uint32 to uint64
f30fd13 qmp: Output error detail when execute QMP command failed
7da6a4c qmp: fix mem-path properties for hotplug memory.
e4892e3 qemu/qmp: preparation for s390x support
110d2fa qemu/qmp: add new function ExecuteBlockdevAddWithCache
a0b0c86 qmp_test: Change QMP version from 2.6 to 2.9
10c36a1 qemu: add support for pidfile option

Fixes #983

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-12-06 00:13:15 -08:00
James O. D. Hunt
ea74b981d9 shim: Add trace config option
Add a new `enable_tracing` option to `configuration.toml` to enable
shim tracing.

Fixes #934.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-12-05 15:20:13 +00:00
Frank Cao
6edb3618f6 Merge pull request #952 from alicefr/utils_linux_fix
virtcontainers: change uint32 to uint64 for ioctl
2018-11-30 09:56:39 +08:00
Alice Frosi
04ce4c05df virtcontainers: change uint32 to uint64 for ioctl
The PR changes the parameter args from uint32 to uint64 for ioctl function.
That leads to an endianess bug.

Fixes: #947

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-11-29 14:51:06 +00:00
Sebastien Boeuf
fa9b15dafe virtcontainers: Return the appropriate container status
When our runtime is asked for the container status, we also handle
the scenario where the container is stopped if the shim process for
that container on the host has terminated.

In the current implementation, we retrieve the container status
before stopping the container, causing a wrong status to be returned.
The wait for the original go-routine's completion was done in a defer
within the caller of statusContainers(), resulting in the
statusContainer()'s values to return the pre-stopped value.

This bug is first observed when updating to docker v18.09/containerd
v1.2.0. With the current implementation, containerd-shim receives the
TaskExit when it detects kata-shim is terminating. When checking the
container state, however, it does not get the expected "stopped" value.

The following commit resolves the described issue by simplifying the
locking used around the status container calls. Originally
StatusContainer would request a read lock. If we needed to update the
container status in statusContainer, we'd start a go-routine which
would request a read-write lock, waiting for the original read lock to
be released.  Can't imagine a bug could linger in this logic. We now
just request a read-write lock in the caller (StatusContainer),
skipping the need for a separate go-routine and defer. This greatly
simplifies the logic, and removes the original bug.

Fixes #926

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-28 20:10:34 -08:00
James O. D. Hunt
9984636f5a kata-env: Show runtime trace setting
Show whether runtime tracing is enabled in the output of `kata-env`.

Fixes #936.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2018-11-23 16:29:30 +00:00
j00444339
cba7a882aa virtcontainers: fix sandbox store struct VFIODevice bug
Since struct VFIODevice needed to be stored into disk by storeSandboxDevices() function,
however struct VFIODevice has a field named "vfioDevs", which is named begin with lower-case,
so it can't be written into file by json.Marshal.And this bug will will cause hotplug vfio
device can not been removed correctly while container exits.

Fixes: #924

Signed-off-by: flyflypeng <jiangpengfei9@huawei.com>
2018-11-20 08:40:19 -05:00
Sebastien Boeuf
ccc41d7363 Merge pull request #911 from alicefr/memory_hotplug
virtcontainers: Add function supportGuestMemoryHotplug
2018-11-19 20:17:32 +00:00
Alice Frosi
0796f2e5a0 virtcontainers: Add function supportGuestMemoryHotplug
This PR defines a new function supportGuestMemoryHotplug that
clearly defines if the architecture supports memory hotplug. The function
can be reimplemented in virtcontainers/qemu_$arch.go file for each
architecture.

Fixes: #910

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-11-19 11:22:22 +00:00
Alice Frosi
d73f27c612 test: set arch for test TestHotplugRemoveMemory
The arch field needs to be set

Signed-off-by: Alice Frosi <afrosi@de.ibm.com>
2018-11-19 11:21:14 +00:00
Ace-Tang
4b9a471f29 virtcontainers: fix not close socket with ethtool
close socket after use ethtool.NewEthtool()

Fixes: #919

Signed-off-by: Ace-Tang <aceapril@126.com>
2018-11-19 10:42:37 +08:00
Archana Shinde
23e75f0f03 Merge pull request #895 from caoruidong/multi-hotplug
network: support hotplug a nic several times
2018-11-15 14:33:41 -08:00
Sebastien Boeuf
982381bff0 api: Cleanup StartContainer()
Simple patch reducing the complexity of StartContainer().

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:56 -08:00
Sebastien Boeuf
57773816b3 sandbox: Create and export Pause/ResumeContainer() to the API level
In order to support use cases such as containerd-shim-v2 where
we would have a long running process holding the sandbox pointer,
there would be no reason to call into the stateless functions
PauseContainer() and ResumeContainer(), which would recreate a
new sandbox pointer and the corresponding ones for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:50 -08:00
Sebastien Boeuf
b298ec4228 sandbox: Create and export ProcessListContainer() to the API level
In order to support use cases such as containerd-shim-v2 where
we would have a long running process holding the sandbox pointer,
there would be no reason to call into the stateless function
ProcessListContainer(), which would recreate a new sandbox pointer
and the corresponding ones for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:44 -08:00
Sebastien Boeuf
3add296f78 sandbox: Create and export KillContainer() to the API level
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function KillContainer(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:37 -08:00
Sebastien Boeuf
76537265cb sandbox: Create and export StopContainer() to the API level
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StopContainer(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:31 -08:00
Sebastien Boeuf
109e12aa56 sandbox: Export Stop() to the API level
In order to support use cases such as containerd-shim-v2 where we
would have a long running process holding the sandbox pointer, there
would be no reason to call into the stateless function StopSandbox(),
which would recreate a new sandbox pointer and the corresponding ones
for containers.

Fixes #903

Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
2018-11-12 15:15:24 -08:00