kata-containers

mirror of https://github.com/aljazceru/kata-containers.git synced 2025-12-23 17:24:18 +01:00

Author	SHA1	Message	Date
Julio Montes	cc29b8d4b6	Merge pull request #607 from amshinde/pass-disk-as-shared Pass qemu --share-rw option for hotplugging disks	2018-08-24 13:09:16 -05:00
Sebastien Boeuf	a1787da97c	virtcontainers: qemu: Don't shutdown QMP from hotplug The QMP shutdown is taken care of by the sandbox release, through a call to hypervisor.disconnect(). By shutting down the QMP at the qemu level directly, we are creating some unrecoverable errors by trying to close an already closed channel. This patch simply removes the faulty code, following the same design as the other hotplug functions. Fixes #627 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-08-23 15:54:02 -07:00
James O. D. Hunt	d0679a6fd1	tracing: Add tracing support to virtcontainers Add additional `context.Context` parameters and `struct` fields to allow trace spans to be created by the `virtcontainers` internal functions, objects and sub-packages. Note that not every function is traced; we can add more traces as desired. Fixes #566. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2018-08-22 08:24:58 +01:00
Julio Montes	d6a773c90c	Merge pull request #595 from amshinde/use-main-bus-for-hotplug vfio: Add configuration to support VFIO hotplug on root bus	2018-08-21 11:09:49 -05:00
Archana Shinde	31e2925a9a	vfio: Add configuration to support VFIO hotplug on root bus We need this configuration due to a limitation in seabios firmware in handling hotplug for PCI devices with large BARS. Long term, this needs to be fixed in the firmware. Fixes #594 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-08-20 11:36:21 -07:00
Archana Shinde	70edc56fc1	disk: Pass the --share-rw option for hotplugging disks With qemu 2.10, a write lock was added for qcow images that prevents the same image from being passed more than once. This can be overridden using the --share-rw option, which is desired for raw images. This solves an issue with running Kata with devicemapper in privileged mode: in this case all devices on the host are passed to the container, including the block device associated with the rootfs, causing it to be passed twice to qemu. Fixes #606 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-08-17 15:08:33 -07:00
Ruidong Cao	1a17200cc8	virtcontainers: add sandbox hotplug network API Add sandbox hotplug network API to meet design Signed-off-by: Ruidong Cao <caoruidong@huawei.com>	2018-08-16 16:10:10 +08:00
Peng Tao	bd5076101c	qemu: create vm directory before launching qemu Right now we create it in `createsandbox` and it would create the vm dir unnecessarily for fetchsandbox() and it ends up leaving an empty vm dir behind even after DeleteSandbox. Fixes: #547 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-08-03 16:40:02 +08:00
Peng Tao	568b65c275	qemu: remove redundant code It looks to be left over due to merge conflicts. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-08-03 16:28:56 +08:00
Sebastien Boeuf	16600efc1d	Merge pull request #531 from WeiZhang555/bugfix re-add: refactor device manager	2018-08-02 07:32:02 -07:00
Fupan Li	15860185d9	virtcontainers: fix the issue of cleanup the vm's path Use filepath.Join() instead of simple string concatenation to form the file path; otherwise the "/" between the two parts is lost. Fixes #543. Signed-off-by: Fupan Li <lifupan@gmail.com>	2018-08-02 16:21:55 +08:00
James O. D. Hunt	fc0142ec8e	Merge pull request #527 from jodh-intel/remove-initcall-debug-kernel-option kernel: Remove initcall_debug boot option	2018-08-01 12:50:52 +01:00
James O. D. Hunt	a8f5e2becf	kernel: Remove initcall_debug boot option Remove the `initcall_debug` boot option from the kernel command-line as we don't need it any more and it generates a ton of boot messages that may well be impacting performance. Fixes #526. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2018-08-01 09:52:13 +01:00
James O. D. Hunt	487f9efa57	Merge pull request #536 from bergwolf/qmp_clear qemu: clear qmp state before wait for qemu process	2018-08-01 09:51:43 +01:00
Peng Tao	b200163de9	kata_agent: send sandbox id in CreateSandbox request And do not append sandbox id to kernel arguments since that would fail qemu args comparison in vm factory. Fixes: #523 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-08-01 11:18:44 +08:00
Julio Montes	052769196d	virtcontainers: implement function to cold plug vsocks The `appendVSockPCI` function can be used to cold plug vsocks. The vhost file descriptor holds the context ID and is inherited by the QEMU process; the ID must be unique, and disable-modern prevents qemu from relying on fast MMIO. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-07-31 13:52:44 -05:00
Peng Tao	44a3a441aa	qemu: wait on disconnected channel in qmp shutdown That is how govmm ensures us that the qmp channel has been cleaned up entirely. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-31 18:34:37 +08:00
Peng Tao	c8b4fabc37	qemu: clear qmp state before wait for qemu process So that if there is any remaining state, we do not let it interfere with the new one. This should fix the occasional vm factory hang. Fixes: #535 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-31 11:48:40 +08:00
Zhang Wei	44c37bf774	devices: rename VFIODrive to VFIODev Rename VFIODrive to VFIODev, also rename device interface "GetDeviceDrive()" to "GetDeviceInfo()". Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2018-07-31 10:05:56 +08:00
Wei Zhang	5db5f42b71	devices: remove interface VhostUserDevice The interface "VhostUserDevice" has duplicate functions and fields with Device, so we can merge them into one interface and manage them with one group of interfaces. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-31 09:59:29 +08:00
Wei Zhang	1194154309	devices: use device manager to manage all devices Fixes #50 Previously, devices were created with the device manager and later attached to the hypervisor with "device.Attach()". This could work, but there was no way to track the reference count for each device: if we plugged one device into the hypervisor twice, it was really inserted twice, when we only need to insert it once and use it in many places. Using the device manager as a consolidated entry point for device management gives us a way to handle many "references" to a single device, because it can store all devices and remember their use counts. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-31 09:59:29 +08:00
Sebastien Boeuf	927487c142	revert: "virtcontainers: support pre-add storage for frakti" This PR got merged while it had some issues with some shim processes being left behind after k8s testing. Because those were real issues introduced by this PR (not random failures), the master branch is now broken and new pull requests cannot get the CI passing. This commit therefore reverts the changes introduced by this PR so that we can fix the master branch. Fixes #529 Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>	2018-07-27 09:39:56 -07:00
Zhang Wei	04f4f528f7	devices: rename VFIODrive to VFIODev Rename VFIODrive to VFIODev, also rename device interface "GetDeviceDrive()" to "GetDeviceInfo()". Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2018-07-26 14:15:52 +08:00
Wei Zhang	b54df7e127	devices: remove interface VhostUserDevice The interface "VhostUserDevice" has duplicate functions and fields with Device, so we can merge them into one interface and manage them with one group of interfaces. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-26 11:33:28 +08:00
Wei Zhang	2885eb0532	devices: use device manager to manage all devices Fixes #50 Previously, devices were created with the device manager and later attached to the hypervisor with "device.Attach()". This could work, but there was no way to track the reference count for each device: if we plugged one device into the hypervisor twice, it was really inserted twice, when we only need to insert it once and use it in many places. Using the device manager as a consolidated entry point for device management gives us a way to handle many "references" to a single device, because it can store all devices and remember their use counts. Signed-off-by: Wei Zhang <zhangwei555@huawei.com>	2018-07-26 11:33:28 +08:00
Peng Tao	7a6f205970	virtcontainers: keep qmp connection when possible For each time a sandbox structure is created, we ensure s.Release() is called. Then we can keep the qmp connection as long as Sandbox pointer is alive. All VC interfaces are still stateless as s.Release() is called before each API returns. OTOH, for VCSandbox APIs, FetchSandbox() must be paired with s.Release, the same as before. Fixes: #500 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-23 08:37:55 +08:00
Peng Tao	c9bd12aa19	qemu: cleanup qmp channel setup and teardown Unify qmp channel setup and teardown. This also fixes the issue that sometimes qmp pointer is not reset after qmp is shutdown. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-23 08:36:58 +08:00
Peng Tao	28b6104710	qemu: prepare for vm templating support 1. support qemu migration save operation 2. setup vm templating parameters per hypervisor config 3. create vm storage path when it does not exist. This can happen when an empty guest is created without a sandbox. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-19 12:44:58 +08:00
Peng Tao	7f20dd89a3	hypervisor: cleanup valid method The boolean return value is not necessary. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-19 10:49:25 +08:00
Peng Tao	18e6a6effc	hypervisor: decouple hypervisor from sandbox A hypervisor implementation does not need to depend on a sandbox structure. Decouple them in preparation for vm factory. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-19 10:49:25 +08:00
Peng Tao	4ac675453f	qemu: remove append9PVolumes It is not used and we actually cannot append multiple 9pfs volumes to a guest. Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-19 10:49:25 +08:00
Sebastien Boeuf	cd842afca4	Merge pull request #417 from nitkon/maxmem virtcontainers: Set ppc64le maxmem depending on qemu version	2018-07-09 12:07:12 -07:00
Peng Tao	66a3e812f2	hypervisor/qemu: add memory hotplug support So that we can add more memory to an existing guest. Fixes: #469 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-07-09 15:29:50 +08:00
James O. D. Hunt	793a22083c	qemu: Pass sandboxID to agent for logging purposes Add a kernel command-line option that the agent can read to determine the sandbox ID of the VM. It can use this to create a `sandbox=` log field for improved log analysis. Fixes #465. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2018-07-04 13:50:06 +01:00
Peng Tao	8f329dbf48	qemu: clean up qmp channel We only need one qmp channel and it is qemu internal detail thus sandbox.go does not need to be aware of it. Fixes: #428 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-06-20 17:58:54 +08:00
Nitesh Konkar	d0bccabbe1	virtcontainers: Set ppc64le maxmem depending on qemu version The "Failed to allocate HTAB of requested size, try with smaller maxmem" error in ppc64le occurs when maxmem allocated is very high. This got fixed in qemu 2.10 and kernel 4.11. Hence put a maxmem restriction of 32GB per kata-container if qemu version less than 2.10 Fixes: #415 Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>	2018-06-19 19:48:18 +05:30
Nitesh Konkar	baa553da07	virtcontainers: Get qemu support for ppc64le Fixes #302 Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>	2018-05-31 18:40:43 +05:30
Nitesh Konkar	4276c0c38e	virtcontainers/cli: refactor code Fixes #302 Signed-off-by: Nitesh Konkar <niteshkonkar@in.ibm.com>	2018-05-31 17:58:35 +05:30
Julio Montes	4527a8066a	virtcontainers/qemu: honour CPU constraints Don't fail if a new container with a CPU constraint was added to a POD and no more vCPUs are available; instead, apply the constraint and let the kernel balance the resources. Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-05-14 17:33:31 -05:00
Julio Montes	07db945b09	virtcontainers/qemu: reduce memory footprint There is a relation between the maximum number of vCPUs and the memory footprint: if the QEMU maxcpus option and the kernel nr_cpus cmdline argument are big, then the memory footprint is big. This issue only occurs if CPU hotplug support is enabled in the kernel, possibly because the kernel needs to allocate resources to watch all sockets waiting for a CPU to be connected (ACPI event). For example: ``` +---------------+-------------------------+ | | Memory Footprint (KB) | +---------------+-------------------------+ | NR_CPUS=240 | 186501 | +---------------+-------------------------+ | NR_CPUS=8 | 110684 | +---------------+-------------------------+ ``` In order not to affect CPU hotplug, and to allow users to have containers with as many vCPUs as there are physical CPUs, this patch tries to mitigate the big memory footprint by using the actual number of physical CPUs as the maximum number of vCPUs for each container if `default_maxvcpus` is <= 0 in the runtime configuration file; otherwise `default_maxvcpus` is used as the maximum number of vCPUs. Before this patch, a container with 256MB of RAM: ``` total used free shared buff/cache available Mem: 195M 40M 113M 26M 41M 112M Swap: 0B 0B 0B ``` With this patch: ``` total used free shared buff/cache available Mem: 236M 11M 188M 26M 36M 186M Swap: 0B 0B 0B ``` fixes #295 Signed-off-by: Julio Montes <julio.montes@intel.com>	2018-05-14 17:33:31 -05:00
James O. D. Hunt	bce9edd277	socket: Enforce socket length A Unix domain socket is limited to 107 usable bytes on Linux. However, not all code creating socket paths was checking for this limit. Created a new `utils.BuildSocketPath()` function (with tests) to encapsulate the logic and updated all code creating sockets to use it. Fixes #268. Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>	2018-05-09 11:36:24 +01:00
Zhang Wei	366558ad5b	virtcontainers: refactor device.go to device manager Fixes #50 This is done to decouple the device management part from other parts. It separates device.go into several directories and files: ``` virtcontainers/device ├── api │ └── interface.go ├── config │ └── config.go ├── drivers │ ├── block.go │ ├── generic.go │ ├── utils.go │ ├── vfio.go │ ├── vhost_user_blk.go │ ├── vhost_user.go │ ├── vhost_user_net.go │ └── vhost_user_scsi.go └── manager ├── manager.go └── utils.go ``` * `api` contains the interface definition of device management, so upper-level callers should import and use the interface, and the lower level should implement it. It is the bridge between device drivers and callers. * `config` contains structured exported data. * `drivers` contains specific device drivers including block, vfio and vhost user devices. * `manager` exposes an external management package with a `DeviceManager`. Signed-off-by: Zhang Wei <zhangwei555@huawei.com>	2018-05-08 10:24:26 +08:00
Archana Shinde	718dbd2a71	device: Assign pci address for block devices Introduce a new field in Drive to store the PCI address if the drive is attached using virtio-blk. Assign PCI address in the format bridge-addr/device-addr. Since we need to assign the address while hotplugging, pass Drive by address. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-05-03 10:59:09 -07:00
Archana Shinde	dd927921c1	qemu: Return bridge itself with addDeviceToBridge instead of bridge bus Change the function to return the bridge itself that the device is attached to. This will allow bridge address to be used for determining the PCI slot of the device within the guest. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-05-03 10:59:08 -07:00
Archana Shinde	05c4ea39d0	qemu: Pass the pci/e address for qemu bridge Pass the slot address while attaching bridges. This is needed to determine the pci/e address of devices that are attached to the bridge. Fixes #210 Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-04-19 10:42:19 -07:00
Graham whaley	d6c3ec864b	license: SPDX: update all vc files to use SPDX style When imported, the vc files carried in the 'full style' apache license text, but the standard for kata is to use SPDX style. Update the relevant files to SPDX. Fixes: #227 Signed-off-by: Graham whaley <graham.whaley@intel.com>	2018-04-18 13:43:15 +01:00
Peng Tao	6107694930	runtime: rename pod to sandbox As agreed in [the kata containers API design](https://github.com/kata-containers/documentation/blob/master/design/kata-api-design.md), we need to rename the pod notion to sandbox. The patch is a bit big but the actual change is done through the script: ``` sed -i -e 's/pod/sandbox/g' -e 's/Pod/Sandbox/g' -e 's/POD/SB/g' ``` The only exceptions are the `pod_sandbox` and `pod_container` annotations: since we already pushed them to cri shims, we have to use them unchanged. Fixes: #199 Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-04-13 09:32:51 +08:00
Archana Shinde	82e42b5dc5	qemu: iothreads: Add iothread support for scsi Add a hypervisor configuration to specify if IO should be handled in a separate thread. Add support for iothreads for virtio-scsi for now. Since we attach all scsi drives to the same scsi controller, all the drives will be handled in a separate IO thread, which would still give better performance. Going forward, we need to assess, based on more performance analysis, whether to add more controllers, attach iothreads to each of them, and distribute drives among the scsi controllers. Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>	2018-03-30 17:52:20 -07:00
Peng Tao	01f7e46984	Merge pull request #98 from bergwolf/initrd support to boot guest with an initrd image	2018-03-28 19:04:14 +08:00
Peng Tao	423e86405e	qemu: refactor createPod() To fix CI complaints: virtcontainers/qemu.go:248::warning: cyclomatic complexity 18 of function (*qemu).createPod() is high (> 15) (gocyclo) Signed-off-by: Peng Tao <bergwolf@gmail.com>	2018-03-27 15:58:41 +08:00
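
Commit 15860185d9 in the log above replaces plain string appending with `filepath.Join()` when forming the VM path. A minimal sketch of why that matters is given here; the function names and the `/run/vc/vm` and `sandbox-1` values are hypothetical and are not taken from the virtcontainers source.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// vmPathConcat mimics the kind of bug the commit describes: plain
// concatenation does not insert a separator between the two parts.
// (Hypothetical helper, for illustration only.)
func vmPathConcat(base, id string) string {
	return base + id
}

// vmPathJoin uses filepath.Join, which inserts a single separator
// between elements and cleans the result.
// (Hypothetical helper, for illustration only.)
func vmPathJoin(base, id string) string {
	return filepath.Join(base, id)
}

func main() {
	// Assumed example values, not real kata-containers paths.
	base, id := "/run/vc/vm", "sandbox-1"

	fmt.Println(vmPathConcat(base, id)) // separator is missing
	fmt.Println(vmPathJoin(base, id))   // separator is inserted
}
```

Because `filepath.Join` also cleans its result, a trailing slash on the base directory does not produce a doubled separator.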


58 Commits