container's rootfs is a string type, which cannot represent a
block storage backed rootfs which hasn't been mounted.
Change it to a mount alike struct as below:
RootFs struct {
// Source specify the BlockDevice path
Source string
// Target specify where the rootfs is mounted if it has been mounted
Target string
// Type specifies the type of filesystem to mount.
Type string
// Options specifies zero or more fstab style mount options.
Options []string
// Mounted specifies whether the rootfs has be mounted or not
Mounted bool
}
If the container's rootfs has been mounted as before, then this struct can be
initialized as: RootFs{Target: <rootfs>, Mounted: true} to be compatible with
previous case.
Fixes:#1158
Signed-off-by: lifupan <lifupan@gmail.com>
gometalinter is deprecated and will be archived April '19. The
suggestion is to switch to golangci-lint which is apparently 5x faster
than gometalinter.
Partially Fixes: #1377
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
After code check and test, found VMCache can work with vsock.
Remove the code that prohibit them from working together.
Fixes: #1400
Signed-off-by: Hui Zhu <teawater@hyper.sh>
systemd-random-seed service fails if the rootfs is a read-only fs.
systemd-random-seed restores the random seed of the system at early
boot and saves it at shutdown, since kata containers are one boot machines
this service is not needed.
Signed-off-by: Julio Montes <julio.montes@intel.com>
We were considering all empty-dir k8s volumes as backed by tmpfs.
However they can be backed by a host directory as well.
Pass those as 9p volumes, while tmpfs volumes are handled as before,
namely creating a tmpfs directory inside the guest.
The only way to detect "Memory" empty-dirs is to actually check if the
volume is mounted as a tmpfs mount, since any information of k8s
"medium" is lost at the OCI layer.
Fixes#1341
Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
Fixes#1226
Add new flag "experimental" for supporting underworking features.
Some features are under developing which are not ready for release,
there're also some features which will break compatibility which is not
suitable to be merged into a kata minor release(x version in x.y.z)
For getting these features above merged earlier for more testing, we can
mark them as "experimental" features, and move them to formal features
when they are ready.
Signed-off-by: Wei Zhang <zhangwei555@huawei.com>
Reduce memory footprint ~7% by disabling some systemd services like
systemd-journald and systemd-udevd, those services are just consuming memory
and are not needed. For example kata-agent logs the errors through the proxy.
fixes#1339
Signed-off-by: Julio Montes <julio.montes@intel.com>
VMCache is a new function that creates VMs as caches before using it.
It helps speed up new container creation.
The function consists of a server and some clients communicating
through Unix socket. The protocol is gRPC in protocols/cache/cache.proto.
The VMCache server will create some VMs and cache them by factory cache.
It will convert the VM to gRPC format and transport it when gets
requestion from clients.
Factory grpccache is the VMCache client. It will request gRPC format
VM and convert it back to a VM. If VMCache function is enabled,
kata-runtime will request VM from factory grpccache when it creates
a new sandbox.
VMCache has two options.
vm_cache_number specifies the number of caches of VMCache:
unspecified or == 0 --> VMCache is disabled
> 0 --> will be set to the specified number
vm_cache_endpoint specifies the address of the Unix socket.
This commit just includes the core and the client of VMCache.
Currently, VM cache still cannot work with VM templating and vsock.
And just support qemu.
Fixes: #52
Signed-off-by: Hui Zhu <teawater@hyper.sh>
Check the "builtIn" first when updating the shim/proxy/agent,
thus can avoid checking the shim/proxy's binary files path which
is needless for "builtIn" type.
Fixes: #1314
Signed-off-by: fupan <lifupan@gmail.com>
If only initrd or rootfs image is installed,
allow to start Kata Containers without erroring
out.
Fixes: #1174
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
Function SetKernelParams is just to update the runtimeConfig according to itself.
It just around the configuration.
So this patch moves it to updateRuntimeConfig.
Fixes: #1106
Signed-off-by: Hui Zhu <teawater@hyper.sh>
Pass Seccomp profile to the agent only if
the configuration.toml allows it to be passed
and the agent/image is seccomp capable.
Fixes: #688
Signed-off-by: Nitesh Konkar niteshkonkar@in.ibm.com
This value will be plused to max memory of hypervisor.
It is the memory address space for the NVDIMM devie.
If set block storage driver (block_device_driver) to "nvdimm",
should set memory_offset to the size of block device.
Signed-off-by: Hui Zhu <teawater@hyper.sh>
Set block_device_driver to "nvdimm" will make the hypervisor use
the block device as NVDIMM disk.
Fixes: #1032
Signed-off-by: Hui Zhu <teawater@hyper.sh>
Start adding support for virtio-mmio devices starting with block.
The devices show within the vm as vda, vdb,... based on order of
insertion and such within the VM resemble virtio-blk devices.
They need to be explicitly differentiated to ensure that the
agent logic within the VM can discover and mount them appropropriately.
The agent uses PCI location to discover them for virtio-blk.
For virtio-mmio we need to use the predicted device name for now.
Note: Kata used a disk for the VM rootfs in the case of Firecracker.
(Instead of initrd or virtual-nvdimm). The Kata code today does not
handle this case properly.
For now as Firecracker is the only Hypervisor in Kata that
uses virtio-mmio directly offset the drive index to comprehend
this.
Longer term we should track if the rootfs is setup as a block
device explicitly.
Fixes: #1046
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
Signed-off-by: Manohar Castelino <manohar.r.castelino@intel.com>
By breaking down updateRuntimeConfig() into smaller functions, this
commit prevents the function to grow a Go complexity higher than 15.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>
In order to let the user choose firecracker hypervisor instead of
QEMU (from the configuration.toml), let's add it to the list of
supported hypervisors.
Fixes#1042
Depends-on: github.com/kata-containers/runtime#1044
Signed-off-by: Eric Ernst <eric.ernst@intel.com>
Vsock conflicts with factory, when both of them are enabled,
kata will try to create a new vm template which is useless,
thus it's better to return an error directly to let users know
that those two config cannot be enabled at the same time.
Fixes: #1055
Signed-off-by: fupan <lifupan@gmail.com>
Add block_device_cache_set, block_device_cache_direct and
block_device_cache_noflush.
They are cache-related options for block devices that are described in
https://github.com/qemu/qemu/blob/master/qapi/block-core.json.
block_device_cache_direct denotes whether use of O_DIRECT (bypass the host
page cache) is enabled. block_device_cache_noflush denotes whether flush
requests for the device are ignored.
The json said they are supported since 2.9.
So add block_device_cache_set to control the cache options set to block
devices or not. It will help to support the old version qemu.
Fixes: #956
Signed-off-by: Hui Zhu <teawater@hyper.sh>
If VM factory templating is enabled (`enable_template=true`), error if
the configured image is not an `initrd=` one.
Also add a note to the config file explaining that a normal image cannot
be used - only initrd images are supported.
Fixes#948.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Moved the checking routines in `LoadConfiguration()` to a new
`checkConfig()` function for clarity.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Move the VSOCK handling code higher up so that all the checking code is
gathered together at the end of `LoadConfiguration()`.
Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
Refactor the config related codes into a separated
package which can be shared with other cli programs
such as kata's shimv2.
Fixes: #787Fixes: #714
Signed-off-by: fupan <lifupan@gmail.com>
In order to reuse the same scheme across several components of the
runtime repository, we need to factorize the code handling signalling
through a common package.
The immediate use case will be to use this package from both the CLI
and the network monitor.
Signed-off-by: Sebastien Boeuf <sebastien.boeuf@intel.com>