Commit Graph

849 Commits

Author SHA1 Message Date
Ace-Tang
dc38ba77bd test: fix cgroup mock test
fix cgroup mock test because of containerd/cgroup vendor update

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-08-19 18:15:06 +08:00
Ace-Tang
6534357925 shim-v2: add network stat in metric
improve metric message, add network stat, base on agent PR: #538 and
containerd/cgroup PR #81

Fixes: #1976

Signed-off-by: ZeroMagic <anthonyliu@zju.edu.cn>
Signed-off-by: Ace-Tang <aceapril@126.com>
2019-08-19 18:15:06 +08:00
Peng Tao
e7457e6248 qemu: add logfile when debug is on
So that we can check qemu log to see if something goes wrong.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-16 12:58:25 +00:00
Peng Tao
aebc49692b qemu: fix memory prealloc option handling
Memory preallocation is just a property that hugepage, file backed
memory and memory-backend-ram can each choose to configure.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-16 12:58:25 +00:00
Peng Tao
6c77d76f24 qemu: check guest status with qmp query-status
When guest panics or stops with unexpected internal
error, qemu process might still be running but we can
find out such situation with qmp. Then monitor can still
report such failures to watchers.

Fixes: #1963
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-16 12:58:25 +00:00
Peng Tao
b3987e4786 Merge pull request #1933 from lifupan/noproxywatchconsole
add watchconsole for no_proxy type
2019-08-16 11:06:02 +08:00
Julio Montes
de4582eda3 Merge pull request #1959 from bergwolf/stopvm
qemu: do not try to stop qemu multiple times
2019-08-15 08:50:17 -05:00
Julio Montes
0bf48dca65 Merge pull request #1969 from bergwolf/detach
do not hotplug network device when stopping sandbox
2019-08-15 08:46:06 -05:00
Peng Tao
d90eba8593 network: always cold unplug network devices
We don't really need to unplug it from guest because we have
already stopped it. Just detach it and clean it up.

Fixes: #1968
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-15 00:02:52 -07:00
Peng Tao
d26ff71201 Revert: "sandbox: remove network before stopping vm"
This reverts commit 794e08e243.

It breaks vfio device passthru as we need to bind the device
back to host when removing the endpoint. And that is not possible
when qemu is still running (thus holding reference to the device).

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-15 00:02:44 -07:00
Eric Ernst
a5c7e6b934 Merge pull request #1962 from bergwolf/grpc-timeout
agent: add default timeout for grpc requests
2019-08-14 21:04:20 -07:00
Fupan Li
99e04ac8cd Merge pull request #1961 from bergwolf/pause-ready
container: do not pause a StateReady container
2019-08-14 08:54:59 +08:00
Eric Ernst
263f64829d Merge pull request #1957 from bergwolf/network-removal
sandbox: remove network before stopping vm
2019-08-13 09:32:21 -07:00
Julio Montes
5e631391bf Merge pull request #1942 from woshijpf/fix-hotplug-exceed-problem
virtcontainers: fix hotplug block/net devices execeed pciBridgeMaxCap…
2019-08-13 08:45:24 -05:00
Peng Tao
debc7d93ad agent: add default timeout for grpc requests
If guest is malfunctioning, we need a way to bail out. Add
a default timeout for most of the grpc requests so that the
runtime does not wait indefinitely.

Fixes: #1952
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-13 01:22:05 -07:00
Peng Tao
9d4050e0b1 container: do not pause a StateReady container
We can only pause a running container.

Fixes: #1960
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-13 01:19:36 -07:00
Peng Tao
b58ab66f05 qemu: do not try to stop qemu multiple times
We've cleaned it up the first time. Future stop will
only fail.

Fixes: #1958
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-13 01:13:06 -07:00
Peng Tao
794e08e243 sandbox: remove network before stopping vm
We might need to call hypervisor hotunplug to really remove
a network device. We cannot do it after stopping the VM.

Fixes: #1956
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-08-13 01:04:07 -07:00
lifupan
31ddb4d452 virtcontainers: add watchconsole for no_proxy type
For no proxy type, we also need the feature
of watch hypervisor's console to help debug.

Fixes:#1932

Signed-off-by: lifupan <lifupan@gmail.com>
2019-08-13 09:09:23 +08:00
Archana Shinde
3fc17e96fc vsock: Propogate error for vsock ioctl
Make error handling better by propogating error.

Fixes #1953

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2019-08-12 12:13:52 -07:00
Eric Ernst
cfedb06a19 Merge pull request #1936 from amshinde/ignore-routes-with-kernel-proto
network: Ignore routes with proto as "kernel"
2019-08-12 07:08:34 -07:00
Archana Shinde
565f14f685 acrn: Change the default network model for ACRN to macvtap
Drop the bits for bridged networking in ACRN and change the default
to macvtap. We should eventually change this to tcfilter with additional
testing.

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2019-08-09 13:01:54 -07:00
jiangpengfei
e467293a3e virtcontainers: fix hotplug pci devices execeed max capacity bug
add rollback operations when hotplug block/net devices execeed pciBridgeMaxCapacity

Fixes: #1941

Signed-off-by: jiangpengfei <jiangpengfei9@huawei.com>
2019-08-09 12:31:46 -04:00
Julio Montes
14474a49a2 Merge pull request #1921 from Ace-Tang/fix-remove-network
network: fix failed to remove network
2019-08-07 14:06:52 -05:00
GabyCT
a3eb19ca9b Merge pull request #1926 from devimc/topic/virtcontainers/loadKernelModules
virtcontainers: add support for loading kernel modules
2019-08-07 11:01:43 -05:00
Archana Shinde
df7cf77a08 network: Ignore routes with proto as "kernel"
Routes with proto "kernel" are routes that are automatically added
by the kernel.
It is a route added automatically when you assign an address to an
interface which is not /32.
With this commit, these routes are ignored. The guest kernel
would add these routes on the guest side. A corresponding commit on the
agent side would no longer delete these routes while updating them.

Without this commit, netlink gives an error complaining that a route
already exists when you try to add a route with the same dest subnet.

Something like:
dest: 192.168.1.0/24 device:net1 source:192.168.1.217 scope:253
dest: 192.168.1.0/24 device:net2 source:192.168.1.218 scope:253

Depends-on: github.com/kata-containers/agent#624

Fixes: #1811

Signed-off-by: Archana Shinde <archana.m.shinde@intel.com>
2019-08-06 21:39:11 +00:00
Julio Montes
355b9c003d virtcontainers: add support for loading kernel modules
The list of kernel modules can be passed to the runtime through the
configuration file or using OCI annotations. In both cases, a list paramentes
can be specified for each module.

fixes #1925

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-08-06 20:55:49 +00:00
Eryu Guan
a9168a3fc9 virtiofs: wait for virtiofsd process to release its resources
We start virtiofsd in foreground (-f option), so we should wait for it
to reclaim its resources to avoid zombie process when qemu or virtiofsd
got killed unexpectedly.

Fixes: #1934
Signed-off-by: Eryu Guan <eguan@linux.alibaba.com>
2019-08-06 14:55:22 +08:00
Ganesh Maharaj Mahalingam
6e1e6a2297 virtiofs: fix virtiofs crash when cache=none
When virtio_fs_cache is set to none, the mount options for the folder
inside the guest should not contain the dax option else it leads to
invalid address errors and a crash of the daemon on the host.

Fixes: #1907
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
2019-08-01 13:26:34 -07:00
Ace-Tang
50c3e56aeb network: fix failed to remove network
in create sandbox, if process error, should remove network without judge
NetNsCreated is true, since network is created by kata and should be
removed by kata, and network.Remove has judged if need to delete netns
depend on NetNsCreated

Fixes: #1920

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-07-30 20:17:09 +08:00
Julio Montes
7668aeb526 virtcontainers: support SMP die
CPU topology has changed in QEMU 4.1: socket > die > core > thread.
die option must be specified in order to hotplug CPUs on x86_64

Depends-on: github.com/kata-containers/packaging#657

fixes #1913

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-07-26 21:18:24 +00:00
Jose Carlos Venegas Munoz
4bd3ea848d Merge pull request #1895 from Ace-Tang/pass-vendor-to-qmp
qemu: support vfio pass x-pci-vendor-id and x-pci-device-id pass
2019-07-26 11:39:39 -05:00
Fupan Li
943136e18b Merge pull request #1899 from bergwolf/ut
Fix UT failures with non-root
2019-07-25 11:46:08 +08:00
Wei Zhang
f3d0978c3f persist: improve readability
Address some comments for code readability, also add some unit tests.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-07-23 17:10:00 +08:00
Wei Zhang
3bfbbd666d persist: merge "network.json"
Merge "network.json" into "persist.json" so that new store can manage
network part.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-07-23 17:10:00 +08:00
Wei Zhang
99cf3f80d7 persist: merge "agent.json"
Manage "agent.json" with new store.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-07-23 17:10:00 +08:00
Wei Zhang
7d5e48f1b5 persist: manage "hypervisor.json" with new store
Fixes #803

Merge "hypervisor.json" into "persist.json", so the new store can take
care of hypervisor data now.

Signed-off-by: Wei Zhang <weizhang555.zw@gmail.com>
2019-07-23 17:09:11 +08:00
Peng Tao
d5d7d82eeb vc: move container mount cleanup to container.go
For one thing, it is container specific resource so it should not
be cleaned up by the agent. For another thing, we can make container
stop to force cleanup these host mountpoints regardless of hypervisor
and agent liveness.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Peng Tao
262484de68 monitor: watch hypervisor
When hypervisor process is dead, notify watchers and mark agent dead.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Peng Tao
67c401c059 agent: use hypervisor pid as backup proxy pid for non-kata proxy cases
Then we can check hypervisor liveness in those cases to avoid long
timeout when connecting to the agent when hypervisor is dead.

For kata-agent, we still use the kata-proxy pid for the same purpose.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Peng Tao
835b6e9e1b sandbox: do not fail SIGKILL
Once we have found the container, we should never fail SIGKILL.
It is possible to fail to send SIGKILL because hypervisor might
be gone already. If we fail SIGKILL, upper layer cannot really
proceed to clean things up.

Also there is no need to save sandbox here as we did not change
any state.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Peng Tao
bc4460e12f sandbox: support force stop
When force is true, ignore any guest related errors. This can
be used to stop a sandbox when hypervisor process is dead.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:29:32 -07:00
Peng Tao
4130913ed7 agent: mark agent dead when failing to connect
Whenever we fail to connect, do not make any more attempts.
More attempts are possible during cleanup phase but we should
not try to connect any more there.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:27:52 -07:00
Peng Tao
c472a01006 container: allow to stop a paused container
When a container is paused and something goes terribly
wrong, we still need to be able to clean thing up. A paused
container should be able to transit to stopped state as well
so that we can delete it properly.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:27:52 -07:00
Peng Tao
f886c0bf35 vc: drop container SetPid API
It is not used by anyone.

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-22 19:27:52 -07:00
Peng Tao
3063391334 ut: skip TestBindUnmountContainerRootfsENOENTNotError for non-root
mount syscall requires root.

Fixes: #1898
Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-19 08:44:51 -07:00
Peng Tao
c4583f4486 ut: skip TestStartNetworkMonitor for non-root
It requires root to manipulate netns and otherwise fails
like below:

=== RUN   TestStartNetworkMonitor
--- FAIL: TestStartNetworkMonitor (0.00s)
        Error Trace:    sandbox_test.go:1481
        Error:          Expected nil, but got: &errors.errorString{s:"Error switching to ns /proc/6648/task/6651/ns/net: operation not permitted"}

Signed-off-by: Peng Tao <bergwolf@hyper.sh>
2019-07-19 08:36:43 -07:00
Julio Montes
f2423e7d7c virtcontainers: convert virtcontainers tests to testify/assert
Convert virtcontainers tests to testify/assert to make the virtcontainers
tests more readable.

fixes #156

Signed-off-by: Julio Montes <julio.montes@intel.com>
2019-07-19 15:28:45 +00:00
Ace-Tang
50e263d943 qemu: support vfio pass x-pci-vendor-id and x-pci-device-id pass
since some vendor id like 1ded can not be identified by virtio-pci
driver, so need to pass a specified vendor id to qemu.

Fixes: #1894

Signed-off-by: Ace-Tang <aceapril@126.com>
2019-07-19 21:04:22 +08:00
Peng Tao
d987a30367 Merge pull request #1799 from bergwolf/template
qemu: use x-ignore-shared to implement vm template
2019-07-18 10:38:59 +08:00