mirror of https://github.com/aljazceru/kata-containers.git
synced 2025-12-18 23:04:20 +01:00
monitor: Fix monitor race condition doing hypervisor.check()
The monitor thread checks every second, in a blocking thread, whether the agent and the VMM are alive. The Cloud Hypervisor API server is single-threaded, so if the monitor calls `check()` while a slow request is still in progress, the monitor's `check()` call times out. The monitor thread then stops all shim-v2 execution.

This commit changes the monitor thread to check the status of the hypervisor every 5 seconds. Additionally, the cloud-hypervisor `check()` method now uses `clh.isClhRunning(timeout)` with a 10-second timeout.

The monitor function applies no timeout of its own, so even if `hypervisor.check()` takes more than 10 seconds, the `isClhRunning` method handles errors from the VmmPing request and retries until the timeout is reached.

Raising the interval between checks from 1 to 5 seconds should not affect any functionality, and it reduces the overhead of polling the hypervisor.

Fixes: #2777

Signed-off-by: Carlos Venegas <jose.carlos.venegas.munoz@intel.com>
committed by Carlos Venegas
parent 3d0fe433c6
commit 55412044df
@@ -14,7 +14,7 @@ import (
 )
 
 const (
-	defaultCheckInterval = 1 * time.Second
+	defaultCheckInterval = 5 * time.Second
 	watcherChannelSize   = 128
 )
 