shutdown: Don't sever console watcher too early

Fixed logic used to handle static agent tracing.

For a standard (untraced) hypervisor shutdown, the runtime kills the VM process once the workload has finished. But if static agent tracing is enabled, the agent running inside the VM is responsible for the shutdown.

The existing code handled this scenario but did not wait for the hypervisor process to end. As a result, the console watcher thread was killed too early.

Although not a problem for an untraced system, if static agent tracing was enabled, the logs from the hypervisor would be truncated, missing the crucial final stages of the agent's shutdown sequence.

The fix necessitated adding a new parameter to the `stopSandbox()` API which, if true, requests that the runtime hypervisor logic simply wait for the hypervisor process to exit rather than killing it.

Fixes: #1696.

Signed-off-by: James O. D. Hunt <james.o.hunt@intel.com>
2025-12-22 16:54:25 +01:00 · 2021-04-15 12:01:19 +01:00
parent 51ab870091
commit 9256e590dc
9 changed files with 51 additions and 28 deletions
--- a/src/runtime/virtcontainers/sandbox.go
+++ b/src/runtime/virtcontainers/sandbox.go
@@ -1027,7 +1027,7 @@ func (s *Sandbox) startVM(ctx context.Context) (err error) {

 	defer func() {
 		if err != nil {
-			s.hypervisor.stopSandbox(ctx)
+			s.hypervisor.stopSandbox(ctx, false)
 		}
 	}()

@@ -1081,14 +1081,9 @@ func (s *Sandbox) stopVM(ctx context.Context) error {
 		s.Logger().WithError(err).WithField("sandboxid", s.id).Warning("Agent did not stop sandbox")
 	}

-	if s.disableVMShutdown {
-		// Do not kill the VM - allow the agent to shut it down
-		// (only used to support static agent tracing).
-		return nil
-	}
-
 	s.Logger().Info("Stopping VM")
-	return s.hypervisor.stopSandbox(ctx)
+
+	return s.hypervisor.stopSandbox(ctx, s.disableVMShutdown)
 }

 func (s *Sandbox) addContainer(c *Container) error {