Drivers: hv: vmbus: hibernation: do not hang forever in vmbus_bus_resume()
author Dexuan Cui <decui@microsoft.com>
Sat, 5 Sep 2020 02:55:55 +0000 (19:55 -0700)
committer Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Wed, 23 Sep 2020 10:59:55 +0000 (12:59 +0200)
commit 5037cc2307239e2df3f2232b5a9efccdbb555a04
tree 742f8f83781fac2ac5c1771df40b0a75f997bd09
parent 5fc19caaf6b759fc80e5fec6ae56f4bb25a1cae0

[ Upstream commit 19873eec7e13fda140a0ebc75d6664e57c00bfb1 ]

When a VM that uses Accelerated Networking (NIC SR-IOV) is stopped and
later started, the Instance GUID of the VF vmbus device can currently
change. As a result, after vmbus_bus_resume() -> vmbus_request_offers(),
vmbus_onoffer() cannot find the original vmbus channel of the VF, so
check_ready_for_resume_event() never calls complete() on
vmbus_connection.ready_for_resume_event, and the VM hangs in
vmbus_bus_resume() forever.

Fix the issue by adding a timeout, so that resuming can still succeed and
the saved state is not lost. According to my test, the user can then
disable Accelerated Networking and SSH into the VM for further recovery.
Also prevent the VM in question from suspending again.
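
The two parts of the fix described above (a bounded wait in the resume
path, and a refusal to suspend again after an incomplete resume) can be
illustrated with a small user-space sketch. This is not the patch itself:
struct bus_sketch, pending_fixups, and every function below are
hypothetical names made up for illustration, the timeout value is
arbitrary, and the wait is a simple polling loop for brevity. In the
kernel, a bounded wait on a struct completion is available through
wait_for_completion_timeout(); the exact call and timeout used by the
patch are not stated in this message.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the bus state; not the kernel's structures. */
struct bus_sketch {
    atomic_bool ready_for_resume_event; /* set once all channels are back */
    int pending_fixups;                 /* channels still missing after resume */
};

static long long now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

static void bus_sketch_init(struct bus_sketch *bus, int pending_fixups)
{
    bus->pending_fixups = pending_fixups;
    /* Nothing to wait for if no channel needs fixing up. */
    atomic_init(&bus->ready_for_resume_event, pending_fixups == 0);
}

/* Called when a re-offered channel matches one saved before hibernation. */
static void bus_sketch_channel_found(struct bus_sketch *bus)
{
    if (bus->pending_fixups > 0 && --bus->pending_fixups == 0)
        atomic_store(&bus->ready_for_resume_event, true);
}

/* Bounded wait: returns true if the event fired, false on timeout. */
static bool wait_ready_timeout(struct bus_sketch *bus, long timeout_ms)
{
    const struct timespec nap = { .tv_sec = 0, .tv_nsec = 1000000 }; /* 1 ms */
    long long deadline = now_ms() + timeout_ms;

    while (!atomic_load(&bus->ready_for_resume_event)) {
        if (now_ms() >= deadline)
            return false;
        nanosleep(&nap, NULL);
    }
    return true;
}

/* Resume continues even if a device never reappears; it only logs. */
static int bus_sketch_resume(struct bus_sketch *bus, long timeout_ms)
{
    if (!wait_ready_timeout(bus, timeout_ms))
        fprintf(stderr, "sketch: a device is still missing after resume\n");
    return 0;
}

/* Refuse to suspend again while a previous resume is still incomplete. */
static int bus_sketch_suspend(const struct bus_sketch *bus)
{
    return bus->pending_fixups != 0 ? -EBUSY : 0;
}
```

In this sketch, a bus initialised with one pending fix-up that never
arrives still returns 0 from bus_sketch_resume() once the timeout expires,
and a later bus_sketch_suspend() returns -EBUSY. If
bus_sketch_channel_found() is called first, resume returns immediately and
suspend is allowed again.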

The host will be fixed so that, in the future, the Instance GUID stays the
same across hibernation.

Fixes: d8bd2d442bb2 ("Drivers: hv: vmbus: Resume after fixing up old primary channels")
Signed-off-by: Dexuan Cui <decui@microsoft.com>
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Link: https://lore.kernel.org/r/20200905025555.45614-1-decui@microsoft.com
Signed-off-by: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
drivers/hv/vmbus_drv.c