Comments (14)
I am trying to repro the issue:
PS C:\Users\Administrator> docker run jenkins/agent:jdk17-windowsservercore-ltsc2022 powershell.exe -c echo hello
hello
PS C:\Users\Administrator> get-hotfix
Source Description HotFixID InstalledBy InstalledOn
WIN-35ILNJ... Update KB5022507 5/5/2023 12:00:00 AM
WIN-35ILNJ... Security Update KB5026370 5/5/2023 12:00:00 AM
WIN-35ILNJ... Update KB5026493 5/5/2023 12:00:00 AM
Can you call get-hotfix to see the packages that installed on your system?
Thanks,
Howard.
from windows-containers.
Today I installed the base image that had been tested as working. A fresh image, not the "corporate" image, via a standard deployment.
Then installed each monthly patch, and tested each iteration of the updates.
SW_DVD9_Win_Server_STD_CORE_2022_2108.20_64Bit_English_DC_STD_MLF_X23-42932
C:\Windows\WinSxS\amd64_hyperv-compute-host-service_31bf3856ad364e35_10.0.20348.1668_none_a001fbb8ddff3a5d\vmcompute.exe
C:\Windows\WinSxS\amd64_hyperv-compute-host-service_31bf3856ad364e35_10.0.20348.1726_none_9ff35834de0abdb0\vmcompute.exe
C:\Windows\WinSxS\amd64_hyperv-compute-host-service_31bf3856ad364e35_10.0.20348.1787_none_9ff94328de056f5b\vmcompute.exe
C:\Windows\WinSxS\amd64_hyperv-compute-host-service_31bf3856ad364e35_10.0.20348.1850_none_9fec11d6de0f8be0\vmcompute.exe
C:\Windows\WinSxS\amd64_hyperv-compute-host-service_31bf3856ad364e35_10.0.20348.1906_none_9fdbb7c8de1cc2e4\vmcompute.exe
C:\Windows\WinSxS\amd64_hyperv-compute-host-service_31bf3856ad364e35_10.0.20348.1970_none_9fe3427ede15da7f\vmcompute.exe
I was not able to reproduce the error.
I am continuing to investigate.
I am leaning towards some "scanning" and "security" software as a possible cause, however will update once I have completed more testing.
from windows-containers.
Possibly related:
from windows-containers.
when the server gets deployed (without updates) it works and has the following.
get-hotfix
Source Description HotFixID InstalledBy InstalledOn
------ ----------- -------- ----------- -----------
server001 Update KB5028956 NT AUTHORITY\SYSTEM 9/22/2023 12:00:00 AM
server001 Security Update KB5012170 NT AUTHORITY\SYSTEM 9/21/2023 12:00:00 AM
server001 Security Update KB5029250 NT AUTHORITY\SYSTEM 9/22/2023 12:00:00 AM
server001 Update KB5029395 NT AUTHORITY\SYSTEM 9/21/2023 12:00:00 AM
#-----------------------------------------
Package Identity : Package_for_DotNetRollup31bf3856ad364e35amd64~~10.0.4614.6
State : Installed
Release Type : Update
Install Time : 4/5/2023 12:21 AM
Package Identity : Package_for_RollupFix31bf3856ad364e35amd64~~20348.1668.1.8
State : Installed
Release Type : Security Update
Install Time : 4/5/2023 12:33 AM
Package Identity : Package_for_ServicingStack_166331bf3856ad364e35amd64~~20348.1663.1.1
State : Installed
Release Type : Security Update
Install Time : 4/5/2023 12:20 AM
#-----------------------------------------
Which is from a server 2022 base image: Win_Server_2022_2108.20
#-----------------------------------------
Then as part of the deployment process when we install the latest patch it breaks.
Get-HotFix
Source Description HotFixID InstalledBy InstalledOn
------ ----------- -------- ----------- -----------
server002 Update KB5029928 NT AUTHORITY\SYSTEM 9/28/2023 12:00:00 AM
server002 Security Update KB5012170 NT AUTHORITY\SYSTEM 9/28/2023 12:00:00 AM
server002 Security Update KB5030216 NT AUTHORITY\SYSTEM 9/28/2023 12:00:00 AM
server002 Security Update KB5025314 4/5/2023 12:00:00 AM
server002 Update KB5030369 NT AUTHORITY\SYSTEM 9/28/2023 12:00:00 AM
encountered an error during hcs::System::Start:context deadline exceeded.
I have tested on Server Core and with GUI and same result.
Based on the other thread it likely wasn't this latest roll up or patch that had the breaking change... however I haven't been able to test on different patch versions. Also when I uninstalled this update the server was still broken, I noticed vmcompute.exe was an even earlier version in this case (after uninstall).
I am using the exact same process to deploy all of the servers so they are identical other than the working servers did not have the updates applied at build time.
Then I noticed the vmcomute.exe did have some changes between the versions.
we should test with this since it's a smaller image and straight from mcr... not that it makes that much difference.
docker run mcr.microsoft.com/windows/nanoserver:ltsc2022 ping localhost -r 5
also docker pull
works fine, it's just run that times out.
This is the patch to install top test a repro...
Security patch
September 12, 2023—KB5030216 (OS Build 20348.1970) - Microsoft Support
https://support.microsoft.com/en-us/topic/september-12-2023-kb5030216-os-build-20348-1970-34d4aff3-fd05-4270-b288-4ab6379c7f81
from windows-containers.
@Howard-Haiyang-Hao also thank you for the follow up.
If there are some lower level tests or logs that we can look into, please let us know.
from windows-containers.
Adding below log from the Application logs:
6:25:02PM - Warning
Syscall did not complete within operation timeout. This may indicate a platform issue. If it appears to be making no forward progress, obtain the stacks and see if there is a syscall stuck in the platform API for a significant length of time. [traceID=bb85376784de4797faa7730c7d12ce40 spanID=323e301be76b14aa timeout=4m0s]
6:25:02PM - Error
failed to start container [error=container 6ac12218a7dfcf3e6507274fc678b46fd8f30f2224dcbc0bb27c05c58b543641 encountered an error during hcs::System::Start: context deadline
from windows-containers.
Hi @Howard-Haiyang-Hao were you able to get the update installed? Any luck with the repro?
from windows-containers.
Adding some further information.
- Expanded testing from VMware host machines to Hyper-V host machines. No change in experience.
docker run --isolation=process mcr.microsoft.com/windows/nanoserver:ltsc2022 ping localhost -r 5
ltsc2022: Pulling from windows/nanoserver
452f3ad1d32c: Pulling fs layer
452f3ad1d32c: Verifying Checksum
452f3ad1d32c: Download complete
452f3ad1d32c: Pull complete
Digest: sha256:6562c9a2580260e4f2e3a081cc2cf1d960899cbce7c4568fb851e4848ca50e07
Status: Downloaded newer image for mcr.microsoft.com/windows/nanoserver:ltsc2022
docker: Error response from daemon: container 62f69d70bdfc7a8bcb86e4d824221cb3ed0381e48fb2f6b2aebefab59a12a7b8 encountered
an error during hcs::System::Start: context deadline exceeded.
Event logs
warning
Syscall did not complete within operation timeout. This may indicate a platform issue. If it appears to be making no forward progress, obtain the stacks and see if there is a syscall stuck in the platform API for a significant length of time. [spanID=be1a8d4cdaa4cc79 traceID=3c3d99e94347624bac4b3aa9b63f8d8b timeout=4m0s]
errors
failed to start container [module=libcontainerd namespace=moby container=62f69d70bdfc7a8bcb86e4d824221cb3ed0381e48fb2f6b2aebefab59a12a7b8 error=container 62f69d70bdfc7a8bcb86e4d824221cb3ed0381e48fb2f6b2aebefab59a12a7b8 encountered an error during hcs::System::Start: context deadline exceeded]
failed to cleanup after a failed Start [module=libcontainerd namespace=moby container=62f69d70bdfc7a8bcb86e4d824221cb3ed0381e48fb2f6b2aebefab59a12a7b8 error=container 62f69d70bdfc7a8bcb86e4d824221cb3ed0381e48fb2f6b2aebefab59a12a7b8 encountered an error during hcs::System::Terminate: context deadline exceeded]
Handler for POST /v1.43/containers/62f69d70bdfc7a8bcb86e4d824221cb3ed0381e48fb2f6b2aebefab59a12a7b8/start returned error: container 62f69d70bdfc7a8bcb86e4d824221cb3ed0381e48fb2f6b2aebefab59a12a7b8 encountered an error during hcs::System::Start: context deadline exceeded
OSinfo
WindowsBuildLabEx : 20348.1.amd64fre.fe_release.210507-1500
WindowsCurrentVersion : 6.3
WindowsEditionId : ServerStandard
WindowsInstallationType : Server
WindowsInstallDateFromRegistry : 10/2/2023 3:38:09 PM
WindowsProductId : 00454-10000-00001-AA777
WindowsProductName : Windows Server 2022 Standard
OsName : Microsoft Windows Server 2022 Standard
OsType : WINNT
OsOperatingSystemSKU : StandardServerEdition
OsVersion : 10.0.20348
get-hotfix | ft -AutoSize
Source Description HotFixID InstalledBy InstalledOn
------ ----------- -------- ----------- -----------
BWILKINSON-VM1 Update KB5029928 NT AUTHORITY\SYSTEM 10/2/2023 12:00:00 AM
BWILKINSON-VM1 Security Update KB5012170 NT AUTHORITY\SYSTEM 10/2/2023 12:00:00 AM
BWILKINSON-VM1 Security Update KB5030216 NT AUTHORITY\SYSTEM 10/2/2023 12:00:00 AM
BWILKINSON-VM1 Security Update KB5025314 4/5/2023 12:00:00 AM
BWILKINSON-VM1 Update KB5030369 NT AUTHORITY\SYSTEM 10/2/2023 12:00:00 AM
from windows-containers.
I am out of cycle for deploying more nodes right now, however I expect to continue doing so in the next few days.
Once I have more data, I will provide a further update.
from windows-containers.
I will close this for now.
At this time, I don't have a root cause... however I will come back later and add notes in case it's useful for others.
I do have a way to make the vmcompute start working again... however I am just not definitive on exactly why or which single things breaks the service.
1 time restarting vmcompute (and then rebooting) brought the containers online
1 time uninstalling 2 enterprise monitoring apps brought the containers online
1 time uninstalling 1 enterprise monitoring app brought the containers online (also I have reproduced this several times).
I can only assume that the vmcompute is sensitive to something, however I am unsure what exactly OR why uninstalling the apps allowed the containers to start as normal. Also reinstalling the same combination of apps did not break the container services.
So I have a workaround, however not any sure solution on what exactly the issue is or was.
I will provide an update in the future.
from windows-containers.
In case it's helpful for others...
In our environment I have been able to reproduce the issue many times.
-
The container timeout error occurs.
- Syscall did not complete within operation timeout. This may indicate a platform issue. If it appears to be making no forward progress, obtain the stacks and see if there is a syscall stuck in the platform API for a significant length of time. [timeout=4m0s traceID=6b59dd25c46e317d63303f745be2ed33 spanID=540b21032cc68875]
- failed to cleanup after a failed Start [namespace=moby container=1b4b560791f9c2fcf36aeafb6251c6bc883b4b1a788f9178fba4d87e5b186324 error=container 1b4b560791f9c2fcf36aeafb6251c6bc883b4b1a788f9178fba4d87e5b186324 encountered an error during hcs::System::Terminate: context deadline exceeded module=libcontainerd]
- Handler for POST /v1.43/containers/1b4b560791f9c2fcf36aeafb6251c6bc883b4b1a788f9178fba4d87e5b186324/start returned error: container 1b4b560791f9c2fcf36aeafb6251c6bc883b4b1a788f9178fba4d87e5b186324 encountered an error during hcs::System::Start: context deadline exceeded
-
Stop the "SplunkForwarder" Service.
-
The container starts successfully.
-
Start the "SplunkForwarder" the containers continue to work.
I have not seen any regression on this afterwards.
I have a Guess that it maybe related to the network setup for the "Microsoft-Windows-Hyper-V-VmSwitch" etc.
- However this is only based on reviewing logs in the Application and System Events between stopping/starting the Splunk service as well as running docker.
If I have more time I will continue to capture these states as I deploy more new resources in the future.
from windows-containers.
@Howard-Haiyang-Hao @akarshm just a ping for this issue, which I closed, however is worth follow up in:
from windows-containers.
@brwilkinson ,
stopping the splunk forwarding svc, restarting docker service, and starting the splunk fwd service seems to have work for me. Thanks a million for that update.
Any further issues? Reoccurrences?
from windows-containers.
This is being tracked over on
from windows-containers.
Related Issues (20)
- L2Bridge not working correctly on Windows server 2022 - 21H2 HOT 10
- Support Window 10 base image for Windows Service 2022 HOT 1
- Not able to download windows container images HOT 3
- WIN API LoadLibraryA is failling when loading a DLL created by Intel Fortran compiler in Windows Nanoserver container HOT 6
- Loading Minifilters in HyperV containers HOT 3
- Docker in Docker with Windows Containers HOT 2
- pull image from PS command on Docker Desktop Window 10Ent HOT 13
- Plans for Windows 10/11 Containers in the Future HOT 4
- Issues Installing Oracle Instant Client on Windows Container: Network Sharing Problem HOT 2
- `ERROR_ACCESS_DENIED` when trying to open for writing _hidden_ files located in mounted folder inside containers HOT 8
- How to deploy dotnet microservice in iis server inside Docker container HOT 4
- Get-WindowsFeature is not available in mcr.microsoft.com/windows:ltsc2019 HOT 7
- Process Isolation is very slow as compared to HyperV Containers on Server 2019 HOT 5
- regsvr32 fails when registering DAO350.DLL HOT 3
- install-docker scripts failing when passing -useDHCP due to deprecated dependency HOT 4
- MSIX installation support HOT 4
- [MicrosoftDocs] Insufficient Documentation for Windows Server 2022 HOT 3
- [MicrosoftDocs] gMSA Documentation HOT 5
- Issue with missing DLL in runtime:4.8-windowsservercore-ltsc2022 HOT 5
- Windows containers shutdown not working properly on docker (no SIGTERM?) HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from windows-containers.