I am seeing intermittent crashes of my Domino V11.0.1 server on Windows/2016 10.0 [64-bit] (Build 9200), PlatID=2, (2 Processors) when the Agent Manager runs a LotusScript agent. I have also seen the same kind of crash on another server in a customer environment.
The agent ran on my server for about a week on a 5-minute schedule before the crash occurred, whereas the customer's server crashed after only a couple of hours.
On the server console we saw:
[1BF4:0002-15E4] Thread=[1BF4:0002-15E4]
[1BF4:0002-15E4] Stack base=0xA9BCE790, Stack size = 20432 bytes
[1BF4:0002-15E4] PANIC: Object handle is invalid
The crash stack in the NSD shows the following:
# thread 1/17: [ nAMgr: 1bf4: 15e4] FATAL THREAD (Panic)
FP=0xB1A9BC7EB8, PC=0x7FFB68F05A84, SP=0xB1A9BC7EB8
stkbase=0xB1A9BD0000, total stksize=86016, used stksize=33096
EAX=0x00000004, EBX=0x00000000, ECX=0x00000b20, EDX=0x00000000
ESI=0x000927c0, EDI=0x00000b20, CS=0x00000033, SS=0x0000002b
DS=0x00000000, ES=0x00000000, FS=0x00000000, GS=0x00000000
Flags=0x1700000246
#
[ 1] 0x7FFB68F05A84 ntdll.ZwWaitForSingleObject+20 (10,0,0,B1A9BC7FD0)
[ 2] 0x7FFB65F04DAF KERNELBASE.WaitForSingleObjectEx+143 (10,B1A9BC8680,7FFB00000000,b20)
@[ 3] 0x7FFB55A8D430 nnotes.OSRunExternalScript+1808 (0,0,424,0)
@[ 4] 0x7FFB55A897FC nnotes.FRTerminateWindowsResources+1532 (0,23164920CC0,0,1)
@[ 5] 0x7FFB55A8B383 nnotes.OSFaultCleanupExt+1395 (0,4fd0,0,B1A9BC9940)
@[ 6] 0x7FFB55A8AE07 nnotes.OSFaultCleanup+23 (4fd0,B1A9BC8E30,0,7FFB567F6668)
@[ 7] 0x7FFB55AF7A76 nnotes.OSNTUnhandledExceptionFilter+390 (B1A9BC9820,7FFB570A2568,B1A9BC9940,FFFFE804495CCD3)
@[ 8] 0x7FFB55A8E06A nnotes.Panic+1066 (30,12585AE001FDDC1,0,2b4)
@[ 9] 0x7FFB55A8D943 nnotes.Halt+35 (23165F43FE8,9,0,0)
@[10] 0x7FFB566F1AD1 nnotes.HANDLEDereference+113 (B1A9BCC980,7FFB4E4DE1F2,23170103018,7FFB4E556D38)
@[11] 0x7FFB5674B956 nnotes.InitDbContextExt+310 (23170103018,0,23170103018,0)
@[12] 0x7FFB56745F64 nnotes.NSFDbUserGetbTrans+36 (67200004,7FFB585E3E80,23164F80018,0)
@[13] 0x7FFB56125640 nnotes.ClientSearchFill+80 (2000141c,7FFB200003C5,23164F80018,7FFB00000000)
@[14] 0x7FFB562E6899 nnotes.QueueFill2+73 (23170102618,2000141c,0,0)
@[15] 0x7FFB562E690B nnotes.QueueGet+27 (23170102618,7FFB585E3E80,23164F80018,23170102618)
@[16] 0x7FFB4E559586 nlsxbe.ANServer::ANSVNextDbFile+214 (0,B1A9BCD8C0,109,0)
@[17] 0x7FFB4E558E0E nlsxbe.ANServer::ANDispatchMethod+270 (B1A9BCD8C0,0,23170102718,7FFB5702A5F7)
@[18] 0x7FFB4E4EB50F nlsxbe.ANCLASSCONTROL+7887 (23164C17358,7FFB00000109,B1A9BCD840,B1A9BCD8C0)
@[19] 0x7FFB56F97F3F nnotes.LSsInstance::AdtCallBack+319 (2317022D6C8,2311399B9C0,1,23164C17358)
@[20] 0x7FFB56FCE9C2 nnotes.LScObjCli::ProdMethodCall+82 (2317022D6C8,0,23,38)
@[21] 0x7FFB56FC4C5B nnotes.LSsThread::AdtCallMethod+219 (7fff,2317022E558,B1A9BCD9A8,2311399B9C0)
@[22] 0x7FFB56FBF3D2 nnotes.LSsThread::NRun+9922 (23164BB1B08,B1A9BC000B,0,24702531)
@[23] 0x7FFB56FBFD51 nnotes.LSsThread::Run+449 (2311399B9C0,2316E307FA8,0,2)
@[24] 0x7FFB56F6AC88 nnotes.LSIThread::RunInternal+104 (12585AE001FDDC1,0,0,12585AE00213D3A)
@[25] 0x7FFB56F6AF42 nnotes.LSIThread::RunToCompletion+386 (2316E2F1E28,2316E2F1E28,B1A9BCDD10,12585AE00213D3A)
@[26] 0x7FFB56F65DEE nnotes.CLSIDocument::RunScript+878 (B1A9BCEC00,2316E2FF9E8,B1A9BCEC00,0)
@[27] 0x7FFB561F5958 nnotes.CRawActionLotusScript::Run+648 (2,B1A9BCE410,B1A9BCEC00,200017cf)
@[28] 0x7FFB561EE147 nnotes.CRawAction::Execute+391 (2316E300828,0,23100000000,0)
@[29] 0x7FFB561E9FDC nnotes.CAssistant::Run+4236 (12585AE00000000,B1A9BCEBC8,2316E2F1E28,23100000000)
@[30] 0x7FFB5D825334 namgrdll.RunTask+2900 (B1A9BCF808,7FFB000001D2,7FFB00000000,23100000000)
@[31] 0x7FFB5D8246D9 namgrdll.ProcessMessage+361 (0,1,140,23138BE6AC8)
@[32] 0x7FFB5D823D2B namgrdll.ExecutiveMain+315 (23164B8E120,1,3,1)
@[33] 0x7FFB5D826C3C namgrdll.AddInMain+412 (0,23164B8E108,0,0)
@[34] 0x7FF691AA1037 nAMgr.NotesMain+55 (0,0,7FF691AA0000,B1A9BCFC60)
@[35] 0x7FF691AA11D0 nAMgr.notes_main+336 (7FFB654859F8,0,0,3)
@[36] 0x7FF691AA1078 nAMgr.main+24 (0,0,0,0)
@[37] 0x7FF691AA14E0 nAMgr.__scrt_common_main_seh+268 (0,0,0,0)
[38] 0x7FFB68D884D4 KERNEL32.BaseThreadInitThunk+20 (0,0,0,0)
[39] 0x7FFB68ECE871 ntdll.RtlUserThreadStart+33 (0,0,0,0)
On the server, I have set DEBUG_LS_DUMP=1.
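For anyone who wants to capture the same information, here is a minimal sketch of how the setting can be applied, assuming it is added directly to the server's notes.ini (entering `set config DEBUG_LS_DUMP=1` on the server console should write the same entry). The comments describe what I observed in my NSD, not documented behaviour:

```ini
; notes.ini on the Domino server
; With this entry, the NSD from my crash contained a
; "LotusScript Interpreter -> Call Stack" section for the crashing thread.
DEBUG_LS_DUMP=1
```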
The LotusScript call stack in the NSD identifies the GetNextDatabase method of the NotesDbDirectory class as the failing call:
<@@ ------ LotusScript Interpreter -> Call Stack for [ nAMgr: 1bf4: 15e4] (Time 07:49:05) ------ @@>
Source database is: 'nuke-server.nsf'
[2] GETNEXTDATABASE
[1] RUN_WITH_NOTESEXT @ line number 49
[0] INITIALIZE @ line number 3
** Detach from process [ nAMgr: 1bf4]
The agent uses the NotesDbDirectory class to iterate over all .nsf files on the Domino server. A simplified version of the code:
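Option Declare

Dim g_session As NotesSession

Sub Initialize
    Set g_session = New NotesSession()
    Call run_with_notesext()
End Sub

Public Sub run_with_notesext()
    On Error GoTo handle_err
    Dim dbdir As New NotesDbDirectory(g_session.currentDatabase().Server)
    Dim db As NotesDatabase
    ' 1247 is the DATABASE file type constant
    Set db = dbdir.GetFirstDatabase(1247)
    While Not (db Is Nothing)
        ' do stuff with db ...
        Set db = dbdir.GetNextDatabase ' the call reported in the crash stack
    Wend
    Exit Sub
handle_err:
    ' placeholder handler; the real handler is omitted from this simplified listing
    Print "Error " & Err & " at line " & Erl & ": " & Error$
End Sub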
I have opened a case (#CS0141246) with HCL support.
I wonder whether this is related to the problems described here: https://xpagesandmore.blogspot.com/2020/07/domino-1101-agent-manager-problems.html
It is not the amgr that is crashing, but the executed code. The issue is logged under SPR# JCUSBRSHVA.