Attachments: Test1.txt, Error.txt, Database-errors_all.txt
Hi all,
Yesterday our database crashed. At the time of the crash we found the errors below in the database log file.
When we hit this issue previously, we suspected a network-level problem, which the knowledge base also suggested, so we simply restarted the database.
23:31:34 SQLSRV2 6: SYSTEM ERROR: Memory violation. (49)
23:31:34 SQLSRV2 8: SYSTEM ERROR: Memory violation. (49)
23:31:35 Usr 43: SYSTEM ERROR: rmlocate: Invalid RM block for area 6 , requested Dbkey 461129088 and located Dbkey 253. (10809)
23:31:35 Usr 43: ** Save file named core for analysis by Progress Software Corporation. (439)
23:31:35 Usr 43: Corrupt block detected when attempting to release a buffer. (4232)
23:31:35 Usr 43: bmReleaseBuffer: Error occurred in area 6, block number: 7205141, extent: /progdata/prd/crndb1/prdcrn.d1. (10560)
23:31:35 Usr 43: Writing block 7205141 to log file. Please save and send the log file to Progress Software Corp. for investigation. (10561)
23:31:35 Usr 43: SYSTEM DEBUG: Database buffer block
23:31:35 Usr 43: pbktbl = 0x30bf3f44
23:31:35 Usr 43: pbktbl->qself = 0x58bf3f44
23:31:35 Usr 43: XBKBUF(pbktbl->qself) = 0x30bf3f44
23:31:35 Usr 43: pbktbl->bt_qbuf = 0x786347bc
23:31:35 Usr 43: XBKBUF(pbktbl->bt_qbuf) = 0x706347bc
23:31:35 Usr 43: pbkbuf = 0x706347bc
23:31:35 Usr 43: Block dbkey = 0x1b7c4580 bt_offset = 0x0
But this is the second time the issue has happened.
I would like to know whether a block got corrupted while a batch job or report was running. I ask because the issue occurs only during non-business hours.
I would like to test the above errors.
How can we reproduce the same errors? I want to reproduce them in a test database. Any suggestions on how to reproduce or resolve this issue?
Also, during this incident I found the errors below in the log files of all our databases.
Any ideas?
Please find the error log details attached.
Please help!
23:49:07 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:49:12 SRV 13: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:49:35 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:49:42 SRV 13: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:49:49 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:49:57 SRV 24: Connection timed out on socket=92 for usernum 2, attempt disconnect. (1280)
23:50:05 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:50:20 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:50:35 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:50:36 SRV 28: Server's received count 1 does not equal client(1)'s send count 1414744096. (1055)
23:50:50 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:50:50 SRV 27: Server's received count 1 does not equal client(1)'s send count 1414744096. (1055)
23:51:06 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:51:20 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:51:36 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:51:41 SRV 24: Connection timed out on socket=92 for usernum 2, attempt disconnect. (1280)
23:51:50 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:52:07 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:52:18 SRV 24: Connection timed out on socket=92 for usernum 2, attempt disconnect. (1280)
23:52:21 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:52:37 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:52:51 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:52:53 SRV 24: Connection timed out on socket=92 for usernum 2, attempt disconnect. (1280)
23:53:07 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:53:21 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:53:37 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:53:38 SRV 24: Connection timed out on socket=92 for usernum 2, attempt disconnect. (1280)
23:53:49 SRV 16: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:53:51 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:54:07 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:54:09 SRV 24: Connection timed out on socket=92 for usernum 2, attempt disconnect. (1280)
23:54:21 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:54:38 SRV 28: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
23:54:51 SRV 24: Connection timed out on socket=92 for usernum 2, attempt disconnect. (1280)
23:54:52 SRV 27: Connection timed out on socket=91 for usernum 1, attempt disconnect. (1280)
Error in Database : prdreyb
===================================
23:37:59 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:37:59 SRV 24: Server's received count 1 does not equal client(1)'s send count 1414744096. (1055)
23:38:30 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:39:00 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:39:31 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:40:01 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:40:31 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:41:01 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:41:32 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:42:02 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:42:32 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:43:02 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:44:18 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:44:48 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:46:01 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:46:46 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:47:16 SRV 24: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:49:27 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:50:15 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:50:51 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:51:26 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:52:12 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:52:42 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:53:28 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:53:49 SRV 31: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:53:53 SRV 40: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:53:58 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:54:28 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
23:54:28 SRV 33: Server's received count 1 does not equal client(1)'s send count 1414744096. (1055)
23:54:58 SRV 33: Connection timed out on socket=31 for usernum 1, attempt disconnect. (1280)
Error in Database : prdslc
===================================
Error in Database : prdslcb
===================================
23:48:52 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:48:57 BROKER 1: Connection timed out on socket=60 for usernum 1, attempt disconnect. (1280)
23:49:18 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:49:27 BROKER 1: Connection timed out on socket=60 for usernum 1, attempt disconnect. (1280)
23:49:30 SRV 23: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:49:39 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:49:48 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:50:15 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:50:19 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:50:49 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:50:51 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:51:19 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:51:27 SRV 23: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:51:36 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:51:49 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:52:03 SRV 23: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:52:07 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:52:20 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:52:38 SRV 23: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:52:49 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:52:50 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:53:19 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:53:24 SRV 23: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:53:49 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:53:50 SRV 28: Server's received count 1 does not equal client(1)'s send count 1414744096. (1055)
23:53:54 SRV 23: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:54:05 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:54:08 SRV 11: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:54:20 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:54:35 SRV 27: Connection timed out on socket=62 for usernum 2, attempt disconnect. (1280)
23:54:36 SRV 23: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
23:54:50 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)
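To see which server processes and sockets account for the timeouts, the (1280) lines above can be tallied with a short script. This is only a sketch: the function name and the regular expression are my own and assume the log line layout shown in this thread.

```python
import re
from collections import Counter

# Assumed layout, taken from the excerpts in this thread:
# "HH:MM:SS <PROC> <n>: Connection timed out on socket=<s> for usernum <u>, ..."
TIMEOUT = re.compile(
    r"^\d\d:\d\d:\d\d (\w+ +\d+): Connection timed out on socket=(\d+) for usernum (\d+)"
)

def tally_timeouts(lines):
    """Count 'Connection timed out' (1280) lines per (process, socket, usernum)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT.match(line.strip())
        if match:
            proc, sock, user = match.groups()
            counts[(proc, int(sock), int(user))] += 1
    return counts

if __name__ == "__main__":
    sample = [
        "23:48:52 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)",
        "23:48:57 BROKER 1: Connection timed out on socket=60 for usernum 1, attempt disconnect. (1280)",
        "23:49:39 SRV 28: Connection timed out on socket=61 for usernum 1, attempt disconnect. (1280)",
    ]
    for key, n in tally_timeouts(sample).most_common():
        print(key, n)
```

Feeding it the whole .lg file (for example `tally_timeouts(open(path))`, with `path` being whatever log you want to check) shows whether the timeouts cluster on one socket and user number or are spread across many.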
Does running dbanalys for area 6 produce the error?
Progress version? V8 something?
> 23:31:35 Usr 43: Corrupt block detected when attempting to release a buffer. (4232)
Block on disk is NOT corrupted.
> 23:31:34 SQLSRV2 6: SYSTEM ERROR: Memory violation. (49)
/Before/ message 49, and hence before 4232, something went wrong in the memory of the SQL server process. A process can change any data in its address space, including the shared memory of the database it is connected to. The memory violation error means that the corruption went too far: the process tried to change memory outside its address space.
The main question is what the SQL server was doing before it raised the memory violation error.
So a remote connection from another source, such as SQL, is causing this issue?
How can we find, at the database level, which SQL server process is causing this error?
Also, our Progress version is 9.1E and the OS is AIX 5.3.
> 23:31:35 Usr 43: Writing block 7205141 to log file. Please save and send the log file to Progress Software Corp. for investigation. (10561)
The dump saved the "garbage" that replaced the correct data in shared memory. It's a footprint that can help to find the cause of the corruption. I bet the dump does not contain data from a real db block.
How can we find the data corruption?
No file was generated except the protrace file.
Can you attach the db log with all messages from the time of the db crash?
George
Please find it attached for your reference.
23:31:35 Usr 43: Block buffer dump follows, from address 0x706347bc:
23:31:35 Usr 43: 0000: 0000 00fd 00fa 000d 0333 3438 0331 3431
23:31:35 Usr 43: 0010: 0002 3031 ff00 0000 0000 0000 0000 0000
23:31:35 Usr 43: 0020: 0000 fd00 0000 0000 025c 2000 013f e700
23:31:35 Usr 43: 0030: 0e00 5700 9100 be00 dd00 ed01 1a01 2afa
23:31:35 Usr 43: 0040: 000c 0002 008c 0171 fdfd fdfd fdff 0a53
23:31:35 Usr 43: 0050: 4d30 3030 322d 5553 4102 5c20 025c 2006
23:31:35 Usr 43: 0060: 4953 532d 534f 0853 4f36 3337 3136 3502
23:31:35 Usr 43: 0070: 8020 0480 3555 0f02 8020 0200 2000 0245
23:31:35 Usr 43: 0080: 4102 5c20 0853 4f36 3337 3136 3508 534f
23:31:35 Usr 43: 0090: 3633 3731 3635 0008 3131 3034 3834 3238
23:31:35 Usr 43: 00a0: 0832 3330 3633 3338 3500 0004 8312 009f
23:31:35 Usr 43: 00b0: 0000 0482 1694 2f04 01b6 85f1 0482 3280
23:31:35 Usr 43: 00c0: 9f00 0008 3233 3639 3534 3835 0002 5c20
23:31:35 Usr 43: 00d0: 0200 2005 6261 7463 6807 3533 3438 3030
23:31:35 Usr 43: 00e0: 3102 5c20 0442 3636 3000 0000 0001 0200
23:31:35 Usr 43: 00f0: 0003 5553 4402 801f 0003 014a cd04 8443
23:31:35 Usr 43: 0100: 954f 0431 3030 3005 574f 4246 4700 0262
23:31:35 Usr 43: 0110: 5200 0000 0000 0000 0000 0000 0000 0000
23:31:35 Usr 43: 0120: 0000 fdfd fdfd fd00 0000 0000 0000 0000
23:31:35 Usr 43: 0130: 0a78 7866 786d 3035 612e 7000 0431 3030
23:31:35 Usr 43: 0140: 3000 0000 fd00 fa00 0d03 3334 3803 3134
It's a fragment of a data block. The block header is lost. The records contain character fields that mainly store numbers like "348" (<- 33 3438) or, for example, "SO637165" (<- 53 4f36 3337 3136 35). Can you identify a table by these patterns?
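One way to pull candidate field values out of a dump like this is to convert the hex rows back to bytes and list the runs of printable ASCII. A minimal sketch, assuming the row layout seen in the log (an offset label followed by four-digit hex groups); the function names are hypothetical and nothing here is a Progress utility.

```python
import re

# A row is an offset label ("0040:") followed by four-digit hex groups.
# The (?!:) guard keeps the next row's offset label from being read as data
# when several rows were pasted onto a single line.
ROW = re.compile(r"\b[0-9a-f]{4}:((?: [0-9a-f]{4}(?!:))+)")

def dump_to_bytes(dump: str) -> bytes:
    """Rebuild the raw bytes from the hex groups of every dump row found."""
    data = bytearray()
    for row in ROW.finditer(dump):
        data += bytes.fromhex(row.group(1).replace(" ", ""))
    return bytes(data)

def printable_runs(data: bytes, min_len: int = 3) -> list[str]:
    """Return runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

if __name__ == "__main__":
    sample = """
    23:31:35 Usr 43: 0040: 000c 0002 008c 0171 fdfd fdfd fdff 0a53
    23:31:35 Usr 43: 0050: 4d30 3030 322d 5553 4102 5c20 025c 2006
    23:31:35 Usr 43: 0060: 4953 532d 534f 0853 4f36 3337 3136 3502
    """
    print(printable_runs(dump_to_bytes(sample)))
```

The resulting strings (order numbers, codes, program names) are the patterns to look for in the application's tables. Short values are easy to miss, so lower `min_len` if the fields of interest are only two characters long.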
How do we identify those tables?
Kindly let me know what you need and I will share the details.
How can we reproduce these errors?
23:31:35 Usr 43: ** Save file named core for analysis by Progress Software Corporation. (439)
23:31:35 Usr 43: Corrupt block detected when attempting to release a buffer. (4232)
23:31:35 Usr 43: bmReleaseBuffer: Error occurred in area 6, block number: 7205141, extent: /progdata/prd/crndb1/prdcrn.d1. (10560)
23:31:35 Usr 43: Writing block 7205141 to log file. Please save and send the log file to Progress Software Corp. for investigation. (10561)
23:31:35 Usr 43: SYSTEM DEBUG: Database buffer block
What is the error saying?
> what does the error saying ?
At some point a process read block number 7205141 from disk (from the prdcrn.d1 file). The block passed the sanity checks: it was not corrupted. Later (at 23:31:35) user 43 decided to evict the block from the buffer pool and replace it with another block from disk. At that moment the sanity checks found corruption in the block's header. The corruption was left by some unknown "bad" process that updated shared memory using a wrong offset. We don't know when it happened; there were no errors. We know only the footprint left by the process. The block dump contains the IDs of the tables that the "bad" process was updating, but you posted only a part of the block dump (only 336 bytes of 8192). The real contents of the block reported in the errors do not matter: the block was just an "incidental victim" of the "bad" process.
It's really hard to investigate the root cause of errors 49 and 4232. Contact Progress technical support.
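The mismatch reported in message 10809 can be checked against the dump with simple arithmetic. The sketch below assumes that the dbkey is stored big-endian in the first four bytes of the block; that is consistent with the values posted in this thread but is my assumption, not something taken from Progress documentation. The constant and function names are made up for the example.

```python
REQUESTED_DBKEY = 461129088      # "requested Dbkey" from message 10809
BLOCK_DBKEY_HEX = "0x1b7c4580"   # "Block dbkey" from the SYSTEM DEBUG lines
FIRST_DUMP_ROW = "0000 00fd 00fa 000d 0333 3438 0331 3431"  # dump row 0000

def leading_dbkey(row: str) -> int:
    """Read the first four bytes of a dump row as a big-endian integer."""
    raw = bytes.fromhex(row.replace(" ", ""))
    return int.from_bytes(raw[:4], "big")

if __name__ == "__main__":
    # Does the buffer control structure still hold the requested dbkey?
    print(int(BLOCK_DBKEY_HEX, 16) == REQUESTED_DBKEY)
    # What value sits where the header's dbkey is assumed to be?
    print(leading_dbkey(FIRST_DUMP_ROW))
```

With the values from the log, the "Block dbkey" debug value equals the requested dbkey, while the first four bytes of the dumped buffer give 253, the "located Dbkey" in message 10809. In other words, the buffer was still tracked under the right dbkey, but its contents had been overwritten.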
I created a new database on the test server as a copy of the production database.
While restarting the test database I found errors in the database log file.
Thu Jul 20 15:27:44 2017
15:27:44 prostrct create session begin for pgresdba on /dev/pts/2. (451)
15:27:45 prostrct create session end. (334)
Thu Jul 20 15:28:23 2017
15:28:23 procopy session begin for pgresdba on /dev/pts/2. (451)
15:28:24 procopy session end. (334)
Thu Jul 20 15:28:24 2017
15:28:24 procopy get default SQL DBA session begin for pgresdba on /dev/pts/2. (451)
15:28:24 procopy get default SQL DBA session end. (334)
Thu Jul 20 15:30:35 2017
15:30:35 prorest session begin for pgresdba on /dev/pts/2. (451)
15:30:41 Full restore started. (1368)
15:57:09 Full restore completed. (1369)
15:57:10 prorest session end. (334)
Thu Jul 20 16:08:18 2017
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.db is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d1 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d2 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d3 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d4 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d5 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d6 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d7 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d8 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d9 is on a remote device. (9466)
16:08:18 BROKER 0: File /progdata/bkup/test/prdcrn_tst.d10 is on a remote device. (9466)
Hi George,
I tried to reproduce the issue and got the same errors.
Please see the attached Test1 document and let me know whether our database is corrupted.
14:37:17 BROKER 0: This database has not been fully restored and may be damaged. (1621)
It's not the same error. ;-)
I ran dbanalys on a test database created from a backup that was taken after the issue.
Can we identify anything from the messages below?
AREA "Schema Area" : 6 BLOCK ANALYSIS
-------------------------------------------------
[Warning] RM block found that should be in the RM free chain. (2802)
dbkey=256534912, free space=8170, # of free directories=64, hold=0
(3865)
11599786 block(s) found in the area.
I ran dbanalys using a backup that was taken after the issue.
Can you tell me what exactly the issue is?
You asked if you could identify anything with the data you had -
000015059 - How to determine what table a given RECID resides in when using Progress Version 9 or later
Carey, we can't find a table based on a /wrong/ dbkey. The record's offset inside the block can be wrong as well (as in this case). Even the area number can be wrong. And finally, even if you use the correct dbkey, the FIND statement will crash if the header of the specified block contains a wrong dbkey.
> i taken dbanalys using backup which is taken after the issue.
You need to rebuild the backup because the message in your attached Test1.txt file says that "this database has not been fully restored". Can you describe all your steps during backup and restore?