WiredTiger error - read checksum error / illegal file format or internal value

Hi, CryoSPARC isn’t starting properly. Logs for processes other than the database give port connection errors but I think those arise from the database issue (I may be wrong).

Here’s the initial section where I see an error:

2022-03-12T20:50:28.111-0600 I NETWORK  [thread1] connection accepted from 10.200.86.173:44864 #7 (3 connections now open)
2022-03-12T20:50:28.111-0600 I NETWORK  [conn7] received client metadata from 10.200.86.173:44864 conn7: { driver: { name: "PyMongo", version: "3.11.0" }, os: { type: "Linux", name: "Linux", architecture: "x86_64", version: "4.18.0-348.12.2.el8_5.x86_64" }, platform: "CPython 3.7.8.final.0" }
2022-03-12T20:50:31.140-0600 I NETWORK  [thread1] connection accepted from 10.200.86.173:45132 #8 (4 connections now open)
2022-03-12T20:50:31.140-0600 I NETWORK  [conn8] received client metadata from 10.200.86.173:45132 conn8: { driver: { name: "PyMongo", version: "3.11.0" }, os: { type: "Linux", name: "Linux", architecture: "x86_64", version: "4.18.0-348.12.2.el8_5.x86_64" }, platform: "CPython 3.7.8.final.0" }
2022-03-12T20:50:31.558-0600 E STORAGE  [conn6] WiredTiger error (0) [1647139831:558835][1562506:0x7fe4795fe700], file:collection-34--4350081423769996363.wt, WT_CURSOR.search: read checksum error for 8192B block at offset 1676926976: block header checksum of 2985351156 doesn't match expected checksum of 3798962218
2022-03-12T20:50:31.558-0600 E STORAGE  [conn6] WiredTiger error (0) [1647139831:558874][1562506:0x7fe4795fe700], file:collection-34--4350081423769996363.wt, WT_CURSOR.search: collection-34--4350081423769996363.wt: encountered an illegal file format or internal value
2022-03-12T20:50:31.558-0600 E STORAGE  [conn6] WiredTiger error (-31804) [1647139831:558880][1562506:0x7fe4795fe700], file:collection-34--4350081423769996363.wt, WT_CURSOR.search: the process must exit and restart: WT_PANIC: WiredTiger library panic
2022-03-12T20:50:31.558-0600 I -        [conn6] Fatal Assertion 28558 at src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp 361
2022-03-12T20:50:31.558-0600 I -        [conn6] 

***aborting after fassert() failure


2022-03-12T20:50:31.577-0600 F -        [conn6] Got signal: 6 (Aborted).

Then immediately after that, I get the following backtrace:

2022-03-12T20:50:31.577-0600 F -        [conn6] Got signal: 6 (Aborted).

 0x561f5dfd9ac1 0x561f5dfd8cd9 0x561f5dfd91bd 0x7fe498795c20 0x7fe4983f537f 0x7fe4983dfdb5 0x561f5d2ade97 0x561f5dd10b66 0x561f5d2b7b46 0x561f5d2b7d62 0x561f5d2b7fc4 0x561f5e8e10c5 0x561f5e8facfe 0x561f5e902023 0x561f5e922f40 0x561f5e8ef039 0x561f5e93fc08 0x561f5dd0405e 0x561f5d66efdb 0x561f5d615af6 0x561f5d637633 0x561f5d62b8b6 0x561f5d637633 0x561f5d6092a8 0x561f5d93b622 0x561f5d93d9c8 0x561f5d93e67c 0x561f5d8f7a82 0x561f5d8f85eb 0x561f5d5243b8 0x561f5d4fb05f 0x561f5d4fc741 0x561f5db11720 0x561f5d7158b2 0x561f5d7178b6 0x561f5d318c7d 0x561f5d3195ad 0x561f5df59b31 0x7fe49878b17a 0x7fe4984badc3
----- BEGIN BACKTRACE -----
{"backtrace":[{"b":"561F5CAA6000","o":"1533AC1","s":"_ZN5mongo15printStackTraceERSo"},{"b":"561F5CAA6000","o":"1532CD9"},{"b":"561F5CAA6000","o":"15331BD"},{"b":"7FE498783000","o":"12C2
0"},{"b":"7FE4983BE000","o":"3737F","s":"gsignal"},{"b":"7FE4983BE000","o":"21DB5","s":"abort"},{"b":"561F5CAA6000","o":"807E97","s":"_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj"}
,{"b":"561F5CAA6000","o":"126AB66"},{"b":"561F5CAA6000","o":"811B46","s":"__wt_eventv"},{"b":"561F5CAA6000","o":"811D62","s":"__wt_err"},{"b":"561F5CAA6000","o":"811FC4","s":"__wt_panic"},{"b":"561F5CAA6000","o":"1E3B0C5","s":"__wt_bm_read"},{"b":"561F5CAA6000","o":"1E54CFE","s":"__wt_bt_read"},{"b":"561F5CAA6000","o":"1E5C023","s":"__wt_page_in_func"},{"b":"561F5CAA6000","o":"1E7CF40","s":"__wt_row_search"},{"b":"561F5CAA6000","o":"1E49039","s":"__wt_btcur_search"},{"b":"561F5CAA6000","o":"1E99C08"},{"b":"561F5CAA6000","o":"125E05E","s":"_ZN5mongo21WiredTigerRecordStore6Cursor9seekExactERKNS_8RecordIdE"},{"b":"561F5CAA6000","o":"BC8FDB","s":"_ZN5mongo16WorkingSetCommon5fetchEPNS_16OperationContextEPNS_10WorkingSetEmNS_11unowned_ptrINS_20SeekableRecordCursorEEE"},{"b":"561F5CAA6000","o":"B6FAF6","s":"_ZN5mongo10FetchStage6doWorkEPm"},{"b":"561F5CAA6000","o":"B91633","s":"_ZN5mongo9PlanStage4workEPm"},{"b":"561F5CAA6000","o":"B858B6","s":"_ZN5mongo10LimitStage6doWorkEPm"},{"b":"561F5CAA6000","o":"B91633","s":"_ZN5mongo9PlanStage4workEPm"},{"b":"561F5CAA6000","o":"B632A8","s":"_ZN5mongo15CachedPlanStage12pickBestPlanEPNS_15PlanYieldPolicyE"},{"b":"561F5CAA6000","o":"E95622","s":"_ZN5mongo12PlanExecutor12pickBestPlanENS0_11YieldPolicyEPKNS_10CollectionE"},{"b":"561F5CAA6000","o":"E979C8","s":"_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS0_11YieldPolicyE"},{"b":"561F5CAA6000","o":"E9867C","s":"_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionENS0_11YieldPolicyE"},{"b":"561F5CAA6000","o":"E51A82","s":"_ZN5mongo11getExecutorEPNS_16OperationContextEPNS_10CollectionESt10unique_ptrINS_14
CanonicalQueryESt14default_deleteIS5_EENS_12PlanExecutor11YieldPolicyEm"},{"b":"561F5CAA6000","o":"E525EB","s":"_ZN5mongo15getExecutorFindEPNS_16OperationContextEPNS_10CollectionERKNS_15NamespaceStringESt10unique_ptrINS_14CanonicalQueryESt14default_deleteIS8_EENS_12PlanExecutor11YieldPolicyE"},{"b":"561F5CAA6000","o":"A7E3B8","s":"_ZN5mongo7FindCmd3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE"},{"b":"561F5CAA6000","o":"A5505F","s":"_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE"},{"b":"561F5CAA6000","o":"A56741","s":"_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE"},{"b":"561F5CAA6000","o":"106B720","s":"_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE"},{"b":"561F5CAA6000","o":"C6F8B2"},{"b":"561F5CAA6000","o":"C718B6","s":"_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE"},{"b":"561F5CAA6000","o":"872C7D","s":"_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE"},{"b":"561F5CAA6000","o":"8735AD"},{"b":"561F5CAA6000","o":"14B3B31"},{"b":"7FE498783000","o":"817A"},{"b":"7FE4983BE000","o":"FCDC3","s":"clone"}],"processInfo":{ "mongodbVersion" : "3.4.10", "gitVersion" : "078f28920cb24de0dd479b5ea6c66c644f6326e9", "compiledModules" : [], "uname" : { "sysname" : "Linux", "release" : "4.18.0-348.12.2.el8_5.x86_64", "version" : "#1 SMP Mon Jan 17 07:06:06 EST 2022", "machine" : "x86_64" }, "somap" : [ { "b" : "561F5CAA6000", "elfType" : 3, "buildId" : "D9AB5C91FBC6F740604F4BC28348FE33EC87DEC2" }, { "b" : "7FFD038E8000", "path" : "linux-vdso.so.1", "elfType" : 3, "buildId" : "2299B1AB099FB947B2D00C527E2EA346E8C644D8" }, { "b" : "7FE499349000", "path" : 
"/home/clee2/software/cryosparc2_hpc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/libpython3.7m.so", "elfType" : 3 }, { "b" : "7FE49985B000", "path" : "/home/clee2/software/cryosparc2_hpc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/libtiff.so", "elfType" : 3 }, { "b" : "7FE499141000", "path" : "/lib64/librt.so.1", "elfType" : 3, "buildId" : "4EF322FE9CC7BBE1C91485914246A64EC83A5220" }, { "b" : "7FE498F3D000", "path" : "/lib64/libdl.so.2", "elfType" : 3, "buildId" : "CA2A5477D8E8D44D258F696ED662B4C5314E41FD" }, { "b" : "7FE498BBB000", "path" : "/lib64/libm.so.6", "elfType" : 3, "buildId" : "377F483CE8B794170D3BB5A1D75438E766AF267E" }, { "b" : "7FE4989A3000", "path" : "/lib64/libgcc_s.so.1", "elfType" : 3, "buildId" : "F4AB03961FED72ED3181A4BEC36CC3D22C3CF16A" }, { "b" : "7FE498783000", "path" : "/lib64/libpthread.so.0", "elfType" : 3, "buildId" : "7A60D9380CC2284FE162F9900B28F4DBDB7029FD" }, { "b" : "7FE4983BE000", "path" : "/lib64/libc.so.6", "elfType" : 3, "buildId" : "ACEB5A6A8E8000A295B4024B2754D117433411EF" }, { "b" : "7FE4996B3000", "path" : "/lib64/ld-linux-x86-64.so.2", "elfType" : 3, "buildId" : "523B1C63A3F4B12DA75660BF483D63560694D81F" }, { "b" : "7FE4981BA000", "path" : "/lib64/libutil.so.1", "elfType" : 3, "buildId" : "C5168D39B1A665267AE61EFBB5E7272E6BF9C73B" }, { "b" : "7FE4997C8000", "path" : "/home/clee2/software/cryosparc2_hpc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/./libwebp.so.7", "elfType" : 3 }, { "b" : "7FE4996FD000", "path" : "/home/clee2/software/cryosparc2_hpc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/./libzstd.so.1", "elfType" : 3 }, { "b" : "7FE498191000", "path" : "/home/clee2/software/cryosparc2_hpc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/./liblzma.so.5", "elfType" : 3 }, { "b" : "7FE498153000", "path" : "/home/clee2/software/cryosparc2_hpc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/./libjpeg.so.9", "elfType" : 3 }, { "b" : 
"7FE4996E1000", "path" : "/home/clee2/software/cryosparc2_hpc/cryosparc_master/deps/anaconda/envs/cryosparc_master_env/lib/./libz.so.1", "elfType" : 3 } ] }}
 mongod(_ZN5mongo15printStackTraceERSo+0x41) [0x561f5dfd9ac1]
 mongod(+0x1532CD9) [0x561f5dfd8cd9]
 mongod(+0x15331BD) [0x561f5dfd91bd]
 libpthread.so.0(+0x12C20) [0x7fe498795c20]
 libc.so.6(gsignal+0x10F) [0x7fe4983f537f]
 libc.so.6(abort+0x127) [0x7fe4983dfdb5]
 mongod(_ZN5mongo32fassertFailedNoTraceWithLocationEiPKcj+0x0) [0x561f5d2ade97]
 mongod(+0x126AB66) [0x561f5dd10b66]
 mongod(__wt_eventv+0x3D7) [0x561f5d2b7b46]
 mongod(__wt_err+0x9D) [0x561f5d2b7d62]
 mongod(__wt_panic+0x2E) [0x561f5d2b7fc4]
 mongod(__wt_bm_read+0x135) [0x561f5e8e10c5]
 mongod(__wt_bt_read+0xAE) [0x561f5e8facfe]
 mongod(__wt_page_in_func+0x1303) [0x561f5e902023]
 mongod(__wt_row_search+0x660) [0x561f5e922f40]
 mongod(__wt_btcur_search+0x7C9) [0x561f5e8ef039]
 mongod(+0x1E99C08) [0x561f5e93fc08]
 mongod(_ZN5mongo21WiredTigerRecordStore6Cursor9seekExactERKNS_8RecordIdE+0x4E) [0x561f5dd0405e]
 mongod(_ZN5mongo16WorkingSetCommon5fetchEPNS_16OperationContextEPNS_10WorkingSetEmNS_11unowned_ptrINS_20SeekableRecordCursorEEE+0xAB) [0x561f5d66efdb]
 mongod(_ZN5mongo10FetchStage6doWorkEPm+0x106) [0x561f5d615af6]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x561f5d637633]
 mongod(_ZN5mongo10LimitStage6doWorkEPm+0x76) [0x561f5d62b8b6]
 mongod(_ZN5mongo9PlanStage4workEPm+0x63) [0x561f5d637633]
 mongod(_ZN5mongo15CachedPlanStage12pickBestPlanEPNS_15PlanYieldPolicyE+0x198) [0x561f5d6092a8]
 mongod(_ZN5mongo12PlanExecutor12pickBestPlanENS0_11YieldPolicyEPKNS_10CollectionE+0xF2) [0x561f5d93b622]
 mongod(_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEENS0_11YieldPolicyE+0x2D8) [0x561f5d93d9c8]
 mongod(_ZN5mongo12PlanExecutor4makeEPNS_16OperationContextESt10unique_ptrINS_10WorkingSetESt14default_deleteIS4_EES3_INS_9PlanStageES5_IS8_EES3_INS_13QuerySolutionES5_ISB_EES3_INS_14CanonicalQueryES5_ISE_EEPKNS_10CollectionENS0_11YieldPolicyE+0xEC) [0x561f5d93e67c]
 mongod(_ZN5mongo11getExecutorEPNS_16OperationContextEPNS_10CollectionESt10unique_ptrINS_14CanonicalQueryESt14default_deleteIS5_EENS_12PlanExecutor11YieldPolicyEm+0x132) [0x561f5d8f7a82]
 mongod(_ZN5mongo15getExecutorFindEPNS_16OperationContextEPNS_10CollectionERKNS_15NamespaceStringESt10unique_ptrINS_14CanonicalQueryESt14default_deleteIS8_EENS_12PlanExecutor11YieldPolicyE+0x8B) [0x561f5d8f85eb]
 mongod(_ZN5mongo7FindCmd3runEPNS_16OperationContextERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEERNS_7BSONObjEiRS8_RNS_14BSONObjBuilderE+0xC98) [0x561f5d5243b8]
 mongod(_ZN5mongo7Command3runEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS3_21ReplyBuilderInterfaceE+0x4FF) [0x561f5d4fb05f]
 mongod(_ZN5mongo7Command11execCommandEPNS_16OperationContextEPS0_RKNS_3rpc16RequestInterfaceEPNS4_21ReplyBuilderInterfaceE+0xF81) [0x561f5d4fc741]
 mongod(_ZN5mongo11runCommandsEPNS_16OperationContextERKNS_3rpc16RequestInterfaceEPNS2_21ReplyBuilderInterfaceE+0x240) [0x561f5db11720]
 mongod(+0xC6F8B2) [0x561f5d7158b2]
 mongod(_ZN5mongo16assembleResponseEPNS_16OperationContextERNS_7MessageERNS_10DbResponseERKNS_11HostAndPortE+0x746) [0x561f5d7178b6]
 mongod(_ZN5mongo23ServiceEntryPointMongod12_sessionLoopERKSt10shared_ptrINS_9transport7SessionEE+0x1FD) [0x561f5d318c7d]
 mongod(+0x8735AD) [0x561f5d3195ad]
 mongod(+0x14B3B31) [0x561f5df59b31]
 libpthread.so.0(+0x817A) [0x7fe49878b17a]
 libc.so.6(clone+0x43) [0x7fe4984badc3]
-----  END BACKTRACE  -----
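
The storage errors above each name the data file that WiredTiger was reading when the checksum failed. As a minimal sketch, assuming you have saved the database log to a file of your choosing (the function name `wt_error_files` is made up for illustration), the affected files can be listed like this:

```shell
#!/usr/bin/env bash
# Sketch: list the WiredTiger data files named in storage errors.
# ASSUMPTION: "$1" is a saved copy of the mongod log; the path is up to you.

wt_error_files() {
    # Keep only WiredTiger error lines, pull out the "file:<name>.wt" token,
    # strip the "file:" prefix, and print each distinct file name once.
    grep 'WiredTiger error' "$1" \
        | grep -o 'file:[^,]*\.wt' \
        | sed 's/^file://' \
        | sort -u
}
```

A single file name in the output, as in this log, suggests that the damage may be confined to one collection, although that alone does not rule out problems elsewhere.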

When I execute cryosparcm restart, it hangs after the following:

[clee2@svlpcryosparc01 cryosparc_master]$ cryosparcm restart
CryoSPARC is running.
Stopping cryoSPARC
command_core: stopped
command_rtp: stopped
command_vis: stopped
Shut down
Starting cryoSPARC System master process..
CryoSPARC is not already running.
database: started
Database configuration is OK.
command_core: started
command_core connection succeeded
command_core startup successful
command_vis: started
command_rtp: started
command_rtp connection succeeded

All files for the master instance are accessed via NFS.

Here’s what I get when executing “cryosparcm mongo”:

[clee2@svlpcryosparc01 cryosparc_master]$ cryosparcm mongo
MongoDB shell version v3.4.10
connecting to: mongodb://localhost:50001/meteor
2022-03-12T21:45:54.415-0600 W NETWORK  [thread1] Failed to connect to 127.0.0.1:50001, in(checking socket for error after poll), reason: Connection refused
2022-03-12T21:45:54.423-0600 E QUERY    [thread1] Error: couldn't connect to server localhost:50001, connection attempt failed :
connect@src/mongo/shell/mongo.js:237:13
@(connect):1:6
exception: connect failed

But I checked and there was nothing running on the relevant set of ports previously (50000+).

@shockacone Your database may be corrupted. Is the hardware functioning reliably? Is there enough free capacity on the volume that holds CRYOSPARC_DB_PATH?
If you have recently backed up your database (recommended), you may restore it from that backup.
If there is no backup, you may attempt a repair.
Please be aware that repair is not a full substitute for a backup/restore operation; documentation for a more recent mongo version states that repair discards corrupt data.
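
The free-capacity question can be checked before a restore or repair is attempted. The following is a rough sketch, not an official CryoSPARC procedure: the helper names `free_kib`, `used_kib` and `has_room_for_copy` are hypothetical, and it assumes you know the directory that CRYOSPARC_DB_PATH points to.

```shell
#!/usr/bin/env bash
# Sketch: pre-repair checks for the database directory.
# ASSUMPTION: the directory arguments are supplied by you, e.g. the value of
# CRYOSPARC_DB_PATH and a separate location for a file-level copy.

# Print the free space (KiB) on the volume that holds the given path.
free_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Print the space (KiB) currently used by the given directory tree.
used_kib() {
    du -sk "$1" | awk '{ print $1 }'
}

# Succeed only if the destination volume ($2) has at least as much free
# space as the source directory ($1) occupies.
has_room_for_copy() {
    [ "$(free_kib "$2")" -ge "$(used_kib "$1")" ]
}

# Example (commented out; run only while CryoSPARC is stopped, and only if
# the mongod binary bundled with CryoSPARC is on your PATH):
#   cryosparcm stop
#   has_room_for_copy "$CRYOSPARC_DB_PATH" /path/to/copies \
#       && cp -a "$CRYOSPARC_DB_PATH" /path/to/copies/
#   mongod --dbpath "$CRYOSPARC_DB_PATH" --repair
```

Taking a file-level copy first means the original files remain available if the repair discards more data than expected.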

Hi @wtempel,

I think what you’ve said here is correct. Luckily we have daily snapshots of data so I was able to use an earlier version of the same database. Would you send me the instructions for a repair so I can learn from it in case it’s needed?

–Shaker

@shockacone I am glad to hear that you have snapshots available. Please have a look at the instructions in that post from some time ago. I also recommend confirming the installed mongo version:

  1. cryosparcm mongo
  2. db.version()

and refer to version-specific information, like this link for version 3.4.
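
If the database is not running and `cryosparcm mongo` cannot connect, the version can also be read from the log, because the backtrace's processInfo section records a "mongodbVersion" field. A small sketch, assuming a saved copy of the database log (the function names are hypothetical):

```shell
#!/usr/bin/env bash
# Sketch: recover the MongoDB version from a mongod log.
# ASSUMPTION: "$1" is a saved copy of the database log; the path is up to you.

# Print each distinct version found in "mongodbVersion" fields of the log.
mongo_version_from_log() {
    grep -o '"mongodbVersion" : "[^"]*"' "$1" \
        | sed 's/.*: "\(.*\)"/\1/' \
        | sort -u
}

# Reduce a full version such as 3.4.10 to its major.minor series (3.4),
# which is the level at which the MongoDB manual is versioned.
version_series() {
    printf '%s\n' "$1" | cut -d. -f1,2
}
```

For the log in this thread, the recorded version is 3.4.10, so the 3.4 documentation is the relevant one.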
