-----============= acceptance-small: replay-dual ============----- Thu Apr 18 20:16:24 EDT 2024
excepting tests: 14b 21b
skipping tests SLOW=no: 21b
=== replay-dual: start setup 20:16:28 (1713485788) ===
Starting client oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
Started clients oleg440-client.virtnet:
192.168.204.140@tcp:/lustre on /mnt/lustre2 type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project)
oleg440-client.virtnet: executing check_config_client /mnt/lustre
oleg440-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg440-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff8800a9fc3800.idle_timeout=debug
osc.lustre-OST0000-osc-ffff8800b6030000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8800a9fc3800.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8800b6030000.idle_timeout=debug
disable quota as required
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
osd-ldiskfs.track_declares_assert=1
=== replay-dual: finish setup 20:16:35 (1713485795) ===
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 0a: expired recovery with lost client ========================================================== 20:16:36 (1713485796)
Check file is LU482_FAILED=/tmp/replay-dual.lu482.gRJOXL
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
total: 50 open/close in 0.41 seconds: 121.55 ops/second
fail_loc=0x80000514
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:16:40 (1713485800) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:16:54 (1713485814) targets are mounted
20:16:54 (1713485814) facet_failover done
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
- unlinked 0 (time 1713485895 ; total 0 ; last 0)
total: 50 unlinks in 1 seconds: 50.000000 unlinks/second
PASS 0a (101s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 0b: lost client during waiting for next transno ========================================================== 20:18:18 (1713485898)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:18:22 (1713485902) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:18:36 (1713485916) targets are mounted
20:18:36 (1713485916) facet_failover done
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
PASS 0b (119s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 1: |X| simple create ================= 20:20:18 (1713486018)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
mcreate: cannot create `/mnt/lustre/fsa-oleg440-client.virtnet' with mode 0100644: File exists
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:20:22 (1713486022) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:20:35 (1713486035) targets are mounted
20:20:35 (1713486035) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 1 (24s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 2: |X| mkdir adir ==================== 20:20:44 (1713486044)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:20:47 (1713486047) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:21:01 (1713486061) targets are mounted
20:21:01 (1713486061) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 2 (25s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 3: |X| mkdir adir, mkdir adir/bdir === 20:21:12 (1713486072)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:21:18 (1713486078) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:21:33 (1713486093) targets are mounted
20:21:33 (1713486093) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 3 (29s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 4: |X| mkdir adir (-EEXIST), mkdir adir/bdir ========================================================== 20:21:43 (1713486103)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1916     1285772   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
mkdir: cannot create directory '/mnt/lustre/adir': File exists
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:21:52 (1713486112) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:22:08 (1713486128) targets are mounted
20:22:08 (1713486128) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 4 (32s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 5: open, unlink |X| close ============ 20:22:17 (1713486137)
multiop /mnt/lustre2/a vo_tSc
TMPPIPE=/tmp/multiop_open_wait_pipe.7508
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:22:21 (1713486141) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:22:36 (1713486156) targets are mounted
20:22:36 (1713486156) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 5 (26s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 6: open1, open2, unlink |X| close1 [fail mds1] close2 ========================================================== 20:22:45 (1713486165)
multiop /mnt/lustre2/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7508
multiop /mnt/lustre/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7508
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:22:49 (1713486169) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:23:02 (1713486182) targets are mounted
20:23:02 (1713486182) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 6 (25s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 8: replay of resent request ========== 20:23:11 (1713486191)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x119
fail_loc=0
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:23:31 (1713486211) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:23:45 (1713486225) targets are mounted
20:23:45 (1713486225) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 8 (41s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 9: resending a replayed create ======= 20:23:54 (1713486234)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:23:58 (1713486238) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:24:11 (1713486251) targets are mounted
20:24:11 (1713486251) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 9 (38s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 10: resending a replayed unlink ====== 20:24:33 (1713486273)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:24:37 (1713486277) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:24:51 (1713486291) targets are mounted
20:24:51 (1713486291) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 10 (37s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 11: both clients timeout during replay ========================================================== 20:25:11 (1713486311)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x0119
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:25:15 (1713486315) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:25:28 (1713486328) targets are mounted
20:25:28 (1713486328) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 17 sec
fail_loc=0
PASS 11 (37s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 12: open resend timeout ============== 20:25:50 (1713486350)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
multiop /mnt/lustre/f12.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7508
fail_loc=0x80000302
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:25:54 (1713486354) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:26:08 (1713486368) targets are mounted
20:26:08 (1713486368) facet_failover done
fail_loc=0
/mnt/lustre/f12.replay-dual
/mnt/lustre/f12.replay-dual has type file OK
PASS 12 (23s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 13: close resend timeout ============= 20:26:14 (1713486374)
multiop /mnt/lustre/f13.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7508
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x80000115
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:26:18 (1713486378) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:26:32 (1713486392) targets are mounted
20:26:32 (1713486392) facet_failover done
fail_loc=0
/mnt/lustre/f13.replay-dual
/mnt/lustre/f13.replay-dual has type file OK
PASS 13 (22s)
debug_raw_pointers=0
debug_raw_pointers=0
SKIP: replay-dual test_14b skipping ALWAYS excluded test 14b
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 15a: timeout waiting for lost client during replay, 1 client completes ========================================================== 20:26:38 (1713486398)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
total: 25 open/close in 0.22 seconds: 114.54 ops/second
total: 1 open/close in 0.01 seconds: 107.18 ops/second
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:26:42 (1713486402) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:26:56 (1713486416) targets are mounted
20:26:56 (1713486416) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713486487 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
PASS 15a (90s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 15c: remove multiple OST orphans ===== 20:28:10 (1713486490)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:28:40 (1713486520) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:28:54 (1713486534) targets are mounted
20:28:54 (1713486534) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
PASS 15c (120s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 16: fail MDS during recovery (3571) == 20:30:12 (1713486612)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
total: 25 open/close in 0.20 seconds: 124.00 ops/second
total: 1 open/close in 0.01 seconds: 123.72 ops/second
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:30:16 (1713486616) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:30:30 (1713486630) targets are mounted
20:30:30 (1713486630) facet_failover done
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:30:51 (1713486651) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:31:03 (1713486663) targets are mounted
20:31:03 (1713486663) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713486737 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
PASS 16 (127s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 17: fail OST during recovery (3571) == 20:32:20 (1713486740)
total: 25 open/close in 0.18 seconds: 140.10 ops/second
total: 1 open/close in 0.01 seconds: 94.88 ops/second
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1984     1285704   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing ost1 on oleg440-server
Stopping /mnt/lustre-ost1 (opts:) on oleg440-server
20:32:24 (1713486744) shut down
Failover ost1 to oleg440-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-OST0000
20:32:38 (1713486758) targets are mounted
20:32:38 (1713486758) facet_failover done
Failing ost1 on oleg440-server
Stopping /mnt/lustre-ost1 (opts:) on oleg440-server
20:32:59 (1713486779) shut down
Failover ost1 to oleg440-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-OST0000
20:33:12 (1713486792) targets are mounted
20:33:12 (1713486792) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713486869 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
PASS 17 (130s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 18: ldlm_handle_enqueue succeeds on evicted export (3822) ========================================================== 20:34:32 (1713486872)
debug=+dlmtrace
fail_loc=0x8000030b
using seed 2245708777
running for 500 iterations
total: 500 stats in 0 seconds: inf stats/second
ldlm.namespaces.MGC192.168.204.140@tcp.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800a99f2800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800aa858800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800a99f2800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800aa858800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff8800a99f2800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff8800aa858800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff8800a99f2800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff8800aa858800.early_lock_cancel=0
fail_loc=0x80000305
Error in opening file "/mnt/lustre2/d18.replay-dual/f18.replay-dual"(flags=O_RDONLY) 2: No such file or directory
ldlm.namespaces.MGC192.168.204.140@tcp.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800a99f2800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800aa858800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800a99f2800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800aa858800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff8800a99f2800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff8800aa858800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff8800a99f2800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff8800aa858800.early_lock_cancel=1
fail_loc=0
fail_loc=0
PASS 18 (45s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 19: resend of open request =========== 20:35:19 (1713486919)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1988     1285700   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
fail_loc=0x157
- open/close 0 (time 1713487007.90 total 86.02 last 0.00)
total: 1 open/close in 86.02 seconds: 0.01 ops/second
fail_loc=0
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:36:49 (1713487009) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:37:03 (1713487023) targets are mounted
20:37:03 (1713487023) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 19 (111s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 20: recovery time is not increasing == 20:37:11 (1713487031)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1988     1285700   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:37:14 (1713487034) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:37:28 (1713487048) targets are mounted
20:37:28 (1713487048) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1988     1285700   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:39:53 (1713487193) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:40:07 (1713487207) targets are mounted
20:40:07 (1713487207) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
PASS 20 (321s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 21a: commit on sharing =============== 20:42:34 (1713487354)
mdt.lustre-MDT0000.commit_on_sharing=1
mdt.lustre-MDT0001.commit_on_sharing=1
Replay barrier on lustre-MDT0000
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:42:38 (1713487358) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:42:52 (1713487372) targets are mounted
20:42:52 (1713487372) facet_failover done
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
mdt.lustre-MDT0000.commit_on_sharing=0
mdt.lustre-MDT0001.commit_on_sharing=0
PASS 21a (158s)
debug_raw_pointers=0
debug_raw_pointers=0
SKIP: replay-dual test_21b skipping SLOW test 21b
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply & fail, c2 mkdir dir1/dir ========================================================== 20:45:15 (1713487515)
fail_loc=0x119
Failing mds2 on oleg440-server
Stopping /mnt/lustre-mds2 (opts:) on oleg440-server
20:45:17 (1713487517) shut down
Failover mds2 to oleg440-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0001
20:45:30 (1713487530) targets are mounted
20:45:30 (1713487530) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2000     1285688   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1824     1285864   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
total: 2 open/close in 0.02 seconds: 82.95 ops/second
total: 2 open/close in 0.01 seconds: 164.42 ops/second
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:45:39 (1713487539) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:45:53 (1713487553) targets are mounted
20:45:53 (1713487553) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 22a (45s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22b: c1 lfs mkdir -i 1 d1, M1 drop reply & fail M0/M1, c2 mkdir d1/dir ========================================================== 20:46:01 (1713487561)
fail_loc=0x119
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
Failing mds2 on oleg440-server
Stopping /mnt/lustre-mds2 (opts:) on oleg440-server
20:46:05 (1713487565) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Failover mds2 to oleg440-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
20:46:31 (1713487591) targets are mounted
20:46:31 (1713487591) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid
in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1996 1285692 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 88.62 ops/second total: 2 open/close in 0.01 seconds: 170.08 ops/second Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:46:40 (1713487600) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:46:54 (1713487614) targets are mounted 20:46:54 (1713487614) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22b (60s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22c: c1 lfs mkdir -i 1 d1, M1 drop update & fail M1, c2 mkdir d1/dir ========================================================== 20:47:03 (1713487623) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:47:06 (1713487626) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:47:18 (1713487638) targets are mounted 20:47:18 (1713487638) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) 
mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1996 1285692 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 91.63 ops/second total: 2 open/close in 0.01 seconds: 173.97 ops/second Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:47:27 (1713487647) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:47:41 (1713487661) targets are mounted 20:47:41 (1713487661) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22c (45s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22d: c1 lfs mkdir -i 1 d1, M1 drop update & fail M0/M1,c2 mkdir d1/dir ========================================================== 20:47:50 (1713487670) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server Failing mds2 on oleg440-server Stopping /mnt/lustre-mds2 (opts:) on oleg440-server 20:48:01 (1713487681) shut down Failover mds1 to oleg440-server mount facets: mds1 Failover mds2 to oleg440-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: 
oleg440-server.virtnet: executing set_default_debug -1 all oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 20:48:22 (1713487702) targets are mounted 20:48:22 (1713487702) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1960 1285728 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1784 1285904 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 86.12 ops/second total: 2 open/close in 0.01 seconds: 205.63 ops/second Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:48:30 (1713487710) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:48:43 (1713487723) targets are mounted 20:48:43 (1713487723) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22d (61s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23a: c1 rmdir d1, M1 drop reply and fail, client2 mkdir d1 
========================================================== 20:48:52 (1713487732) fail_loc=0x119 fail_loc=0 Failing mds2 on oleg440-server Stopping /mnt/lustre-mds2 (opts:) on oleg440-server 20:49:00 (1713487740) shut down Failover mds2 to oleg440-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0001 20:49:13 (1713487753) targets are mounted 20:49:13 (1713487753) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.01 seconds: 137.44 ops/second Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:49:22 (1713487762) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:49:36 (1713487776) targets are mounted 20:49:36 (1713487776) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23a (51s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23b: c1 rmdir d1, M1 drop reply and fail 
M0/M1, c2 mkdir d1 ========================================================== 20:49:45 (1713487785) fail_loc=0x119 fail_loc=0 Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server Failing mds2 on oleg440-server Stopping /mnt/lustre-mds2 (opts:) on oleg440-server 20:49:54 (1713487794) shut down Failover mds1 to oleg440-server mount facets: mds1 Failover mds2 to oleg440-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 20:50:14 (1713487814) targets are mounted 20:50:14 (1713487814) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 130.14 ops/second Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:50:24 (1713487824) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all 
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:50:38 (1713487838) targets are mounted 20:50:38 (1713487838) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23b (60s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23c: c1 rmdir d1, M0 drop update reply and fail M0, c2 mkdir d1 ========================================================== 20:50:47 (1713487847) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:50:50 (1713487850) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:51:02 (1713487862) targets are mounted 20:51:02 (1713487862) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 116.90 ops/second Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:51:11 (1713487871) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing 
set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:51:25 (1713487885) targets are mounted 20:51:25 (1713487885) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23c (45s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23d: c1 rmdir d1, M0 drop update reply and fail M0/M1, c2 mkdir d1 ========================================================== 20:51:34 (1713487894) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server Failing mds2 on oleg440-server Stopping /mnt/lustre-mds2 (opts:) on oleg440-server 20:51:46 (1713487906) shut down Failover mds1 to oleg440-server mount facets: mds1 Failover mds2 to oleg440-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 20:52:06 (1713487926) targets are mounted 20:52:06 (1713487926) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1956 1285732 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1780 1285908 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 
3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 107.39 ops/second Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 20:52:14 (1713487934) shut down Failover mds1 to oleg440-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:52:28 (1713487948) targets are mounted 20:52:28 (1713487948) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23d (61s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 24: reconstruct on non-existing object ========================================================== 20:52:37 (1713487957) fail_loc=0x119 fail_loc=0 truncate: cannot truncate '/mnt/lustre/f24.replay-dual' to length 100: No such file or directory PASS 24 (87s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 25: replay|resend ==================== 20:54:05 (1713488045) 1+0 records in 1+0 records out 512 bytes (512 B) copied, 0.00281782 s, 182 kB/s fail_loc=0x304 fail_loc=0x80000325 Failing ost1 on oleg440-server Stopping /mnt/lustre-ost1 (opts:) on oleg440-server 20:54:08 (1713488048) shut down Failover ost1 to oleg440-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-OST0000 20:54:21 (1713488061) 
targets are mounted 20:54:21 (1713488061) facet_failover done oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec /home/green/git/lustre-release/lustre/tests/test-framework.sh: line 6515: 4675 Terminated LUSTRE="/home/green/git/lustre-release/lustre" bash -c "multiop /mnt/lustre2/f25.replay-dual Ow512" fail_loc=0 PASS 25 (21s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 26: dbench and tar with mds failover ========================================================== 20:54:27 (1713488067) Starting client oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre Started clients oleg440-client.virtnet: 192.168.204.140@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project) Started tar loop with pid 6297 Started dbench loop with 6298 striped dir -i0 -c2 -H all_char /mnt/lustre2/d26.replay-dual/run_dbench striped dir -i0 -c2 -H crush2 /mnt/lustre/d26.replay-dual/run_tar looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Thu Apr 18 20:54:29 EDT 2024 waiting for dbench pid 6339 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs failed to create barrier semaphore 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2272 1285416 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2080 1285608 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 26120 3555840 1% /mnt/lustre[OST:0] 
lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 27644 7161336 1% /mnt/lustre 1 231 8.88 MB/sec warmup 1 sec latency 20.416 ms 1 481 8.38 MB/sec warmup 2 sec latency 21.181 ms 1 711 7.07 MB/sec warmup 3 sec latency 288.297 ms 1 985 5.48 MB/sec warmup 4 sec latency 14.681 ms test_26 fail mds1 1 times 1 1389 5.20 MB/sec warmup 5 sec latency 15.434 ms Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 1 1555 4.35 MB/sec warmup 6 sec latency 416.545 ms 20:54:35 (1713488075) shut down 1 1555 3.73 MB/sec warmup 7 sec latency 1416.771 ms 1 1555 3.26 MB/sec warmup 8 sec latency 2416.999 ms 1 1555 2.90 MB/sec warmup 9 sec latency 3417.207 ms 1 1555 2.61 MB/sec warmup 10 sec latency 4417.415 ms 1 1555 2.37 MB/sec warmup 11 sec latency 5417.587 ms 1 1555 2.17 MB/sec warmup 12 sec latency 6417.734 ms 1 1555 2.01 MB/sec warmup 13 sec latency 7417.896 ms 1 1555 1.86 MB/sec warmup 14 sec latency 8418.078 ms 1 1555 1.74 MB/sec warmup 15 sec latency 9418.220 ms 1 1555 1.63 MB/sec warmup 16 sec latency 10418.417 ms Failover mds1 to oleg440-server mount facets: mds1 1 1555 1.54 MB/sec warmup 17 sec latency 11418.625 ms 1 1555 1.45 MB/sec warmup 18 sec latency 12418.794 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 1555 1.37 MB/sec warmup 19 sec latency 13418.928 ms oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:54:49 (1713488089) targets are mounted 20:54:49 (1713488089) facet_failover done 1 1555 0.00 MB/sec execute 1 sec latency 15419.244 ms 1 1555 0.00 MB/sec execute 2 sec latency 16419.430 ms 1 1555 0.00 MB/sec execute 3 sec latency 17419.630 ms 1 1555 0.00 MB/sec execute 4 sec latency 18419.771 ms oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state 
after 0 sec 1 1791 0.04 MB/sec execute 5 sec latency 18722.178 ms 1 2115 0.10 MB/sec execute 6 sec latency 12.792 ms 1 2389 0.27 MB/sec execute 7 sec latency 47.613 ms 1 2797 0.73 MB/sec execute 8 sec latency 21.566 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2956 1284732 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2592 1285096 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 31508 3565300 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 6220 3586732 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 37728 7152032 1% /mnt/lustre 1 3354 1.03 MB/sec execute 9 sec latency 14.691 ms 1 3763 1.37 MB/sec execute 10 sec latency 13.228 ms 1 3979 1.27 MB/sec execute 11 sec latency 15.381 ms 1 4271 1.19 MB/sec execute 12 sec latency 13.112 ms test_26 fail mds2 2 times 1 4552 1.15 MB/sec execute 13 sec latency 14.193 ms Failing mds2 on oleg440-server Stopping /mnt/lustre-mds2 (opts:) on oleg440-server 1 4740 1.15 MB/sec execute 14 sec latency 423.672 ms 20:55:03 (1713488103) shut down 1 4740 1.07 MB/sec execute 15 sec latency 1423.786 ms 1 4740 1.00 MB/sec execute 16 sec latency 2423.883 ms 1 4740 0.94 MB/sec execute 17 sec latency 3424.003 ms 1 4740 0.89 MB/sec execute 18 sec latency 4424.116 ms 1 4740 0.84 MB/sec execute 19 sec latency 5424.214 ms 1 4740 0.80 MB/sec execute 20 sec latency 6424.325 ms 1 4740 0.76 MB/sec execute 21 sec latency 7424.433 ms 1 4740 0.73 MB/sec execute 22 sec latency 8424.550 ms 1 4740 0.70 MB/sec execute 23 sec latency 9424.680 ms 1 4740 0.67 MB/sec execute 24 sec latency 10424.767 ms Failover mds2 to oleg440-server mount facets: mds2 1 4740 0.64 MB/sec execute 25 sec latency 11424.880 ms 1 4740 0.62 MB/sec execute 26 sec latency 12425.001 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 4740 0.59 MB/sec execute 27 sec latency 13425.119 ms oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 1 
4740 0.57 MB/sec execute 28 sec latency 14425.242 ms Started lustre-MDT0001 20:55:17 (1713488117) targets are mounted 20:55:17 (1713488117) facet_failover done 1 4740 0.55 MB/sec execute 29 sec latency 15425.378 ms 1 4740 0.53 MB/sec execute 30 sec latency 16425.616 ms 1 4740 0.52 MB/sec execute 31 sec latency 17425.886 ms 1 4740 0.50 MB/sec execute 32 sec latency 18426.100 ms oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 5089 0.58 MB/sec execute 33 sec latency 18453.583 ms 1 5375 0.57 MB/sec execute 34 sec latency 17.013 ms 1 5743 0.56 MB/sec execute 35 sec latency 10.975 ms 1 6109 0.61 MB/sec execute 36 sec latency 42.998 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4532 1283156 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2428 1285260 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 39680 3557704 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 10544 3582580 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 50224 7140284 1% /mnt/lustre 1 6578 0.72 MB/sec execute 37 sec latency 20.814 ms 1 7164 0.83 MB/sec execute 38 sec latency 15.455 ms 1 7427 0.84 MB/sec execute 39 sec latency 18.401 ms 1 7691 0.83 MB/sec execute 40 sec latency 14.724 ms test_26 fail mds1 3 times Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 1 8002 0.82 MB/sec execute 41 sec latency 16.760 ms 1 8106 0.81 MB/sec execute 42 sec latency 561.668 ms 1 8106 0.79 MB/sec execute 43 sec latency 1561.831 ms 1 8106 0.77 MB/sec execute 44 sec latency 2561.994 ms 1 8106 0.76 MB/sec execute 45 sec latency 3562.162 ms 1 8106 0.74 MB/sec execute 46 sec latency 4562.322 ms 1 8106 0.72 MB/sec execute 47 sec latency 5562.485 ms 20:55:36 (1713488136) shut down 1 8106 0.71 MB/sec execute 48 sec latency 6562.664 ms 1 8106 0.69 MB/sec execute 49 sec latency 7562.820 ms 1 8106 0.68 MB/sec execute 50 sec latency 
8562.942 ms 1 8106 0.67 MB/sec execute 51 sec latency 9563.090 ms 1 8106 0.65 MB/sec execute 52 sec latency 10563.189 ms 1 8106 0.64 MB/sec execute 53 sec latency 11563.333 ms 1 8106 0.63 MB/sec execute 54 sec latency 12563.498 ms 1 8106 0.62 MB/sec execute 55 sec latency 13563.689 ms 1 8106 0.61 MB/sec execute 56 sec latency 14563.887 ms 1 8106 0.60 MB/sec execute 57 sec latency 15564.128 ms Failover mds1 to oleg440-server mount facets: mds1 1 8106 0.59 MB/sec execute 58 sec latency 16564.407 ms 1 8106 0.58 MB/sec execute 59 sec latency 17564.508 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 8106 0.57 MB/sec execute 60 sec latency 18564.610 ms oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:55:50 (1713488150) targets are mounted 20:55:50 (1713488150) facet_failover done 1 8106 0.56 MB/sec execute 61 sec latency 19564.773 ms 1 8106 0.55 MB/sec execute 62 sec latency 20564.900 ms 1 8106 0.54 MB/sec execute 63 sec latency 21565.017 ms 1 8106 0.53 MB/sec execute 64 sec latency 22565.138 ms 1 8106 0.52 MB/sec execute 65 sec latency 23565.260 ms oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 8243 0.53 MB/sec execute 66 sec latency 24160.617 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 8601 0.57 MB/sec execute 67 sec latency 15.593 ms 1 8895 0.57 MB/sec execute 68 sec latency 16.292 ms 1 9254 0.56 MB/sec execute 69 sec latency 12.473 ms 1 9565 0.58 MB/sec execute 70 sec latency 15.572 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 5108 1282580 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3096 1284592 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 52048 3548496 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 20292 3574380 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 72340 7122876 2% 
/mnt/lustre 1 10068 0.64 MB/sec execute 71 sec latency 17.827 ms 1 10611 0.70 MB/sec execute 72 sec latency 16.070 ms 1 10911 0.71 MB/sec execute 73 sec latency 14.296 ms 1 11136 0.71 MB/sec execute 74 sec latency 15.704 ms test_26 fail mds2 4 times Failing mds2 on oleg440-server Stopping /mnt/lustre-mds2 (opts:) on oleg440-server 1 11407 0.70 MB/sec execute 75 sec latency 64.403 ms 20:56:05 (1713488165) shut down 1 11407 0.69 MB/sec execute 76 sec latency 1064.620 ms 1 11407 0.68 MB/sec execute 77 sec latency 2064.814 ms 1 11407 0.67 MB/sec execute 78 sec latency 3065.013 ms 1 11407 0.67 MB/sec execute 79 sec latency 4065.212 ms 1 11407 0.66 MB/sec execute 80 sec latency 5065.379 ms 1 11407 0.65 MB/sec execute 81 sec latency 6065.570 ms 1 11407 0.64 MB/sec execute 82 sec latency 7065.757 ms 1 11407 0.63 MB/sec execute 83 sec latency 8065.936 ms 1 11407 0.63 MB/sec execute 84 sec latency 9066.127 ms 1 11407 0.62 MB/sec execute 85 sec latency 10066.315 ms Failover mds2 to oleg440-server mount facets: mds2 1 11407 0.61 MB/sec execute 86 sec latency 11066.478 ms 1 11407 0.60 MB/sec execute 87 sec latency 12066.617 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 11407 0.60 MB/sec execute 88 sec latency 13066.766 ms oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 1 11407 0.59 MB/sec execute 89 sec latency 14066.879 ms Started lustre-MDT0001 20:56:18 (1713488178) targets are mounted 20:56:18 (1713488178) facet_failover done 1 11407 0.58 MB/sec execute 90 sec latency 15067.055 ms 1 11407 0.58 MB/sec execute 91 sec latency 16067.241 ms 1 11407 0.57 MB/sec execute 92 sec latency 17067.429 ms 1 11407 0.57 MB/sec execute 93 sec latency 18067.616 ms 1 11439 0.56 MB/sec execute 94 sec latency 18958.758 ms oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in 
FULL state after 0 sec
1 11720 0.57 MB/sec execute 95 sec latency 16.983 ms
1 12115 0.60 MB/sec execute 96 sec latency 14.325 ms
1 12386 0.59 MB/sec execute 97 sec latency 24.502 ms
1 12737 0.59 MB/sec execute 98 sec latency 13.212 ms
UUID                 1K-blocks   Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116   5272    1282416    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116   3284    1284404    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  53912    3549212    2%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  24600    3578684    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  78512    7127896    2%  /mnt/lustre
1 13083 0.60 MB/sec execute 99 sec latency 51.511 ms
1 cleanup 100 sec
0 cleanup 101 sec
Operation   Count   AvgLat     MaxLat
----------------------------------------
NTCreateX    2055   29.429  24160.604
Close        1530   13.674  18453.565
Rename         86   11.684     17.803
Unlink        392    4.602     18.504
Qpathinfo    1911    2.271     16.891
Qfileinfo     336    0.454      2.409
Qfsinfo       325    0.900      4.966
Sfileinfo     166    6.322     16.469
Find          717   28.451  18722.160
WriteX       1016    2.131     16.889
ReadX        3191    0.116     43.924
LockX           6    1.472      1.647
UnlockX         6    1.511      1.650
Flush         139    9.669    281.869
Throughput 0.599552 MB/sec 1 clients 1 procs max_latency=24160.617 ms
stopping dbench on /mnt/lustre at Thu Apr 18 20:56:30 EDT 2024 with return code 0
clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished
striped dir -i0 -c2 -H crush /mnt/lustre2/d26.replay-dual/run_dbench
looking for dbench program /usr/bin/dbench
found dbench client file /usr/share/dbench/client.txt
'/usr/share/dbench/client.txt' -> 'client.txt'
running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Thu Apr 18 20:56:31 EDT 2024
waiting for dbench pid 10906
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs
0 of 1 processes prepared for launch 0 sec
test_26 fail mds1 5 times
1 of 1 processes prepared for launch 0 sec
releasing clients
Failing mds1 on oleg440-server Stopping /mnt/lustre-mds1 (opts:) on oleg440-server 1 232 8.88 MB/sec warmup 1 sec latency 19.664 ms 1 339 6.13 MB/sec warmup 2 sec latency 581.890 ms 1 339 4.09 MB/sec warmup 3 sec latency 1582.074 ms 1 339 3.07 MB/sec warmup 4 sec latency 2582.270 ms 1 339 2.45 MB/sec warmup 5 sec latency 3582.444 ms 1 339 2.04 MB/sec warmup 6 sec latency 4582.615 ms 20:56:38 (1713488198) shut down 1 339 1.75 MB/sec warmup 7 sec latency 5582.767 ms 1 339 1.53 MB/sec warmup 8 sec latency 6582.904 ms 1 339 1.36 MB/sec warmup 9 sec latency 7583.031 ms 1 339 1.23 MB/sec warmup 10 sec latency 8583.195 ms 1 339 1.11 MB/sec warmup 11 sec latency 9583.383 ms 1 339 1.02 MB/sec warmup 12 sec latency 10583.574 ms 1 339 0.94 MB/sec warmup 13 sec latency 11583.749 ms 1 339 0.88 MB/sec warmup 14 sec latency 12583.907 ms 1 339 0.82 MB/sec warmup 15 sec latency 13584.078 ms 1 339 0.77 MB/sec warmup 16 sec latency 14584.290 ms Failover mds1 to oleg440-server mount facets: mds1 1 339 0.72 MB/sec warmup 17 sec latency 15584.465 ms 1 339 0.68 MB/sec warmup 18 sec latency 16584.585 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 339 0.65 MB/sec warmup 19 sec latency 17584.742 ms oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1 Started lustre-MDT0000 20:56:52 (1713488212) targets are mounted 20:56:52 (1713488212) facet_failover done 1 339 0.00 MB/sec execute 1 sec latency 19585.123 ms 1 339 0.00 MB/sec execute 2 sec latency 20585.260 ms 1 339 0.00 MB/sec execute 3 sec latency 21585.449 ms 1 339 0.00 MB/sec execute 4 sec latency 22585.600 ms 1 339 0.00 MB/sec execute 5 sec latency 23585.751 ms oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 618 1.45 MB/sec execute 6 sec latency 23638.501 ms 1 902 1.35 MB/sec execute 7 sec 
latency 20.161 ms
1 1188 1.35 MB/sec execute 8 sec latency 19.851 ms
1 1572 1.54 MB/sec execute 9 sec latency 31.019 ms
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        4296     1283392   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2508     1285180   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116       31728     3564072   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116       12160     3584912   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232       43888     7148984   1% /mnt/lustre
1 1898 1.41 MB/sec execute 10 sec latency 23.042 ms
1 2269 1.33 MB/sec execute 11 sec latency 13.565 ms
1 2607 1.40 MB/sec execute 12 sec latency 15.743 ms
1 3305 1.78 MB/sec execute 13 sec latency 13.686 ms
test_26 fail mds2 6 times
Failing mds2 on oleg440-server
Stopping /mnt/lustre-mds2 (opts:) on oleg440-server
1 3760 1.96 MB/sec execute 14 sec latency 10.544 ms
20:57:06 (1713488226) shut down
1 3816 1.83 MB/sec execute 15 sec latency 769.339 ms
1 3816 1.72 MB/sec execute 16 sec latency 1769.498 ms
1 3816 1.62 MB/sec execute 17 sec latency 2769.650 ms
1 3816 1.53 MB/sec execute 18 sec latency 3769.772 ms
1 3816 1.45 MB/sec execute 19 sec latency 4769.952 ms
1 3816 1.37 MB/sec execute 20 sec latency 5770.266 ms
1 3816 1.31 MB/sec execute 21 sec latency 6770.399 ms
1 3816 1.25 MB/sec execute 22 sec latency 7770.510 ms
1 3816 1.20 MB/sec execute 23 sec latency 8770.611 ms
1 3816 1.15 MB/sec execute 24 sec latency 9770.781 ms
Failover mds2 to oleg440-server
mount facets: mds2
1 3816 1.10 MB/sec execute 25 sec latency 10770.904 ms
1 3816 1.06 MB/sec execute 26 sec latency 11771.067 ms
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
1 3816 1.02 MB/sec execute 27 sec latency 12771.184 ms
1 3816 0.98 MB/sec execute 28 sec latency 13771.404 ms
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0001
20:57:20 (1713488240) targets are mounted
20:57:20 (1713488240) facet_failover
done
1 3816 0.95 MB/sec execute 29 sec latency 14771.596 ms
1 3816 0.92 MB/sec execute 30 sec latency 15771.788 ms
1 3816 0.89 MB/sec execute 31 sec latency 16771.943 ms
1 3816 0.86 MB/sec execute 32 sec latency 17772.108 ms
1 3844 0.83 MB/sec execute 33 sec latency 18643.796 ms
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
1 4064 0.82 MB/sec execute 34 sec latency 14.589 ms
striped dir -i0 -c2 -H all_char /mnt/lustre/d26.replay-dual/run_tar
1 4311 0.80 MB/sec execute 35 sec latency 29.807 ms
1 4557 0.80 MB/sec execute 36 sec latency 19.965 ms
1 4942 0.89 MB/sec execute 37 sec latency 16.414 ms
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2644     1285044   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2244     1285444   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116       33196     3569056   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        8776     3593932   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232       41972     7162988   1% /mnt/lustre
1 5190 0.87 MB/sec execute 38 sec latency 15.203 ms
1 5327 0.85 MB/sec execute 39 sec latency 416.680 ms
1 5664 0.84 MB/sec execute 40 sec latency 16.687 ms
1 6012 0.85 MB/sec execute 41 sec latency 13.821 ms
test_26 fail mds1 7 times
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
1 6542 0.96 MB/sec execute 42 sec latency 14.781 ms
20:57:34 (1713488254) shut down
1 7113 1.05 MB/sec execute 43 sec latency 16.419 ms
1 7402 1.06 MB/sec execute 44 sec latency 15.324 ms
1 7627 1.04 MB/sec execute 45 sec latency 16.439 ms
1 7627 1.02 MB/sec execute 46 sec latency 1016.570 ms
1 7627 1.00 MB/sec execute 47 sec latency 2016.733 ms
1 7627 0.98 MB/sec execute 48 sec latency 3016.882 ms
1 7627 0.96 MB/sec execute 49 sec latency 4017.007 ms
1 7627 0.94 MB/sec execute 50 sec latency 5017.178 ms
1 7627 0.92 MB/sec execute 51 sec latency 6017.343 ms
1 7627 0.90 MB/sec execute
52 sec latency 7017.512 ms
Failover mds1 to oleg440-server
mount facets: mds1
1 7627 0.89 MB/sec execute 53 sec latency 8017.664 ms
1 7627 0.87 MB/sec execute 54 sec latency 9017.789 ms
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
1 7627 0.85 MB/sec execute 55 sec latency 10017.914 ms
1 7627 0.84 MB/sec execute 56 sec latency 11018.044 ms
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
20:57:48 (1713488268) targets are mounted
20:57:48 (1713488268) facet_failover done
1 7627 0.82 MB/sec execute 57 sec latency 12018.180 ms
1 7627 0.81 MB/sec execute 58 sec latency 13018.312 ms
1 7627 0.80 MB/sec execute 59 sec latency 14018.463 ms
1 7627 0.78 MB/sec execute 60 sec latency 15018.607 ms
1 7627 0.77 MB/sec execute 61 sec latency 16018.734 ms
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
1 7921 0.77 MB/sec execute 62 sec latency 16050.761 ms
1 8241 0.78 MB/sec execute 63 sec latency 20.945 ms
tar: Unexpected EOF in archive
tar: Unexpected EOF in archive
1 8621 0.81 MB/sec execute 64 sec latency 14.083 ms
1 8916 0.80 MB/sec execute 65 sec latency 21.713 ms
tar: Error is not recoverable: exiting now
dbench killed by signal 15
stopping dbench on /mnt/lustre at Thu Apr 18 20:57:57 EDT 2024 with return code 0
10906 pts/0 S+ 0:00 dbench -c client.txt 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100
killed dbench main pid 10906
clean dbench files on /mnt/lustre
/mnt/lustre /mnt/lustre
removed 'client.txt'
/mnt/lustre
dbench successfully finished
PASS 26 (216s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 28: lock replay should be ordered: waiting after granted ========================================================== 20:58:04 (1713488284)
1+0
records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00309162 s, 1.3 MB/s
fail_loc=0x80000324
fail_loc=0x32a
Failing ost1 on oleg440-server
Stopping /mnt/lustre-ost1 (opts:) on oleg440-server
20:58:10 (1713488290) shut down
Failover ost1 to oleg440-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00309513 s, 1.3 MB/s
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-OST0000
20:58:23 (1713488303) targets are mounted
20:58:23 (1713488303) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 28 (26s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 29: replay vs update with the same xid ========================================================== 20:58:31 (1713488311)
SKIP: replay-dual test_29 needs >= 2 clients
SKIP 29 (1s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 30: layout lock replay is not blocked on IO ========================================================== 20:58:34 (1713488314)
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.00735772 s, 5.6 MB/s
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.00894114 s, 4.6 MB/s
fail_loc=0x32e
fail_val=4
Failing mds1 on oleg440-server
Stopping /mnt/lustre-mds1 (opts:) on oleg440-server
20:58:36 (1713488316) shut down
Failover mds1 to oleg440-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with
exit code 1
Started lustre-MDT0000
20:58:49 (1713488329) targets are mounted
20:58:49 (1713488329) facet_failover done
160+0 records in
160+0 records out
81920 bytes (82 kB) copied, 20.1222 s, 4.1 kB/s
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 30 (23s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 31: deadlock on file_remove_privs and occupied mod rpc slots ========================================================== 20:58:59 (1713488339)
Failing ost1 on oleg440-server
Stopping /mnt/lustre-ost1 (opts:) on oleg440-server
20:59:01 (1713488341) shut down
Creating to objid 3201 on ost lustre-OST0000...
total: 32 open/close in 0.16 seconds: 200.85 ops/second
at_max=0
fail_loc=0x80001420
file /mnt/lustre2/d31.replay-dual/mdtdir/f31.replay-dual is ready
Failover ost1 to oleg440-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-OST0000
20:59:14 (1713488354) targets are mounted
20:59:14 (1713488354) facet_failover done
oleg440-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in IDLE FULL state after 0 sec
pids: 18753 18754 18759 18760 18761 18762 18763 18764 18765
at_max=600
PASS 31 (20s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 32: gap in update llog shouldn't break recovery ========================================================== 20:59:20 (1713488360)
fail_loc=0x0000131d
fail_val=10
fail_loc=0x726
Stopping /mnt/lustre-mds2 (opts:) on oleg440-server
Stopping
/mnt/lustre-mds1 (opts:) on oleg440-server
fail_loc=0
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0000
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0001
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2488     1285200   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2036     1285652   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1548     3605472   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3072     7210968   1% /mnt/lustre
PASS 32 (14s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 33: Check for OBD_INCOMPAT_MULTI_RPCS in last_rcvd after abort_recovery ========================================================== 20:59:35 (1713488375)
at_min=60
Stopping /mnt/lustre-mds2 (opts:) on oleg440-server
last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0001
oleg440-client.virtnet: executing wait_import_state_mount REPLAY_WAIT mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in REPLAY_WAIT state after 0 sec
oleg440-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
affected facets: mds2
oleg440-server: oleg440-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475
oleg440-server: *.lustre-MDT0001.recovery_status status: COMPLETE
Stopping /mnt/lustre-mds2 (opts:) on oleg440-server
last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg440-server: oleg440-server.virtnet: executing set_default_debug -1 all
pdsh@oleg440-client: oleg440-server: ssh exited with exit code 1
Started lustre-MDT0001
Starting client: oleg440-client.virtnet: -o user_xattr,flock oleg440-server@tcp:/lustre /mnt/lustre2
oleg440-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL DISCONN state after 4 sec
affected facets: mds2
oleg440-server: oleg440-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475
oleg440-server: *.lustre-MDT0001.recovery_status status: COMPLETE
at_min=5
PASS 33 (35s)
debug_raw_pointers=0
debug_raw_pointers=0
== replay-dual test complete, duration 2627 sec ========== 21:00:11 (1713488411)
=== replay-dual: start cleanup 21:00:11 (1713488411) ===
Stopping clients: oleg440-client.virtnet /mnt/lustre2 (opts:)
Stopping client oleg440-client.virtnet /mnt/lustre2 opts:
=== replay-dual: finish cleanup 21:00:12 (1713488412) ===