-----============= acceptance-small: replay-dual ============----- Tue Apr 16 17:20:49 EDT 2024
excepting tests: 14b 21b
skipping tests SLOW=no: 21b
=== replay-dual: start setup 17:20:53 (1713302453) ===
Starting client oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
Started clients oleg304-client.virtnet:
192.168.203.104@tcp:/lustre on /mnt/lustre2 type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project)
oleg304-client.virtnet: executing check_config_client /mnt/lustre
oleg304-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg304-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff8800aa526800.idle_timeout=debug
osc.lustre-OST0000-osc-ffff8800b5d60000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8800aa526800.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8800b5d60000.idle_timeout=debug
disable quota as required
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
osd-ldiskfs.track_declares_assert=1
=== replay-dual: finish setup 17:21:00 (1713302460) ===

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 0a: expired recovery with lost client ========================================================== 17:21:01 (1713302461)
Check file is LU482_FAILED=/tmp/replay-dual.lu482.SfCXox
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
total: 50 open/close in 0.38 seconds: 133.08 ops/second
fail_loc=0x80000514
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:21:05 (1713302465) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:21:18 (1713302478) targets are mounted
17:21:18 (1713302478) facet_failover done
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
- unlinked 0 (time 1713302558 ; total 0 ; last 0)
total: 50 unlinks in 1 seconds: 50.000000 unlinks/second
PASS 0a (99s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 0b: lost client during waiting for next transno ========================================================== 17:22:41 (1713302561)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:22:44 (1713302564) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:22:58 (1713302578) targets are mounted
17:22:58 (1713302578) facet_failover done
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
PASS 0b (94s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 1: |X| simple create ================= 17:24:16 (1713302656)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:24:19 (1713302659) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:24:33 (1713302673) targets are mounted
17:24:33 (1713302673) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 1 (24s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 2: |X| mkdir adir ==================== 17:24:42 (1713302682)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:24:45 (1713302685) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:24:58 (1713302698) targets are mounted
17:24:58 (1713302698) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 2 (24s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 3: |X| mkdir adir, mkdir adir/bdir === 17:25:07 (1713302707)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1768     1285920   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1612     1286076   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:25:11 (1713302711) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:25:25 (1713302725) targets are mounted
17:25:25 (1713302725) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 3 (25s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 4: |X| mkdir adir (-EEXIST), mkdir adir/bdir ========================================================== 17:25:33 (1713302733)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1916     1285772   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
mkdir: cannot create directory '/mnt/lustre/adir': File exists
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:25:37 (1713302737) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:25:50 (1713302750) targets are mounted
17:25:50 (1713302750) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 4 (24s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 5: open, unlink |X| close ============ 17:25:59 (1713302759)
multiop /mnt/lustre2/a vo_tSc
TMPPIPE=/tmp/multiop_open_wait_pipe.7524
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:26:02 (1713302762) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:26:16 (1713302776) targets are mounted
17:26:16 (1713302776) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 5 (24s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 6: open1, open2, unlink |X| close1 [fail mds1] close2 ========================================================== 17:26:25 (1713302785)
multiop /mnt/lustre2/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7524
multiop /mnt/lustre/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7524
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:26:28 (1713302788) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:26:42 (1713302802) targets are mounted
17:26:42 (1713302802) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 6 (24s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 8: replay of resent request ========== 17:26:50 (1713302810)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x119
fail_loc=0
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:27:10 (1713302830) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:27:24 (1713302844) targets are mounted
17:27:24 (1713302844) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 8 (41s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 9: resending a replayed create ======= 17:27:33 (1713302853)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:27:36 (1713302856) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:27:50 (1713302870) targets are mounted
17:27:50 (1713302870) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 9 (42s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 10: resending a replayed unlink ====== 17:28:15 (1713302895)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:28:19 (1713302899) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:28:32 (1713302912) targets are mounted
17:28:32 (1713302912) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 10 (38s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 11: both clients timeout during replay ========================================================== 17:28:55 (1713302935)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x0119
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:28:58 (1713302938) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:29:12 (1713302952) targets are mounted
17:29:12 (1713302952) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 17 sec
fail_loc=0
PASS 11 (37s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 12: open resend timeout ============== 17:29:33 (1713302973)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
multiop /mnt/lustre/f12.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7524
fail_loc=0x80000302
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:29:37 (1713302977) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:29:51 (1713302991) targets are mounted
17:29:51 (1713302991) facet_failover done
fail_loc=0
/mnt/lustre/f12.replay-dual
/mnt/lustre/f12.replay-dual has type file OK
PASS 12 (22s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 13: close resend timeout ============= 17:29:57 (1713302997)
multiop /mnt/lustre/f13.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7524
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
fail_loc=0x80000115
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:30:00 (1713303000) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:30:14 (1713303014) targets are mounted
17:30:14 (1713303014) facet_failover done
fail_loc=0
/mnt/lustre/f13.replay-dual
/mnt/lustre/f13.replay-dual has type file OK
PASS 13 (22s)
debug_raw_pointers=0
debug_raw_pointers=0

SKIP: replay-dual test_14b skipping ALWAYS excluded test 14b
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 15a: timeout waiting for lost client during replay, 1 client completes ========================================================== 17:30:21 (1713303021)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
total: 25 open/close in 0.18 seconds: 141.93 ops/second
total: 1 open/close in 0.01 seconds: 136.89 ops/second
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:30:24 (1713303024) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:30:39 (1713303039) targets are mounted
17:30:39 (1713303039) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713303111 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
PASS 15a (92s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 15c: remove multiple OST orphans ===== 17:31:54 (1713303114)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:32:21 (1713303141) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:32:36 (1713303156) targets are mounted
17:32:36 (1713303156) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
PASS 15c (115s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 16: fail MDS during recovery (3571) == 17:33:51 (1713303231)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1912     1285776   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
total: 25 open/close in 0.18 seconds: 141.83 ops/second
total: 1 open/close in 0.01 seconds: 141.60 ops/second
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:33:55 (1713303235) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:34:09 (1713303249) targets are mounted
17:34:09 (1713303249) facet_failover done
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:34:30 (1713303270) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:34:43 (1713303283) targets are mounted
17:34:43 (1713303283) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713303356 ; total 0 ; last 0)
total: 25 unlinks in 1 seconds: 25.000000 unlinks/second
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
PASS 16 (127s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 17: fail OST during recovery (3571) == 17:36:00 (1713303360)
total: 25 open/close in 0.18 seconds: 135.72 ops/second
total: 1 open/close in 0.01 seconds: 75.15 ops/second
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1984     1285704   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3048     7210992   1% /mnt/lustre
Failing ost1 on oleg304-server
Stopping /mnt/lustre-ost1 (opts:) on oleg304-server
17:36:03 (1713303363) shut down
Failover ost1 to oleg304-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-OST0000
17:36:17 (1713303377) targets are mounted
17:36:17 (1713303377) facet_failover done
Failing ost1 on oleg304-server
Stopping /mnt/lustre-ost1 (opts:) on oleg304-server
17:36:38 (1713303398) shut down
Failover ost1 to oleg304-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-OST0000
17:36:52 (1713303412) targets are mounted
17:36:52 (1713303412) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713303488 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
PASS 17 (130s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 18: ldlm_handle_enqueue succeeds on evicted export (3822) ========================================================== 17:38:11 (1713303491)
debug=+dlmtrace
fail_loc=0x8000030b
using seed 895344540
running for 500 iterations
total: 500 stats in 0 seconds: inf stats/second
ldlm.namespaces.MGC192.168.203.104@tcp.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800aa629800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800b6c9b800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800aa629800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800b6c9b800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff8800aa629800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff8800b6c9b800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff8800aa629800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff8800b6c9b800.early_lock_cancel=0
fail_loc=0x80000305
Error in opening file "/mnt/lustre2/d18.replay-dual/f18.replay-dual"(flags=O_RDONLY) 2: No such file or directory
ldlm.namespaces.MGC192.168.203.104@tcp.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800aa629800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800b6c9b800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800aa629800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800b6c9b800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff8800aa629800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff8800b6c9b800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff8800aa629800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff8800b6c9b800.early_lock_cancel=1
fail_loc=0
fail_loc=0
PASS 18 (45s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 19: resend of open request =========== 17:38:58 (1713303538)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1988     1285700   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
fail_loc=0x157
- open/close 0 (time 1713303626.75 total 86.02 last 0.00)
total: 1 open/close in 86.02 seconds: 0.01 ops/second
fail_loc=0
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:40:28 (1713303628) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:40:42 (1713303642) targets are mounted
17:40:42 (1713303642) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 19 (111s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 20: recovery time is not increasing == 17:40:51 (1713303651)
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1988     1285700   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:40:55 (1713303655) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:41:09 (1713303669) targets are mounted
17:41:09 (1713303669) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1988     1285700   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1796     1285892   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:43:38 (1713303818) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:43:53 (1713303833) targets are mounted
17:43:53 (1713303833) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
PASS 20 (327s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 21a: commit on sharing =============== 17:46:19 (1713303979)
mdt.lustre-MDT0000.commit_on_sharing=1
mdt.lustre-MDT0001.commit_on_sharing=1
Replay barrier on lustre-MDT0000
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:46:23 (1713303983) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:46:37 (1713303997) targets are mounted
17:46:37 (1713303997) facet_failover done
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
mdt.lustre-MDT0000.commit_on_sharing=0
mdt.lustre-MDT0001.commit_on_sharing=0
PASS 21a (159s)
debug_raw_pointers=0
debug_raw_pointers=0

SKIP: replay-dual test_21b skipping SLOW test 21b
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply & fail, c2 mkdir dir1/dir ========================================================== 17:49:00 (1713304140)
fail_loc=0x119
Failing mds2 on oleg304-server
Stopping /mnt/lustre-mds2 (opts:) on oleg304-server
17:49:02 (1713304142) shut down
Failover mds2 to oleg304-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0001
17:49:15 (1713304155) targets are mounted
17:49:15 (1713304155) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2000     1285688   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1824     1285864   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
total: 2 open/close in 0.03 seconds: 62.19 ops/second
total: 2 open/close in 0.01 seconds: 145.00 ops/second
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:49:25 (1713304165) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:49:39 (1713304179) targets are mounted
17:49:39 (1713304179) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 22a (47s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22b: c1 lfs mkdir -i 1 d1, M1 drop reply & fail M0/M1, c2 mkdir d1/dir ========================================================== 17:49:48 (1713304188)
fail_loc=0x119
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
Failing mds2 on oleg304-server
Stopping /mnt/lustre-mds2 (opts:) on oleg304-server
17:49:52 (1713304192) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Failover mds2 to oleg304-server
mount facets: mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
17:50:18 (1713304218) targets are mounted
17:50:18 (1713304218) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        1996     1285692   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        1820     1285868   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1540     3605480   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3064     7210976   1% /mnt/lustre
total: 2 open/close in 0.03 seconds: 67.70 ops/second
total: 2 open/close in 0.02 seconds: 125.01 ops/second
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:50:28 (1713304228) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:50:42 (1713304242) targets are mounted
17:50:42 (1713304242) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 22b (62s)
debug_raw_pointers=0
debug_raw_pointers=0

debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22c: c1 lfs mkdir -i 1 d1, M1 drop update & fail M1, c2 mkdir d1/dir ========================================================== 17:50:52 (1713304252)
fail_loc=0x1701
fail_loc=0
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
17:50:55 (1713304255) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
17:51:09 (1713304269) targets are mounted
17:51:09 (1713304269) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID
1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1996 1285692 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 98.97 ops/second total: 2 open/close in 0.01 seconds: 226.09 ops/second Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 17:51:17 (1713304277) shut down Failover mds1 to oleg304-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:51:32 (1713304292) targets are mounted 17:51:32 (1713304292) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22c (47s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22d: c1 lfs mkdir -i 1 d1, M1 drop update & fail M0/M1,c2 mkdir d1/dir ========================================================== 17:51:40 (1713304300) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server Failing mds2 on oleg304-server Stopping /mnt/lustre-mds2 (opts:) on oleg304-server 17:51:52 (1713304312) shut down Failover mds1 to oleg304-server mount facets: mds1 Failover mds2 to oleg304-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all oleg304-server: oleg304-server.virtnet: executing set_default_debug 
-1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 17:52:13 (1713304333) targets are mounted 17:52:13 (1713304333) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1960 1285728 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1784 1285904 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 92.81 ops/second total: 2 open/close in 0.01 seconds: 179.22 ops/second Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 17:52:22 (1713304342) shut down Failover mds1 to oleg304-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:52:36 (1713304356) targets are mounted 17:52:36 (1713304356) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22d (64s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23a: c1 rmdir d1, M1 drop reply and fail, client2 mkdir d1 ========================================================== 17:52:45 (1713304365) fail_loc=0x119 fail_loc=0 Failing mds2 on oleg304-server 
Stopping /mnt/lustre-mds2 (opts:) on oleg304-server 17:52:53 (1713304373) shut down Failover mds2 to oleg304-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0001 17:53:06 (1713304386) targets are mounted 17:53:06 (1713304386) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 125.03 ops/second Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 17:53:15 (1713304395) shut down Failover mds1 to oleg304-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:53:29 (1713304409) targets are mounted 17:53:29 (1713304409) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23a (52s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23b: c1 rmdir d1, M1 drop reply and fail M0/M1, c2 mkdir d1 ========================================================== 17:53:39 (1713304419) fail_loc=0x119 fail_loc=0 Failing mds1 
on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server Failing mds2 on oleg304-server Stopping /mnt/lustre-mds2 (opts:) on oleg304-server 17:53:48 (1713304428) shut down Failover mds1 to oleg304-server mount facets: mds1 Failover mds2 to oleg304-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 17:54:09 (1713304449) targets are mounted 17:54:09 (1713304449) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 130.25 ops/second Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 17:54:19 (1713304459) shut down Failover mds1 to oleg304-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:54:32 (1713304472) targets are mounted 17:54:32 (1713304472) 
facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23b (60s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23c: c1 rmdir d1, M0 drop update reply and fail M0, c2 mkdir d1 ========================================================== 17:54:41 (1713304481) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 17:54:45 (1713304485) shut down Failover mds1 to oleg304-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:54:58 (1713304498) targets are mounted 17:54:58 (1713304498) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 113.78 ops/second Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 17:55:06 (1713304506) shut down Failover mds1 to oleg304-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:55:20 (1713304520) targets are mounted 
17:55:20 (1713304520) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23c (47s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23d: c1 rmdir d1, M0 drop update reply and fail M0/M1, c2 mkdir d1 ========================================================== 17:55:30 (1713304530) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server Failing mds2 on oleg304-server Stopping /mnt/lustre-mds2 (opts:) on oleg304-server 17:55:41 (1713304541) shut down Failover mds1 to oleg304-server mount facets: mds1 Failover mds2 to oleg304-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 17:56:01 (1713304561) targets are mounted 17:56:01 (1713304561) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1956 1285732 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1780 1285908 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% 
/mnt/lustre total: 2 open/close in 0.03 seconds: 67.49 ops/second Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 17:56:12 (1713304572) shut down Failover mds1 to oleg304-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:56:26 (1713304586) targets are mounted 17:56:26 (1713304586) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23d (64s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 24: reconstruct on non-existing object ========================================================== 17:56:36 (1713304596) fail_loc=0x119 fail_loc=0 truncate: cannot truncate '/mnt/lustre/f24.replay-dual' to length 100: No such file or directory PASS 24 (87s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 25: replay|resend ==================== 17:58:05 (1713304685) 1+0 records in 1+0 records out 512 bytes (512 B) copied, 0.00277213 s, 185 kB/s fail_loc=0x304 fail_loc=0x80000325 Failing ost1 on oleg304-server Stopping /mnt/lustre-ost1 (opts:) on oleg304-server 17:58:08 (1713304688) shut down Failover ost1 to oleg304-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-OST0000 17:58:21 (1713304701) targets are mounted 17:58:21 (1713304701) facet_failover done oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) 
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec /home/green/git/lustre-release/lustre/tests/test-framework.sh: line 6515: 4665 Terminated LUSTRE="/home/green/git/lustre-release/lustre" bash -c "multiop /mnt/lustre2/f25.replay-dual Ow512" fail_loc=0 PASS 25 (21s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 26: dbench and tar with mds failover ========================================================== 17:58:28 (1713304708) Starting client oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre Started clients oleg304-client.virtnet: 192.168.203.104@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project) Started tar loop with pid 6289 Started dbench loop with 6290 striped dir -i0 -c2 -H crush2 /mnt/lustre2/d26.replay-dual/run_dbench striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre/d26.replay-dual/run_tar looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Tue Apr 16 17:58:29 EDT 2024 waiting for dbench pid 6331 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs failed to create barrier semaphore 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients 1 263 9.92 MB/sec warmup 1 sec latency 21.632 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2196 1285492 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2060 1285628 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 26100 3555972 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 27644 
7161448 1% /mnt/lustre 1 513 8.86 MB/sec warmup 2 sec latency 23.248 ms 1 769 7.12 MB/sec warmup 3 sec latency 338.675 ms 1 1086 5.73 MB/sec warmup 4 sec latency 13.810 ms test_26 fail mds1 1 times 1 1497 5.21 MB/sec warmup 5 sec latency 13.603 ms Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 1 1692 4.37 MB/sec warmup 6 sec latency 341.693 ms 17:58:36 (1713304716) shut down 1 1692 3.75 MB/sec warmup 7 sec latency 1341.830 ms 1 1692 3.28 MB/sec warmup 8 sec latency 2342.007 ms 1 1692 2.91 MB/sec warmup 9 sec latency 3342.199 ms 1 1692 2.62 MB/sec warmup 10 sec latency 4342.411 ms 1 1692 2.38 MB/sec warmup 11 sec latency 5342.635 ms 1 1692 2.19 MB/sec warmup 12 sec latency 6342.860 ms 1 1692 2.02 MB/sec warmup 13 sec latency 7343.070 ms 1 1692 1.87 MB/sec warmup 14 sec latency 8343.259 ms 1 1692 1.75 MB/sec warmup 15 sec latency 9343.462 ms 1 1692 1.64 MB/sec warmup 16 sec latency 10343.600 ms Failover mds1 to oleg304-server mount facets: mds1 1 1692 1.54 MB/sec warmup 17 sec latency 11343.749 ms 1 1692 1.46 MB/sec warmup 18 sec latency 12343.901 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 1692 1.38 MB/sec warmup 19 sec latency 13344.077 ms oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:58:50 (1713304730) targets are mounted 17:58:50 (1713304730) facet_failover done 1 1692 0.00 MB/sec execute 1 sec latency 15344.313 ms 1 1692 0.00 MB/sec execute 2 sec latency 16344.583 ms 1 1692 0.00 MB/sec execute 3 sec latency 17344.872 ms 1 1692 0.00 MB/sec execute 4 sec latency 18345.025 ms 1 1692 0.00 MB/sec execute 5 sec latency 19345.181 ms oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 1946 0.04 MB/sec execute 6 sec latency 19468.440 ms 1 2173 0.08 MB/sec 
execute 7 sec latency 22.552 ms 1 2442 0.24 MB/sec execute 8 sec latency 21.703 ms 1 2896 0.73 MB/sec execute 9 sec latency 21.130 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2932 1284756 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2596 1285092 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 5308 3595180 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 30028 3566708 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 35336 7161888 1% /mnt/lustre 1 3209 0.81 MB/sec execute 10 sec latency 28.492 ms 1 3624 1.13 MB/sec execute 11 sec latency 23.950 ms 1 3813 1.13 MB/sec execute 12 sec latency 25.383 ms 1 4015 1.06 MB/sec execute 13 sec latency 19.608 ms 1 4226 1.00 MB/sec execute 14 sec latency 25.780 ms test_26 fail mds2 2 times Failing mds2 on oleg304-server Stopping /mnt/lustre-mds2 (opts:) on oleg304-server 1 4446 0.97 MB/sec execute 15 sec latency 49.712 ms 17:59:05 (1713304745) shut down 1 4564 0.93 MB/sec execute 16 sec latency 398.478 ms 1 4564 0.87 MB/sec execute 17 sec latency 1398.647 ms 1 4564 0.82 MB/sec execute 18 sec latency 2398.828 ms 1 4564 0.78 MB/sec execute 19 sec latency 3399.014 ms 1 4564 0.74 MB/sec execute 20 sec latency 4399.234 ms 1 4564 0.70 MB/sec execute 21 sec latency 5399.396 ms 1 4564 0.67 MB/sec execute 22 sec latency 6399.553 ms 1 4564 0.64 MB/sec execute 23 sec latency 7399.781 ms 1 4564 0.62 MB/sec execute 24 sec latency 8400.025 ms 1 4564 0.59 MB/sec execute 25 sec latency 9400.188 ms Failover mds2 to oleg304-server mount facets: mds2 1 4564 0.57 MB/sec execute 26 sec latency 10400.312 ms 1 4564 0.55 MB/sec execute 27 sec latency 11400.466 ms 1 4564 0.53 MB/sec execute 28 sec latency 12400.632 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 4564 0.51 MB/sec execute 29 sec latency 13400.834 ms oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all 1 4564 0.49 MB/sec execute 30 sec latency 14400.985 ms pdsh@oleg304-client: oleg304-server: ssh 
exited with exit code 1 Started lustre-MDT0001 17:59:20 (1713304760) targets are mounted 17:59:20 (1713304760) facet_failover done 1 4564 0.48 MB/sec execute 31 sec latency 15401.183 ms 1 4564 0.46 MB/sec execute 32 sec latency 16401.440 ms 1 4564 0.45 MB/sec execute 33 sec latency 17401.672 ms 1 4564 0.44 MB/sec execute 34 sec latency 18401.921 ms oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid 1 4698 0.45 MB/sec execute 35 sec latency 19038.167 ms mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 4983 0.53 MB/sec execute 36 sec latency 26.109 ms 1 5141 0.51 MB/sec execute 37 sec latency 32.479 ms 1 5292 0.50 MB/sec execute 38 sec latency 29.547 ms 1 5462 0.49 MB/sec execute 39 sec latency 29.293 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4148 1283540 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2588 1285100 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 11804 3583356 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 39016 3552164 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 50820 7135520 1% /mnt/lustre 1 5681 0.49 MB/sec execute 40 sec latency 19.726 ms 1 5913 0.51 MB/sec execute 41 sec latency 22.313 ms 1 6111 0.52 MB/sec execute 42 sec latency 26.621 ms 1 6458 0.60 MB/sec execute 43 sec latency 24.185 ms test_26 fail mds1 3 times 1 6827 0.64 MB/sec execute 44 sec latency 27.036 ms Failing mds1 on oleg304-server Stopping /mnt/lustre-mds1 (opts:) on oleg304-server 1 7174 0.70 MB/sec execute 45 sec latency 18.178 ms 17:59:35 (1713304775) shut down 1 7415 0.71 MB/sec execute 46 sec latency 18.568 ms 1 7599 0.70 MB/sec execute 47 sec latency 23.744 ms 1 7627 0.69 MB/sec execute 48 sec latency 841.872 ms 1 7627 0.67 MB/sec execute 49 sec latency 1842.133 ms 1 7627 0.66 MB/sec execute 50 sec latency 2842.373 ms 1 7627 0.65 MB/sec execute 51 sec latency 3842.548 ms 1 7627 0.63 MB/sec execute 52 sec latency 4842.746 ms 1 7627 0.62 MB/sec 
execute 53 sec latency 5842.920 ms 1 7627 0.61 MB/sec execute 54 sec latency 6843.095 ms 1 7627 0.60 MB/sec execute 55 sec latency 7843.334 ms Failover mds1 to oleg304-server mount facets: mds1 1 7627 0.59 MB/sec execute 56 sec latency 8843.481 ms 1 7627 0.58 MB/sec execute 57 sec latency 9843.680 ms 1 7627 0.57 MB/sec execute 58 sec latency 10843.869 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 7627 0.56 MB/sec execute 59 sec latency 11844.081 ms 1 7627 0.55 MB/sec execute 60 sec latency 12844.306 ms oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 Started lustre-MDT0000 17:59:51 (1713304791) targets are mounted 17:59:51 (1713304791) facet_failover done 1 7627 0.54 MB/sec execute 61 sec latency 13844.503 ms 1 7627 0.53 MB/sec execute 62 sec latency 14844.705 ms 1 7627 0.52 MB/sec execute 63 sec latency 15844.939 ms 1 7627 0.52 MB/sec execute 64 sec latency 16845.181 ms 1 7627 0.51 MB/sec execute 65 sec latency 17845.623 ms 1 7627 0.50 MB/sec execute 66 sec latency 18845.815 ms 1 7627 0.49 MB/sec execute 67 sec latency 19846.011 ms oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 7821 0.49 MB/sec execute 68 sec latency 20037.548 ms 1 8025 0.49 MB/sec execute 69 sec latency 25.110 ms 1 8287 0.50 MB/sec execute 70 sec latency 20.814 ms 1 8568 0.54 MB/sec execute 71 sec latency 21.762 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4452 1283236 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2532 1285156 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 16136 3587496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 46288 3556848 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 62424 7144344 1% /mnt/lustre 1 8751 0.53 MB/sec execute 72 sec latency 37.388 ms 1 8973 0.53 MB/sec execute 73 
sec latency 25.846 ms 1 9254 0.52 MB/sec execute 74 sec latency 20.239 ms 1 9500 0.53 MB/sec execute 75 sec latency 25.363 ms 1 9702 0.54 MB/sec execute 76 sec latency 38.075 ms test_26 fail mds2 4 times Failing mds2 on oleg304-server Stopping /mnt/lustre-mds2 (opts:) on oleg304-server 1 10094 0.59 MB/sec execute 77 sec latency 20.679 ms 18:00:08 (1713304808) shut down 1 10275 0.59 MB/sec execute 78 sec latency 759.732 ms 1 10275 0.58 MB/sec execute 79 sec latency 1759.948 ms 1 10275 0.58 MB/sec execute 80 sec latency 2760.107 ms 1 10275 0.57 MB/sec execute 81 sec latency 3760.298 ms 1 10275 0.56 MB/sec execute 82 sec latency 4760.548 ms 1 10275 0.56 MB/sec execute 83 sec latency 5760.783 ms 1 10275 0.55 MB/sec execute 84 sec latency 6761.014 ms 1 10275 0.54 MB/sec execute 85 sec latency 7761.218 ms 1 10275 0.54 MB/sec execute 86 sec latency 8761.452 ms 1 10275 0.53 MB/sec execute 87 sec latency 9761.787 ms 1 10275 0.52 MB/sec execute 88 sec latency 10762.073 ms Failover mds2 to oleg304-server mount facets: mds2 1 10275 0.52 MB/sec execute 89 sec latency 11762.311 ms 1 10275 0.51 MB/sec execute 90 sec latency 12762.509 ms 1 10275 0.51 MB/sec execute 91 sec latency 13762.700 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 10275 0.50 MB/sec execute 92 sec latency 14762.906 ms oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1 1 10275 0.50 MB/sec execute 93 sec latency 15763.125 ms Started lustre-MDT0001 18:00:23 (1713304823) targets are mounted 18:00:23 (1713304823) facet_failover done 1 10275 0.49 MB/sec execute 94 sec latency 16763.393 ms 1 10275 0.49 MB/sec execute 95 sec latency 17763.656 ms 1 10275 0.48 MB/sec execute 96 sec latency 18763.814 ms 1 10275 0.48 MB/sec execute 97 sec latency 19763.993 ms oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid 1 10404 0.48 MB/sec execute 98 sec 
latency 20274.011 ms mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 10764 0.51 MB/sec execute 99 sec latency 20.881 ms 1 cleanup 100 sec 0 cleanup 101 sec Operation Count AvgLat MaxLat ---------------------------------------- NTCreateX 1542 37.097 20273.991 Close 1150 2.420 7.503 Rename 67 16.372 28.866 Unlink 297 5.897 15.744 Qpathinfo 1449 3.306 19.127 Qfileinfo 259 0.642 4.915 Qfsinfo 265 1.123 9.592 Sfileinfo 116 9.039 18.540 Find 562 38.932 20037.491 WriteX 791 27.482 19468.426 ReadX 2620 0.115 12.711 LockX 6 2.560 3.140 UnlockX 6 2.648 3.072 Flush 102 11.657 49.699 Throughput 0.512387 MB/sec 1 clients 1 procs max_latency=20274.011 ms stopping dbench on /mnt/lustre at Tue Apr 16 18:00:30 EDT 2024 with return code 0 clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre2/d26.replay-dual/run_dbench looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4736 1282952 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2840 1284848 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 7196 3585340 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 22472 3581292 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 29668 7166632 1% /mnt/lustre '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Tue Apr 16 18:00:32 EDT 2024 waiting for dbench pid 10808 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients 1 181 7.32 MB/sec warmup 1 sec latency 36.104 ms 1 365 6.52 MB/sec warmup 2 sec latency 28.402 ms 1 513 5.91 MB/sec warmup 3 sec latency 30.090 ms test_26 fail mds1 5 times 
Failing mds1 on oleg304-server
1 793 5.36 MB/sec warmup 4 sec latency 30.412 ms
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
1 903 4.34 MB/sec warmup 5 sec latency 600.796 ms
18:00:37 (1713304837) shut down
1 903 3.61 MB/sec warmup 6 sec latency 1600.973 ms
1 903 3.10 MB/sec warmup 7 sec latency 2601.213 ms
1 903 2.71 MB/sec warmup 8 sec latency 3601.397 ms
1 903 2.41 MB/sec warmup 9 sec latency 4601.661 ms
1 903 2.17 MB/sec warmup 10 sec latency 5601.980 ms
1 903 1.97 MB/sec warmup 11 sec latency 6602.126 ms
1 903 1.81 MB/sec warmup 12 sec latency 7602.296 ms
1 903 1.67 MB/sec warmup 13 sec latency 8602.516 ms
1 903 1.55 MB/sec warmup 14 sec latency 9602.758 ms
1 903 1.45 MB/sec warmup 15 sec latency 10602.954 ms
Failover mds1 to oleg304-server
mount facets: mds1
1 903 1.36 MB/sec warmup 16 sec latency 11603.135 ms
1 903 1.28 MB/sec warmup 17 sec latency 12603.327 ms
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
1 903 1.20 MB/sec warmup 18 sec latency 13603.553 ms
1 903 1.14 MB/sec warmup 19 sec latency 14603.757 ms
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
18:00:52 (1713304852) targets are mounted
18:00:52 (1713304852) facet_failover done
1 903 0.00 MB/sec execute 1 sec latency 16604.101 ms
1 903 0.00 MB/sec execute 2 sec latency 17604.355 ms
1 903 0.00 MB/sec execute 3 sec latency 18604.608 ms
1 903 0.00 MB/sec execute 4 sec latency 19604.811 ms
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
1 1022 0.19 MB/sec execute 5 sec latency 20025.724 ms
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
1 1368 0.68 MB/sec execute 6 sec latency 16.676 ms
1 1562 0.63 MB/sec execute 7 sec latency 25.495 ms
1 1709 0.57 MB/sec execute 8 sec latency 32.696 ms
1 1937 0.53 MB/sec execute 9 sec latency 18.229 ms
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        5348     1282340   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        3416     1284272   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116       34844     3559444   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116       26616     3567764   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232       61460     7127208   1% /mnt/lustre
1 2324 0.61 MB/sec execute 10 sec latency 13.314 ms
1 2423 0.58 MB/sec execute 11 sec latency 488.805 ms
1 2773 0.86 MB/sec execute 12 sec latency 25.566 ms
test_26 fail mds2 6 times
1 3210 0.97 MB/sec execute 13 sec latency 43.846 ms
Failing mds2 on oleg304-server
Stopping /mnt/lustre-mds2 (opts:) on oleg304-server
1 3563 1.20 MB/sec execute 14 sec latency 68.855 ms
1 3563 1.12 MB/sec execute 15 sec latency 1069.024 ms
1 3563 1.05 MB/sec execute 16 sec latency 2069.236 ms
1 3563 0.98 MB/sec execute 17 sec latency 3069.392 ms
1 3563 0.93 MB/sec execute 18 sec latency 4069.565 ms
1 3563 0.88 MB/sec execute 19 sec latency 5069.802 ms
1 3563 0.84 MB/sec execute 20 sec latency 6070.038 ms
18:01:12 (1713304872) shut down
1 3563 0.80 MB/sec execute 21 sec latency 7070.266 ms
1 3563 0.76 MB/sec execute 22 sec latency 8070.487 ms
1 3563 0.73 MB/sec execute 23 sec latency 9070.673 ms
1 3563 0.70 MB/sec execute 24 sec latency 10070.837 ms
1 3563 0.67 MB/sec execute 25 sec latency 11070.985 ms
1 3563 0.64 MB/sec execute 26 sec latency 12071.181 ms
1 3563 0.62 MB/sec execute 27 sec latency 13071.401 ms
1 3563 0.60 MB/sec execute 28 sec latency 14071.598 ms
1 3563 0.58 MB/sec execute 29 sec latency 15071.802 ms
1 3563 0.56 MB/sec execute 30 sec latency 16071.979 ms
Failover mds2 to oleg304-server
mount facets: mds2
1 3563 0.54 MB/sec execute 31 sec latency 17072.090 ms
1 3563 0.52 MB/sec execute 32 sec latency 18072.241 ms
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
1 3563 0.51 MB/sec execute 33 sec latency 19072.443 ms
1 3563 0.49 MB/sec execute 34 sec latency 20072.600 ms
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0001
18:01:27 (1713304887) targets are mounted
18:01:27 (1713304887) facet_failover done
1 3563 0.48 MB/sec execute 35 sec latency 21072.715 ms
1 3563 0.46 MB/sec execute 36 sec latency 22072.837 ms
1 3563 0.45 MB/sec execute 37 sec latency 23073.097 ms
1 3563 0.44 MB/sec execute 38 sec latency 24073.308 ms
1 3563 0.43 MB/sec execute 39 sec latency 25073.463 ms
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
1 3634 0.42 MB/sec execute 40 sec latency 25787.335 ms
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
1 3846 0.44 MB/sec execute 41 sec latency 19.110 ms
1 3978 0.44 MB/sec execute 42 sec latency 30.042 ms
1 4106 0.43 MB/sec execute 43 sec latency 31.301 ms
1 4280 0.42 MB/sec execute 44 sec latency 40.051 ms
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        4900     1282788   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        3064     1284624   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116       47396     3554648   2% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116       35948     3569256   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232       83344     7123904   2% /mnt/lustre
1 4521 0.43 MB/sec execute 45 sec latency 19.300 ms
1 4763 0.45 MB/sec execute 46 sec latency 21.700 ms
1 5014 0.50 MB/sec execute 47 sec latency 22.600 ms
1 5175 0.49 MB/sec execute 48 sec latency 46.271 ms
test_26 fail mds1 7 times
1 5386 0.49 MB/sec execute 49 sec latency 44.171 ms
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
1 5561 0.48 MB/sec execute 50 sec latency 434.147 ms
18:01:43 (1713304903) shut down
1 5561 0.47 MB/sec execute 51 sec latency 1434.411 ms
1 5561 0.46 MB/sec execute 52 sec latency 2434.690 ms
1 5561 0.45 MB/sec execute 53 sec latency 3434.921 ms
1 5561 0.44 MB/sec execute 54 sec latency 4435.269 ms
1 5561 0.44 MB/sec execute 55 sec latency 5435.536 ms
1 5561 0.43 MB/sec execute 56 sec latency 6435.799 ms
1 5561 0.42 MB/sec execute 57 sec latency 7436.070 ms
1 5561 0.41 MB/sec execute 58 sec latency 8436.322 ms
1 5561 0.41 MB/sec execute 59 sec latency 9436.535 ms
1 5561 0.40 MB/sec execute 60 sec latency 10436.755 ms
Failover mds1 to oleg304-server
mount facets: mds1
1 5561 0.39 MB/sec execute 61 sec latency 11436.936 ms
1 5561 0.39 MB/sec execute 62 sec latency 12437.202 ms
1 5561 0.38 MB/sec execute 63 sec latency 13437.326 ms
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
1 5561 0.37 MB/sec execute 64 sec latency 14437.446 ms
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
1 5561 0.37 MB/sec execute 65 sec latency 15437.618 ms
Started lustre-MDT0000
18:01:57 (1713304917) targets are mounted
18:01:57 (1713304917) facet_failover done
1 5561 0.36 MB/sec execute 66 sec latency 16437.821 ms
1 5561 0.36 MB/sec execute 67 sec latency 17438.071 ms
1 5561 0.35 MB/sec execute 68 sec latency 18438.376 ms
1 5561 0.35 MB/sec execute 69 sec latency 19438.594 ms
1 5585 0.34 MB/sec execute 70 sec latency 20284.201 ms
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
1 5933 0.36 MB/sec execute 71 sec latency 16.448 ms
1 6155 0.37 MB/sec execute 72 sec latency 20.662 ms
1 6792 0.44 MB/sec execute 73 sec latency 19.911 ms
1 7260 0.49 MB/sec execute 74 sec latency 10.017 ms
1 7475 0.50 MB/sec execute 75 sec latency 35.500 ms
1 7778 0.50 MB/sec execute 76 sec latency 15.062 ms
1 8082 0.50 MB/sec execute 77 sec latency 38.957 ms
1 8527 0.55 MB/sec execute 78 sec latency 12.462 ms
1 8801 0.54 MB/sec execute 79 sec latency 26.749 ms
1 9127 0.54 MB/sec execute 80 sec latency 20.004 ms
1 9522 0.55 MB/sec execute 81 sec latency 20.796 ms
1 10083 0.61 MB/sec execute 82 sec latency 13.094 ms
1 10637 0.66 MB/sec execute 83 sec latency 19.404 ms
1 10954 0.67 MB/sec execute 84 sec latency 12.243 ms
1 11169 0.67 MB/sec execute 85 sec latency 56.965 ms
1 11440 0.66 MB/sec execute 86 sec latency 25.370 ms
1 11718 0.67 MB/sec execute 87 sec latency 24.906 ms
1 12052 0.70 MB/sec execute 88 sec latency 27.763 ms
1 12346 0.70 MB/sec execute 89 sec latency 21.152 ms
dbench killed by signal 15
stopping dbench on /mnt/lustre at Tue Apr 16 18:02:22 EDT 2024 with return code 0
10808 pts/0 S+ 0:00 dbench -c client.txt 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100
killed dbench main pid 10808
clean dbench files on /mnt/lustre
/mnt/lustre /mnt/lustre
removed 'client.txt'
/mnt/lustre
dbench successfully finished
PASS 26 (235s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 28: lock replay should be ordered: waiting after granted ========================================================== 18:02:25 (1713304945)
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00292692 s, 1.4 MB/s
fail_loc=0x80000324
fail_loc=0x32a
Failing ost1 on oleg304-server
Stopping /mnt/lustre-ost1 (opts:) on oleg304-server
18:02:35 (1713304955) shut down
Failover ost1 to oleg304-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00391271 s, 1.0 MB/s
Started lustre-OST0000
18:02:48 (1713304968) targets are mounted
18:02:48 (1713304968) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 28 (30s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 29: replay vs update with the same xid ========================================================== 18:02:57 (1713304977)
SKIP: replay-dual test_29 needs >= 2 clients
SKIP 29 (1s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 30: layout lock replay is not blocked on IO ========================================================== 18:02:59 (1713304979)
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.00750178 s, 5.5 MB/s
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.00700737 s, 5.8 MB/s
fail_loc=0x32e
fail_val=4
Failing mds1 on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
18:03:02 (1713304982) shut down
Failover mds1 to oleg304-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
18:03:14 (1713304994) targets are mounted
18:03:14 (1713304994) facet_failover done
160+0 records in
160+0 records out
81920 bytes (82 kB) copied, 21.9574 s, 3.7 kB/s
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 30 (26s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 31: deadlock on file_remove_privs and occupied mod rpc slots ========================================================== 18:03:27 (1713305007)
Failing ost1 on oleg304-server
Stopping /mnt/lustre-ost1 (opts:) on oleg304-server
18:03:29 (1713305009) shut down
Creating to objid 3073 on ost lustre-OST0000...
total: 32 open/close in 0.16 seconds: 206.08 ops/second
at_max=0
fail_loc=0x80001420
file /mnt/lustre2/d31.replay-dual/mdtdir/f31.replay-dual is ready
Failover ost1 to oleg304-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-OST0000
18:03:42 (1713305022) targets are mounted
18:03:42 (1713305022) facet_failover done
oleg304-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL IDLE state after 0 sec
pids: 18704 18705 18710 18711 18712 18713 18714 18715 18716
at_max=600
PASS 31 (21s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 32: gap in update llog shouldn't break recovery ========================================================== 18:03:50 (1713305030)
fail_loc=0x0000131d
fail_val=10
fail_loc=0x726
Stopping /mnt/lustre-mds2 (opts:) on oleg304-server
Stopping /mnt/lustre-mds1 (opts:) on oleg304-server
fail_loc=0
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0000
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0001
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2452     1285208   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2020     1285668   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1548     3605472   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        7840     3597764   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        9388     7203236   1% /mnt/lustre
PASS 32 (14s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 33: Check for OBD_INCOMPAT_MULTI_RPCS in last_rcvd after abort_recovery ========================================================== 18:04:05 (1713305045)
at_min=60
Stopping /mnt/lustre-mds2 (opts:) on oleg304-server
last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0001
oleg304-client.virtnet: executing wait_import_state_mount REPLAY_WAIT mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in REPLAY_WAIT state after 0 sec
oleg304-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
affected facets: mds2
oleg304-server: oleg304-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475
oleg304-server: *.lustre-MDT0001.recovery_status status: COMPLETE
Stopping /mnt/lustre-mds2 (opts:) on oleg304-server
last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg304-server: oleg304-server.virtnet: executing set_default_debug -1 all
pdsh@oleg304-client: oleg304-server: ssh exited with exit code 1
Started lustre-MDT0001
Starting client: oleg304-client.virtnet: -o user_xattr,flock oleg304-server@tcp:/lustre /mnt/lustre2
oleg304-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL DISCONN state after 3 sec
affected facets: mds2
oleg304-server: oleg304-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475
oleg304-server: *.lustre-MDT0001.recovery_status status: COMPLETE
at_min=5
PASS 33 (38s)
debug_raw_pointers=0
debug_raw_pointers=0
== replay-dual test complete, duration 2633 sec ========== 18:04:43 (1713305083)
=== replay-dual: start cleanup 18:04:44 (1713305084) ===
Stopping clients: oleg304-client.virtnet /mnt/lustre2 (opts:)
Stopping client oleg304-client.virtnet /mnt/lustre2 opts:
=== replay-dual: finish cleanup 18:04:45 (1713305085) ===