-----============= acceptance-small: replay-dual ============----- Thu Apr 18 18:51:16 EDT 2024 excepting tests: 14b 21b skipping tests SLOW=no: 21b === replay-dual: start setup 18:51:20 (1713480680) === Starting client oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 Started clients oleg130-client.virtnet: 192.168.201.130@tcp:/lustre on /mnt/lustre2 type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project) oleg130-client.virtnet: executing check_config_client /mnt/lustre oleg130-client.virtnet: Checking config lustre mounted on /mnt/lustre Checking servers environments Checking clients oleg130-client.virtnet environments Using TIMEOUT=20 osc.lustre-OST0000-osc-ffff8800a9d2c800.idle_timeout=debug osc.lustre-OST0000-osc-ffff88012b6ae000.idle_timeout=debug osc.lustre-OST0001-osc-ffff8800a9d2c800.idle_timeout=debug osc.lustre-OST0001-osc-ffff88012b6ae000.idle_timeout=debug disable quota as required oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all osd-ldiskfs.track_declares_assert=1 === replay-dual: finish setup 18:51:27 (1713480687) === debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 0a: expired recovery with lost client ========================================================== 18:51:28 (1713480688) Check file is LU482_FAILED=/tmp/replay-dual.lu482.HVHh8r UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1768 1285920 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1612 1286076 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre total: 50 open/close in 0.35 seconds: 141.17 ops/second fail_loc=0x80000514 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:51:32 (1713480692) shut down Failover mds1 to oleg130-server mount facets: mds1 
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:51:45 (1713480705) targets are mounted 18:51:45 (1713480705) facet_failover done Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 - unlinked 0 (time 1713480787 ; total 0 ; last 0) total: 50 unlinks in 0 seconds: inf unlinks/second PASS 0a (100s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 0b: lost client during waiting for next transno ========================================================== 18:53:10 (1713480790) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1768 1285920 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1612 1286076 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:53:13 (1713480793) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:53:26 (1713480806) targets are mounted 18:53:26 (1713480806) facet_failover done Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 PASS 0b (93s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 1: |X| simple create ================= 18:54:45 (1713480885) UUID 1K-blocks 
Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1768 1285920 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1612 1286076 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:54:49 (1713480889) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:55:04 (1713480904) targets are mounted 18:55:04 (1713480904) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 1 (26s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 2: |X| mkdir adir ==================== 18:55:13 (1713480913) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1768 1285920 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1612 1286076 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:55:18 (1713480918) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:55:33 (1713480933) targets are mounted 18:55:33 (1713480933) 
facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 2 (28s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 3: |X| mkdir adir, mkdir adir/bdir === 18:55:43 (1713480943) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1768 1285920 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1612 1286076 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:55:48 (1713480948) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:56:02 (1713480962) targets are mounted 18:56:02 (1713480962) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 3 (27s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 4: |X| mkdir adir (-EEXIST), mkdir adir/bdir ========================================================== 18:56:12 (1713480972) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1916 1285772 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre mkdir: 
cannot create directory '/mnt/lustre/adir': File exists Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:56:16 (1713480976) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:56:31 (1713480991) targets are mounted 18:56:31 (1713480991) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 4 (27s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 5: open, unlink |X| close ============ 18:56:40 (1713481000) multiop /mnt/lustre2/a vo_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.7502 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:56:45 (1713481005) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:57:01 (1713481021) targets are mounted 18:57:01 (1713481021) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 5 (27s) 
debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 6: open1, open2, unlink |X| close1 [fail mds1] close2 ========================================================== 18:57:09 (1713481029) multiop /mnt/lustre2/a vo_c TMPPIPE=/tmp/multiop_open_wait_pipe.7502 multiop /mnt/lustre/a vo_c TMPPIPE=/tmp/multiop_open_wait_pipe.7502 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:57:12 (1713481032) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:57:27 (1713481047) targets are mounted 18:57:27 (1713481047) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 6 (26s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 8: replay of resent request ========== 18:57:37 (1713481057) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre fail_loc=0x119 fail_loc=0 Failing mds1 on oleg130-server Stopping 
/mnt/lustre-mds1 (opts:) on oleg130-server 18:57:58 (1713481078) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:58:12 (1713481092) targets are mounted 18:58:12 (1713481092) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 8 (42s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 9: resending a replayed create ======= 18:58:20 (1713481100) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre fail_loc=0x80000119 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:58:24 (1713481104) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:58:39 (1713481119) targets are mounted 18:58:39 (1713481119) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec fail_loc=0 PASS 9 (42s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 10: resending a replayed unlink 
====== 18:59:04 (1713481144) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre fail_loc=0x80000119 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:59:08 (1713481148) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 18:59:22 (1713481162) targets are mounted 18:59:22 (1713481162) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec fail_loc=0 PASS 10 (38s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 11: both clients timeout during replay ========================================================== 18:59:44 (1713481184) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre fail_loc=0x0119 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 18:59:48 (1713481188) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all 
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:00:02 (1713481202) targets are mounted 19:00:02 (1713481202) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 15 sec fail_loc=0 PASS 11 (36s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 12: open resend timeout ============== 19:00:22 (1713481222) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre multiop /mnt/lustre/f12.replay-dual vmo_c TMPPIPE=/tmp/multiop_open_wait_pipe.7502 fail_loc=0x80000302 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:00:27 (1713481227) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:00:42 (1713481242) targets are mounted 19:00:42 (1713481242) facet_failover done fail_loc=0 /mnt/lustre/f12.replay-dual /mnt/lustre/f12.replay-dual has type file OK PASS 12 (25s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 13: close resend timeout ============= 19:00:49 (1713481249) multiop /mnt/lustre/f13.replay-dual vmo_c TMPPIPE=/tmp/multiop_open_wait_pipe.7502 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] 
lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre fail_loc=0x80000115 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:00:53 (1713481253) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:01:08 (1713481268) targets are mounted 19:01:08 (1713481268) facet_failover done fail_loc=0 /mnt/lustre/f13.replay-dual /mnt/lustre/f13.replay-dual has type file OK PASS 13 (24s) debug_raw_pointers=0 debug_raw_pointers=0 SKIP: replay-dual test_14b skipping ALWAYS excluded test 14b debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 15a: timeout waiting for lost client during replay, 1 client completes ========================================================== 19:01:15 (1713481275) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre total: 25 open/close in 0.17 seconds: 144.33 ops/second total: 1 open/close in 0.01 seconds: 137.40 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:01:19 (1713481279) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:01:34 (1713481294) targets are 
mounted 19:01:34 (1713481294) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec - unlinked 0 (time 1713481365 ; total 0 ; last 0) total: 25 unlinks in 0 seconds: inf unlinks/second Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 PASS 15a (92s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 15c: remove multiple OST orphans ===== 19:02:49 (1713481369) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:03:24 (1713481404) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:03:38 (1713481418) targets are mounted 19:03:38 (1713481418) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 PASS 15c (125s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 16: fail MDS during recovery (3571) == 19:04:56 (1713481496) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1912 1285776 1% 
/mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre total: 25 open/close in 0.31 seconds: 79.97 ops/second total: 1 open/close in 0.02 seconds: 65.62 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:05:01 (1713481501) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:05:16 (1713481516) targets are mounted 19:05:16 (1713481516) facet_failover done Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:05:37 (1713481537) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:05:51 (1713481551) targets are mounted 19:05:51 (1713481551) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec - unlinked 0 (time 1713481624 ; total 0 ; last 0) total: 25 unlinks in 0 seconds: inf unlinks/second Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 PASS 16 (130s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 17: fail OST during recovery (3571) == 19:07:08 (1713481628) total: 25 open/close in 0.32 seconds: 77.59 ops/second total: 1 open/close in 0.02 seconds: 57.34 
ops/second UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1984 1285704 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3048 7210992 1% /mnt/lustre Failing ost1 on oleg130-server Stopping /mnt/lustre-ost1 (opts:) on oleg130-server 19:07:13 (1713481633) shut down Failover ost1 to oleg130-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-OST0000 19:07:29 (1713481649) targets are mounted 19:07:29 (1713481649) facet_failover done Failing ost1 on oleg130-server Stopping /mnt/lustre-ost1 (opts:) on oleg130-server 19:07:50 (1713481670) shut down Failover ost1 to oleg130-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-OST0000 19:08:03 (1713481683) targets are mounted 19:08:03 (1713481683) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec - unlinked 0 (time 1713481761 ; total 0 ; last 0) total: 25 unlinks in 0 seconds: inf unlinks/second Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 PASS 17 (135s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 18: ldlm_handle_enqueue succeeds on evicted export (3822) 
========================================================== 19:09:25 (1713481765) debug=+dlmtrace fail_loc=0x8000030b using seed 3658105023 running for 500 iterations total: 500 stats in 0 seconds: inf stats/second ldlm.namespaces.MGC192.168.201.130@tcp.early_lock_cancel=0 ldlm.namespaces.lustre-MDT0000-mdc-ffff8800a8d17800.early_lock_cancel=0 ldlm.namespaces.lustre-MDT0000-mdc-ffff8800b655e000.early_lock_cancel=0 ldlm.namespaces.lustre-MDT0001-mdc-ffff8800a8d17800.early_lock_cancel=0 ldlm.namespaces.lustre-MDT0001-mdc-ffff8800b655e000.early_lock_cancel=0 ldlm.namespaces.lustre-OST0000-osc-ffff8800a8d17800.early_lock_cancel=0 ldlm.namespaces.lustre-OST0000-osc-ffff8800b655e000.early_lock_cancel=0 ldlm.namespaces.lustre-OST0001-osc-ffff8800a8d17800.early_lock_cancel=0 ldlm.namespaces.lustre-OST0001-osc-ffff8800b655e000.early_lock_cancel=0 fail_loc=0x80000305 Error in opening file "/mnt/lustre2/d18.replay-dual/f18.replay-dual"(flags=O_RDONLY) 2: No such file or directory ldlm.namespaces.MGC192.168.201.130@tcp.early_lock_cancel=1 ldlm.namespaces.lustre-MDT0000-mdc-ffff8800a8d17800.early_lock_cancel=1 ldlm.namespaces.lustre-MDT0000-mdc-ffff8800b655e000.early_lock_cancel=1 ldlm.namespaces.lustre-MDT0001-mdc-ffff8800a8d17800.early_lock_cancel=1 ldlm.namespaces.lustre-MDT0001-mdc-ffff8800b655e000.early_lock_cancel=1 ldlm.namespaces.lustre-OST0000-osc-ffff8800a8d17800.early_lock_cancel=1 ldlm.namespaces.lustre-OST0000-osc-ffff8800b655e000.early_lock_cancel=1 ldlm.namespaces.lustre-OST0001-osc-ffff8800a8d17800.early_lock_cancel=1 ldlm.namespaces.lustre-OST0001-osc-ffff8800b655e000.early_lock_cancel=1 fail_loc=0 fail_loc=0 PASS 18 (46s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 19: resend of open request =========== 19:10:13 (1713481813) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1988 1285700 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] 
lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre fail_loc=0x157 - open/close 0 (time 1713481904.45 total 87.03 last 0.00) total: 1 open/close in 87.03 seconds: 0.01 ops/second fail_loc=0 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:11:45 (1713481905) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:12:01 (1713481921) targets are mounted 19:12:01 (1713481921) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 19 (115s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 20: recovery time is not increasing == 19:12:10 (1713481930) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1988 1285700 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:12:15 (1713481935) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:12:31 (1713481951) targets are mounted 19:12:31 (1713481951) facet_failover done 
oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1988 1285700 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1796 1285892 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:14:57 (1713482097) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:15:13 (1713482113) targets are mounted 19:15:13 (1713482113) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 PASS 20 (327s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 21a: commit on sharing =============== 19:17:39 (1713482259) mdt.lustre-MDT0000.commit_on_sharing=1 mdt.lustre-MDT0001.commit_on_sharing=1 Replay barrier on lustre-MDT0000 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:17:44 (1713482264) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all 
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:17:59 (1713482279) targets are mounted 19:17:59 (1713482279) facet_failover done Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2 mdt.lustre-MDT0000.commit_on_sharing=0 mdt.lustre-MDT0001.commit_on_sharing=0 PASS 21a (164s) debug_raw_pointers=0 debug_raw_pointers=0 SKIP: replay-dual test_21b skipping SLOW test 21b debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply & fail, c2 mkdir dir1/dir ========================================================== 19:20:25 (1713482425) fail_loc=0x119 Failing mds2 on oleg130-server Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 19:20:28 (1713482428) shut down Failover mds2 to oleg130-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0001 19:20:43 (1713482443) targets are mounted 19:20:43 (1713482443) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2000 1285688 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1824 1285864 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.04 seconds: 50.17 ops/second total: 2 open/close in 0.02 seconds: 104.29 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:20:53 (1713482453) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o 
localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:21:09 (1713482469) targets are mounted 19:21:09 (1713482469) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22a (52s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22b: c1 lfs mkdir -i 1 d1, M1 drop reply & fail M0/M1, c2 mkdir d1/dir ========================================================== 19:21:19 (1713482479) fail_loc=0x119 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server Failing mds2 on oleg130-server Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 19:21:24 (1713482484) shut down Failover mds1 to oleg130-server mount facets: mds1 Failover mds2 to oleg130-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 19:21:50 (1713482510) targets are mounted 19:21:50 (1713482510) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1996 1285692 1% 
/mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.04 seconds: 46.99 ops/second total: 2 open/close in 0.02 seconds: 104.26 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:22:01 (1713482521) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:22:16 (1713482536) targets are mounted 19:22:16 (1713482536) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22b (65s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22c: c1 lfs mkdir -i 1 d1, M1 drop update & fail M1, c2 mkdir d1/dir ========================================================== 19:22:26 (1713482546) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:22:29 (1713482549) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:22:43 (1713482563) targets are mounted 19:22:43 (1713482563) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 
1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1996 1285692 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.04 seconds: 53.76 ops/second total: 2 open/close in 0.02 seconds: 111.91 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:22:53 (1713482573) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:23:09 (1713482589) targets are mounted 19:23:09 (1713482589) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22c (50s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22d: c1 lfs mkdir -i 1 d1, M1 drop update & fail M0/M1,c2 mkdir d1/dir ========================================================== 19:23:19 (1713482599) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server Failing mds2 on oleg130-server Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 19:23:32 (1713482612) shut down Failover mds1 to oleg130-server mount facets: mds1 Failover mds2 to oleg130-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all oleg130-server: oleg130-server.virtnet: executing set_default_debug 
-1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 19:23:53 (1713482633) targets are mounted 19:23:53 (1713482633) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1960 1285728 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1784 1285904 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.04 seconds: 49.97 ops/second total: 2 open/close in 0.02 seconds: 116.17 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:24:17 (1713482657) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:24:32 (1713482672) targets are mounted 19:24:32 (1713482672) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22d (81s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23a: c1 rmdir d1, M1 drop reply and fail, client2 mkdir d1 ========================================================== 19:24:42 (1713482682) fail_loc=0x119 fail_loc=0 Failing mds2 on oleg130-server 
Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 19:24:46 (1713482686) shut down Failover mds2 to oleg130-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0001 19:25:00 (1713482700) targets are mounted 19:25:00 (1713482700) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.03 seconds: 68.94 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:25:10 (1713482710) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:25:25 (1713482725) targets are mounted 19:25:25 (1713482725) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23a (51s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23b: c1 rmdir d1, M1 drop reply and fail M0/M1, c2 mkdir d1 ========================================================== 19:25:36 (1713482736) fail_loc=0x119 fail_loc=0 Failing mds1 
on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server Failing mds2 on oleg130-server Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 19:25:46 (1713482746) shut down Failover mds1 to oleg130-server mount facets: mds1 Failover mds2 to oleg130-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 19:26:06 (1713482766) targets are mounted 19:26:06 (1713482766) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 91.01 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:26:16 (1713482776) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:26:31 (1713482791) targets are mounted 19:26:31 (1713482791) 
facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23b (62s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23c: c1 rmdir d1, M0 drop update reply and fail M0, c2 mkdir d1 ========================================================== 19:26:40 (1713482800) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:26:43 (1713482803) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:26:57 (1713482817) targets are mounted 19:26:57 (1713482817) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.03 seconds: 71.85 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:27:07 (1713482827) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:27:22 (1713482842) targets are mounted 
19:27:22 (1713482842) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23c (49s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23d: c1 rmdir d1, M0 drop update reply and fail M0/M1, c2 mkdir d1 ========================================================== 19:27:31 (1713482851) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server Failing mds2 on oleg130-server Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 19:27:44 (1713482864) shut down Failover mds1 to oleg130-server mount facets: mds1 Failover mds2 to oleg130-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 19:28:04 (1713482884) targets are mounted 19:28:04 (1713482884) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1956 1285732 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1780 1285908 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% 
/mnt/lustre total: 2 open/close in 0.02 seconds: 84.83 ops/second Failing mds1 on oleg130-server Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 19:28:15 (1713482895) shut down Failover mds1 to oleg130-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:28:30 (1713482910) targets are mounted 19:28:30 (1713482910) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23d (67s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 24: reconstruct on non-existing object ========================================================== 19:28:40 (1713482920) fail_loc=0x119 fail_loc=0 truncate: cannot truncate '/mnt/lustre/f24.replay-dual' to length 100: No such file or directory PASS 24 (88s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 25: replay|resend ==================== 19:30:10 (1713483010) 1+0 records in 1+0 records out 512 bytes (512 B) copied, 0.00479424 s, 107 kB/s fail_loc=0x304 fail_loc=0x80000325 Failing ost1 on oleg130-server Stopping /mnt/lustre-ost1 (opts:) on oleg130-server 19:30:13 (1713483013) shut down Failover ost1 to oleg130-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-OST0000 19:30:28 (1713483028) targets are mounted 19:30:28 (1713483028) facet_failover done oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) 
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec /home/green/git/lustre-release/lustre/tests/test-framework.sh: line 6515: 4709 Terminated LUSTRE="/home/green/git/lustre-release/lustre" bash -c "multiop /mnt/lustre2/f25.replay-dual Ow512" fail_loc=0 PASS 25 (24s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 26: dbench and tar with mds failover ========================================================== 19:30:36 (1713483036) Starting client oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre Started clients oleg130-client.virtnet: 192.168.201.130@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project) Started tar loop with pid 6331 Started dbench loop with 6332 striped dir -i0 -c2 -H all_char /mnt/lustre2/d26.replay-dual/run_dbench striped dir -i0 -c2 -H crush2 /mnt/lustre/d26.replay-dual/run_tar looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Thu Apr 18 19:30:38 EDT 2024 waiting for dbench pid 6373 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs failed to create barrier semaphore 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2204 1285484 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1996 1285692 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 26100 3580028 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 27644 7185504 1% /mnt/lustre 1 155 6.54 MB/sec warmup 1 
sec latency 37.997 ms 1 294 5.42 MB/sec warmup 2 sec latency 37.922 ms 1 429 5.04 MB/sec warmup 3 sec latency 34.293 ms 1 642 5.26 MB/sec warmup 4 sec latency 30.504 ms 1 799 4.29 MB/sec warmup 5 sec latency 25.757 ms test_26 fail mds1 1 times Failing mds1 on oleg130-server 1 1074 3.82 MB/sec warmup 6 sec latency 17.517 ms Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 1 1204 3.29 MB/sec warmup 7 sec latency 297.893 ms 19:30:46 (1713483046) shut down 1 1204 2.88 MB/sec warmup 8 sec latency 1298.128 ms 1 1204 2.56 MB/sec warmup 9 sec latency 2298.420 ms 1 1204 2.30 MB/sec warmup 10 sec latency 3298.702 ms 1 1204 2.10 MB/sec warmup 11 sec latency 4299.027 ms 1 1204 1.92 MB/sec warmup 12 sec latency 5299.346 ms 1 1204 1.77 MB/sec warmup 13 sec latency 6299.560 ms 1 1204 1.65 MB/sec warmup 14 sec latency 7299.805 ms 1 1204 1.54 MB/sec warmup 15 sec latency 8300.124 ms 1 1204 1.44 MB/sec warmup 16 sec latency 9300.410 ms 1 1204 1.36 MB/sec warmup 17 sec latency 10300.584 ms Failover mds1 to oleg130-server mount facets: mds1 1 1204 1.28 MB/sec warmup 18 sec latency 11300.751 ms 1 1204 1.21 MB/sec warmup 19 sec latency 12301.005 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 1204 0.00 MB/sec execute 1 sec latency 14301.429 ms 1 1204 0.00 MB/sec execute 2 sec latency 15301.667 ms oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:31:01 (1713483061) targets are mounted 19:31:01 (1713483061) facet_failover done 1 1204 0.00 MB/sec execute 3 sec latency 16301.949 ms 1 1204 0.00 MB/sec execute 4 sec latency 17302.125 ms 1 1204 0.00 MB/sec execute 5 sec latency 18302.392 ms 1 1204 0.00 MB/sec execute 6 sec latency 19302.735 ms 1 1211 0.00 MB/sec execute 7 sec latency 20282.028 ms oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 1415 0.37 MB/sec execute 8 sec latency 
63.452 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 1596 0.34 MB/sec execute 9 sec latency 24.070 ms 1 1779 0.32 MB/sec execute 10 sec latency 27.702 ms 1 1984 0.31 MB/sec execute 11 sec latency 21.227 ms 1 2168 0.31 MB/sec execute 12 sec latency 18.411 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2796 1284892 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2528 1285160 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 8060 3593544 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 29068 3569960 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 37128 7163504 1% /mnt/lustre 1 2388 0.38 MB/sec execute 13 sec latency 21.639 ms 1 2571 0.42 MB/sec execute 14 sec latency 25.073 ms 1 2918 0.69 MB/sec execute 15 sec latency 25.198 ms 1 3297 0.77 MB/sec execute 16 sec latency 26.492 ms test_26 fail mds2 2 times 1 3624 0.91 MB/sec execute 17 sec latency 19.408 ms Failing mds2 on oleg130-server Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 1 3815 0.93 MB/sec execute 18 sec latency 25.233 ms 19:31:17 (1713483077) shut down 1 3955 0.90 MB/sec execute 19 sec latency 28.889 ms 1 4083 0.85 MB/sec execute 20 sec latency 25.120 ms 1 4083 0.81 MB/sec execute 21 sec latency 1008.239 ms 1 4083 0.77 MB/sec execute 22 sec latency 2008.386 ms 1 4083 0.74 MB/sec execute 23 sec latency 3008.565 ms 1 4083 0.71 MB/sec execute 24 sec latency 4008.743 ms 1 4083 0.68 MB/sec execute 25 sec latency 5008.954 ms 1 4083 0.66 MB/sec execute 26 sec latency 6009.168 ms 1 4083 0.63 MB/sec execute 27 sec latency 7009.339 ms 1 4083 0.61 MB/sec execute 28 sec latency 8009.532 ms Failover mds2 to oleg130-server mount facets: mds2 1 4083 0.59 MB/sec execute 29 sec latency 9009.674 ms 1 4083 0.57 MB/sec execute 30 sec latency 10009.852 ms 1 4083 0.55 MB/sec execute 31 sec latency 11010.080 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 4083 0.53 MB/sec execute 32 sec latency 12010.269 ms 1 4083 0.52 MB/sec execute 33 
sec latency 13010.475 ms oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 1 4083 0.50 MB/sec execute 34 sec latency 14010.635 ms Started lustre-MDT0001 19:31:33 (1713483093) targets are mounted 19:31:33 (1713483093) facet_failover done 1 4083 0.49 MB/sec execute 35 sec latency 15010.859 ms 1 4083 0.47 MB/sec execute 36 sec latency 16011.060 ms 1 4083 0.46 MB/sec execute 37 sec latency 17011.238 ms oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid 1 4172 0.45 MB/sec execute 38 sec latency 17616.568 ms mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 4374 0.45 MB/sec execute 39 sec latency 27.968 ms 1 4536 0.45 MB/sec execute 40 sec latency 26.440 ms 1 4727 0.47 MB/sec execute 41 sec latency 27.628 ms 1 4969 0.53 MB/sec execute 42 sec latency 25.459 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4024 1283664 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2640 1285048 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 13128 3580972 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 37152 3558292 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 50280 7139264 1% /mnt/lustre 1 5133 0.52 MB/sec execute 43 sec latency 32.243 ms 1 5302 0.51 MB/sec execute 44 sec latency 30.119 ms 1 5474 0.50 MB/sec execute 45 sec latency 19.704 ms 1 5744 0.50 MB/sec execute 46 sec latency 17.791 ms 1 5954 0.51 MB/sec execute 47 sec latency 43.422 ms test_26 fail mds1 3 times Failing mds1 on oleg130-server 1 6147 0.52 MB/sec execute 48 sec latency 24.976 ms Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 1 6335 0.57 MB/sec execute 49 sec latency 487.135 ms 19:31:48 (1713483108) shut down 1 6335 0.56 MB/sec execute 50 sec latency 1487.315 ms 1 6335 0.55 MB/sec execute 51 sec latency 2487.541 ms 1 6335 0.54 MB/sec execute 52 sec latency 3487.727 ms 1 6335 0.53 MB/sec execute 53 sec latency 
4488.004 ms 1 6335 0.52 MB/sec execute 54 sec latency 5488.233 ms 1 6335 0.51 MB/sec execute 55 sec latency 6488.465 ms 1 6335 0.50 MB/sec execute 56 sec latency 7488.720 ms 1 6335 0.49 MB/sec execute 57 sec latency 8488.943 ms 1 6335 0.48 MB/sec execute 58 sec latency 9489.164 ms 1 6335 0.48 MB/sec execute 59 sec latency 10489.517 ms Failover mds1 to oleg130-server mount facets: mds1 1 6335 0.47 MB/sec execute 60 sec latency 11489.740 ms 1 6335 0.46 MB/sec execute 61 sec latency 12489.947 ms 1 6335 0.45 MB/sec execute 62 sec latency 13490.119 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 6335 0.44 MB/sec execute 63 sec latency 14490.312 ms oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all 1 6335 0.44 MB/sec execute 64 sec latency 15490.494 ms pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:32:03 (1713483123) targets are mounted 19:32:03 (1713483123) facet_failover done 1 6335 0.43 MB/sec execute 65 sec latency 16490.712 ms 1 6335 0.42 MB/sec execute 66 sec latency 17490.928 ms 1 6335 0.42 MB/sec execute 67 sec latency 18491.148 ms 1 6335 0.41 MB/sec execute 68 sec latency 19491.389 ms 1 6335 0.41 MB/sec execute 69 sec latency 20491.710 ms 1 6335 0.40 MB/sec execute 70 sec latency 21491.901 ms 1 6461 0.41 MB/sec execute 71 sec latency 22234.417 ms oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 6787 0.42 MB/sec execute 72 sec latency 27.968 ms 1 7085 0.47 MB/sec execute 73 sec latency 25.757 ms 1 7274 0.47 MB/sec execute 74 sec latency 33.755 ms 1 7421 0.48 MB/sec execute 75 sec latency 25.362 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4300 1283388 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2552 1285136 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 18980 3581548 1% /mnt/lustre[OST:0] 
lustre-OST0001_UUID 3833116 43252 3557052 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 62232 7138600 1% /mnt/lustre 1 7544 0.48 MB/sec execute 76 sec latency 51.098 ms 1 7688 0.47 MB/sec execute 77 sec latency 28.759 ms 1 7789 0.47 MB/sec execute 78 sec latency 562.608 ms 1 7960 0.47 MB/sec execute 79 sec latency 28.853 ms 1 8102 0.46 MB/sec execute 80 sec latency 26.306 ms test_26 fail mds2 4 times Failing mds2 on oleg130-server 1 8308 0.47 MB/sec execute 81 sec latency 44.002 ms Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 1 8496 0.50 MB/sec execute 82 sec latency 397.913 ms 19:32:21 (1713483141) shut down 1 8496 0.50 MB/sec execute 83 sec latency 1398.147 ms 1 8496 0.49 MB/sec execute 84 sec latency 2398.346 ms 1 8496 0.48 MB/sec execute 85 sec latency 3398.546 ms 1 8496 0.48 MB/sec execute 86 sec latency 4398.702 ms 1 8496 0.47 MB/sec execute 87 sec latency 5398.932 ms 1 8496 0.47 MB/sec execute 88 sec latency 6399.134 ms 1 8496 0.46 MB/sec execute 89 sec latency 7399.319 ms 1 8496 0.46 MB/sec execute 90 sec latency 8399.526 ms 1 8496 0.45 MB/sec execute 91 sec latency 9399.730 ms 1 8496 0.45 MB/sec execute 92 sec latency 10399.958 ms Failover mds2 to oleg130-server mount facets: mds2 1 8496 0.44 MB/sec execute 93 sec latency 11400.150 ms 1 8496 0.44 MB/sec execute 94 sec latency 12400.344 ms 1 8496 0.43 MB/sec execute 95 sec latency 13400.529 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 8496 0.43 MB/sec execute 96 sec latency 14400.727 ms oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all 1 8496 0.42 MB/sec execute 97 sec latency 15400.914 ms pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0001 19:32:36 (1713483156) targets are mounted 19:32:36 (1713483156) facet_failover done 1 8496 0.42 MB/sec execute 98 sec latency 16401.217 ms 1 8496 0.42 MB/sec execute 99 sec latency 17401.419 ms 1 cleanup 100 sec 1 cleanup 101 sec oleg130-client.virtnet: executing 
wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
1 cleanup 102 sec
0 cleanup 102 sec

Operation      Count    AvgLat    MaxLat
----------------------------------------
NTCreateX       1243    29.928 19834.934
Close            916     2.920     8.472
Rename            54    19.824    27.951
Unlink           252     7.150    21.599
Qpathinfo       1150    38.610 22234.390
Qfileinfo        196     0.780     4.317
Qfsinfo          209     1.376     7.514
Sfileinfo        107    10.866    19.320
Find             440    49.272 20282.002
WriteX           612     3.446    13.761
ReadX           2020     0.209    44.760
LockX              4     3.172     3.515
UnlockX            4     3.969     6.367
Flush             86    19.509   562.583

Throughput 0.415968 MB/sec 1 clients 1 procs max_latency=22234.417 ms
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
stopping dbench on /mnt/lustre at Thu Apr 18 19:32:41 EDT 2024 with return code 0
clean dbench files on /mnt/lustre
/mnt/lustre /mnt/lustre
removed 'client.txt'
/mnt/lustre
dbench successfully finished
striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre2/d26.replay-dual/run_dbench
looking for dbench program
/usr/bin/dbench
found dbench client file /usr/share/dbench/client.txt
'/usr/share/dbench/client.txt' -> 'client.txt'
running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Thu Apr 18 19:32:43 EDT 2024
waiting for dbench pid 10806
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs
0 of 1 processes prepared for launch 0 sec
1 of 1 processes prepared for launch 0 sec
releasing clients
1 155 6.54 MB/sec warmup 1 sec latency 36.369 ms
1 364 6.52 MB/sec warmup 2 sec latency 33.249 ms
UUID                 1K-blocks  Used Available Use% Mounted on
lustre-MDT0000_UUID    1414116  4720   1282968   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  2684   1285004   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  8144   3594200   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116 37692   3527152   2% /mnt/lustre[OST:1]
filesystem_summary:    7666232 45836   7121352   1% /mnt/lustre
1 652 7.02 MB/sec warmup 3 sec latency 29.846 ms
1 721 5.32 MB/sec
warmup 4 sec latency 586.595 ms 1 888 4.33 MB/sec warmup 5 sec latency 596.161 ms 1 1059 3.81 MB/sec warmup 6 sec latency 25.289 ms test_26 fail mds1 5 times Failing mds1 on oleg130-server 1 1259 3.30 MB/sec warmup 7 sec latency 20.512 ms Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 1 1474 3.25 MB/sec warmup 8 sec latency 537.239 ms 1 1474 2.89 MB/sec warmup 9 sec latency 1537.522 ms 1 1474 2.60 MB/sec warmup 10 sec latency 2537.690 ms 1 1474 2.37 MB/sec warmup 11 sec latency 3537.909 ms 1 1474 2.17 MB/sec warmup 12 sec latency 4538.197 ms 1 1474 2.00 MB/sec warmup 13 sec latency 5538.468 ms 19:32:57 (1713483177) shut down 1 1474 1.86 MB/sec warmup 14 sec latency 6538.730 ms 1 1474 1.74 MB/sec warmup 15 sec latency 7538.896 ms 1 1474 1.63 MB/sec warmup 16 sec latency 8539.095 ms 1 1474 1.53 MB/sec warmup 17 sec latency 9539.402 ms 1 1474 1.45 MB/sec warmup 18 sec latency 10539.639 ms 1 1474 1.37 MB/sec warmup 19 sec latency 11539.819 ms 1 1474 0.00 MB/sec execute 1 sec latency 13540.222 ms 1 1474 0.00 MB/sec execute 2 sec latency 14540.445 ms 1 1474 0.00 MB/sec execute 3 sec latency 15540.739 ms Failover mds1 to oleg130-server mount facets: mds1 1 1474 0.00 MB/sec execute 4 sec latency 16540.976 ms 1 1474 0.00 MB/sec execute 5 sec latency 17541.176 ms 1 1474 0.00 MB/sec execute 6 sec latency 18541.351 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 1474 0.00 MB/sec execute 7 sec latency 19541.556 ms 1 1474 0.00 MB/sec execute 8 sec latency 20541.761 ms oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all 1 1474 0.00 MB/sec execute 9 sec latency 21541.979 ms pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:33:12 (1713483192) targets are mounted 19:33:12 (1713483192) facet_failover done 1 1474 0.00 MB/sec execute 10 sec latency 22542.186 ms 1 1474 0.00 MB/sec execute 11 sec latency 23542.433 ms 1 1474 0.00 MB/sec execute 12 sec latency 24542.702 ms 1 1474 0.00 MB/sec 
execute 13 sec latency 25543.053 ms oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 1576 0.01 MB/sec execute 14 sec latency 26167.713 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 1855 0.02 MB/sec execute 15 sec latency 16.458 ms 1 2061 0.04 MB/sec execute 16 sec latency 19.759 ms 1 2269 0.05 MB/sec execute 17 sec latency 22.284 ms 1 2447 0.12 MB/sec execute 18 sec latency 27.988 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 5184 1282504 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3208 1284480 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 21588 3582680 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 57116 3547984 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 78704 7130664 2% /mnt/lustre 1 2759 0.31 MB/sec execute 19 sec latency 27.513 ms 1 3000 0.37 MB/sec execute 20 sec latency 137.252 ms 1 3243 0.41 MB/sec execute 21 sec latency 607.584 ms 1 3563 0.56 MB/sec execute 22 sec latency 20.349 ms test_26 fail mds2 6 times 1 3773 0.60 MB/sec execute 23 sec latency 24.018 ms Failing mds2 on oleg130-server Stopping /mnt/lustre-mds2 (opts:) on oleg130-server 1 3906 0.58 MB/sec execute 24 sec latency 27.076 ms 19:33:28 (1713483208) shut down 1 3919 0.56 MB/sec execute 25 sec latency 940.075 ms 1 3919 0.54 MB/sec execute 26 sec latency 1940.284 ms 1 3919 0.52 MB/sec execute 27 sec latency 2940.642 ms 1 3919 0.50 MB/sec execute 28 sec latency 3940.886 ms 1 3919 0.48 MB/sec execute 29 sec latency 4941.125 ms 1 3919 0.47 MB/sec execute 30 sec latency 5941.421 ms 1 3919 0.45 MB/sec execute 31 sec latency 6941.757 ms 1 3919 0.44 MB/sec execute 32 sec latency 7942.009 ms 1 3919 0.43 MB/sec execute 33 sec latency 8942.300 ms 1 3919 0.41 MB/sec execute 34 sec latency 9942.657 ms Failover mds2 to oleg130-server mount facets: mds2 1 3919 0.40 MB/sec execute 35 sec latency 10942.828 ms 1 3919 0.39 MB/sec execute 36 sec latency 11942.993 ms 1 3919 
0.38 MB/sec execute 37 sec latency 12943.182 ms 1 3919 0.37 MB/sec execute 38 sec latency 13943.423 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 3919 0.36 MB/sec execute 39 sec latency 14943.606 ms oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 1 3919 0.35 MB/sec execute 40 sec latency 15943.750 ms Started lustre-MDT0001 19:33:43 (1713483223) targets are mounted 19:33:43 (1713483223) facet_failover done 1 3919 0.34 MB/sec execute 41 sec latency 16943.956 ms 1 3919 0.33 MB/sec execute 42 sec latency 17944.200 ms 1 3919 0.33 MB/sec execute 43 sec latency 18944.502 ms 1 3919 0.32 MB/sec execute 44 sec latency 19944.720 ms 1 3951 0.31 MB/sec execute 45 sec latency 20693.483 ms oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 4073 0.31 MB/sec execute 46 sec latency 21.796 ms 1 4233 0.30 MB/sec execute 47 sec latency 27.577 ms 1 4379 0.30 MB/sec execute 48 sec latency 40.672 ms 1 4538 0.30 MB/sec execute 49 sec latency 29.724 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 5344 1282344 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3400 1284288 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 21824 3581644 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 57924 3545992 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 79748 7127636 2% /mnt/lustre 1 4715 0.32 MB/sec execute 50 sec latency 48.975 ms 1 4984 0.37 MB/sec execute 51 sec latency 22.880 ms 1 5059 0.37 MB/sec execute 52 sec latency 560.852 ms 1 5227 0.36 MB/sec execute 53 sec latency 32.942 ms 1 5395 0.36 MB/sec execute 54 sec latency 27.857 ms test_26 fail mds1 7 times Failing mds1 on oleg130-server 1 5587 0.36 MB/sec execute 55 sec latency 37.712 ms Stopping /mnt/lustre-mds1 (opts:) on oleg130-server 1 5714 0.35 MB/sec execute 56 
sec latency 468.249 ms 1 5714 0.35 MB/sec execute 57 sec latency 1468.495 ms 1 5714 0.34 MB/sec execute 58 sec latency 2468.710 ms 1 5714 0.34 MB/sec execute 59 sec latency 3468.955 ms 1 5714 0.33 MB/sec execute 60 sec latency 4469.210 ms 1 5714 0.32 MB/sec execute 61 sec latency 5469.570 ms 1 5714 0.32 MB/sec execute 62 sec latency 6469.857 ms 19:34:05 (1713483245) shut down 1 5714 0.31 MB/sec execute 63 sec latency 7470.084 ms 1 5714 0.31 MB/sec execute 64 sec latency 8470.335 ms 1 5714 0.30 MB/sec execute 65 sec latency 9470.551 ms 1 5714 0.30 MB/sec execute 66 sec latency 10470.817 ms 1 5714 0.30 MB/sec execute 67 sec latency 11471.041 ms 1 5714 0.29 MB/sec execute 68 sec latency 12471.261 ms 1 5714 0.29 MB/sec execute 69 sec latency 13471.421 ms 1 5714 0.28 MB/sec execute 70 sec latency 14471.692 ms 1 5714 0.28 MB/sec execute 71 sec latency 15471.959 ms 1 5714 0.27 MB/sec execute 72 sec latency 16472.226 ms Failover mds1 to oleg130-server mount facets: mds1 1 5714 0.27 MB/sec execute 73 sec latency 17472.415 ms 1 5714 0.27 MB/sec execute 74 sec latency 18472.633 ms 1 5714 0.26 MB/sec execute 75 sec latency 19472.825 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 5714 0.26 MB/sec execute 76 sec latency 20473.086 ms 1 5714 0.26 MB/sec execute 77 sec latency 21473.312 ms oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1 Started lustre-MDT0000 19:34:21 (1713483261) targets are mounted 19:34:21 (1713483261) facet_failover done 1 5714 0.25 MB/sec execute 78 sec latency 22473.525 ms 1 5714 0.25 MB/sec execute 79 sec latency 23473.687 ms 1 5714 0.25 MB/sec execute 80 sec latency 24473.913 ms 1 5714 0.24 MB/sec execute 81 sec latency 25474.125 ms 1 5714 0.24 MB/sec execute 82 sec latency 26474.293 ms oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 5765 0.24 MB/sec execute 83 sec latency 
27125.939 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 5976 0.25 MB/sec execute 84 sec latency 21.817 ms 1 6161 0.26 MB/sec execute 85 sec latency 24.832 ms 1 6532 0.31 MB/sec execute 86 sec latency 24.008 ms 1 6857 0.33 MB/sec execute 87 sec latency 24.924 ms 1 7120 0.36 MB/sec execute 88 sec latency 51.887 ms 1 7312 0.37 MB/sec execute 89 sec latency 30.545 ms 1 7446 0.37 MB/sec execute 90 sec latency 26.878 ms 1 7572 0.36 MB/sec execute 91 sec latency 25.893 ms 1 7726 0.36 MB/sec execute 92 sec latency 29.651 ms 1 7891 0.36 MB/sec execute 93 sec latency 74.482 ms 1 8042 0.36 MB/sec execute 94 sec latency 30.163 ms 1 8237 0.37 MB/sec execute 95 sec latency 21.946 ms 1 8500 0.40 MB/sec execute 96 sec latency 26.306 ms 1 8668 0.39 MB/sec execute 97 sec latency 47.435 ms 1 8827 0.39 MB/sec execute 98 sec latency 29.595 ms 1 9095 0.39 MB/sec execute 99 sec latency 25.625 ms 1 cleanup 100 sec 0 cleanup 101 sec Operation Count AvgLat MaxLat ---------------------------------------- NTCreateX 1395 51.881 27125.918 Close 1030 2.910 17.122 Rename 53 20.011 29.186 Unlink 257 87.927 20693.456 Qpathinfo 1289 3.815 22.985 Qfileinfo 219 0.696 4.240 Qfsinfo 204 1.378 6.181 Sfileinfo 100 11.144 26.692 Find 471 4.045 18.239 WriteX 662 3.372 23.063 ReadX 2039 0.120 6.014 LockX 4 3.039 3.631 UnlockX 4 3.082 3.692 Flush 92 27.928 607.575 Throughput 0.39052 MB/sec 1 clients 1 procs max_latency=27125.939 ms stopping dbench on /mnt/lustre at Thu Apr 18 19:34:44 EDT 2024 with return code 0 clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished striped dir -i0 -c2 -H all_char /mnt/lustre2/d26.replay-dual/run_dbench looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Thu Apr 18 19:34:45 EDT 2024 waiting for dbench pid 14199 
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs
0 of 1 processes prepared for launch 0 sec
1 of 1 processes prepared for launch 0 sec
releasing clients
1 157 6.54 MB/sec warmup 1 sec latency 37.893 ms
1 305 5.61 MB/sec warmup 2 sec latency 31.483 ms
1 452 5.30 MB/sec warmup 3 sec latency 29.086 ms
1 664 5.29 MB/sec warmup 4 sec latency 28.705 ms
1 811 4.30 MB/sec warmup 5 sec latency 48.405 ms
1 967 3.64 MB/sec warmup 6 sec latency 53.880 ms
1 1170 3.29 MB/sec warmup 7 sec latency 25.199 ms
1 1421 3.25 MB/sec warmup 8 sec latency 25.397 ms
1 1605 2.91 MB/sec warmup 9 sec latency 43.004 ms
1 1805 2.63 MB/sec warmup 10 sec latency 27.437 ms
1 1999 2.41 MB/sec warmup 11 sec latency 38.529 ms
1 2185 2.23 MB/sec warmup 12 sec latency 17.054 ms
1 2402 2.15 MB/sec warmup 13 sec latency 38.530 ms
1 2594 2.07 MB/sec warmup 14 sec latency 30.135 ms
1 2969 2.23 MB/sec warmup 15 sec latency 34.083 ms
1 3277 2.21 MB/sec warmup 16 sec latency 71.388 ms
1 3579 2.26 MB/sec warmup 17 sec latency 20.344 ms
1 3776 2.21 MB/sec warmup 18 sec latency 26.197 ms
1 3907 2.11 MB/sec warmup 19 sec latency 26.706 ms
dbench killed by signal 15
stopping dbench on /mnt/lustre at Thu Apr 18 19:35:05 EDT 2024 with return code 0
14199 pts/0 S+ 0:00 dbench -c client.txt 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100
killed dbench main pid 14199
clean dbench files on /mnt/lustre
/mnt/lustre /mnt/lustre
removed 'client.txt'
/mnt/lustre
dbench successfully finished
PASS 26 (272s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 28: lock replay should be ordered: waiting after granted ========================================================== 19:35:11 (1713483311)
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00535727 s, 765 kB/s
fail_loc=0x80000324
fail_loc=0x32a
Failing ost1 on oleg130-server
Stopping /mnt/lustre-ost1 (opts:) on oleg130-server
19:35:15 (1713483315) shut down
Failover ost1 to oleg130-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all
1+0 records in
1+0 records out
4096 bytes (4.1 kB) copied, 0.00541096 s, 757 kB/s
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1
Started lustre-OST0000
19:35:30 (1713483330) targets are mounted
19:35:30 (1713483330) facet_failover done
oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 28 (27s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 29: replay vs update with the same xid ========================================================== 19:35:40 (1713483340)
SKIP: replay-dual test_29 needs >= 2 clients
SKIP 29 (1s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 30: layout lock replay is not blocked on IO ========================================================== 19:35:43 (1713483343)
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.0136211 s, 3.0 MB/s
10+0 records in
10+0 records out
40960 bytes (41 kB) copied, 0.0149734 s, 2.7 MB/s
fail_loc=0x32e
fail_val=4
Failing mds1 on oleg130-server
Stopping /mnt/lustre-mds1 (opts:) on oleg130-server
19:35:46 (1713483346) shut down
Failover mds1 to oleg130-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1
Started lustre-MDT0000
19:36:00 (1713483360) targets are mounted
19:36:00 (1713483360) facet_failover done
160+0 records in
160+0 records out
81920 bytes (82 kB) copied, 23.1755 s, 3.5 kB/s
oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 30 (28s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 31: deadlock on file_remove_privs and occupied mod rpc slots ========================================================== 19:36:13 (1713483373)
Failing ost1 on oleg130-server
Stopping /mnt/lustre-ost1 (opts:) on oleg130-server
19:36:16 (1713483376) shut down
Creating to objid 3169 on ost lustre-OST0000...
total: 32 open/close in 0.29 seconds: 109.70 ops/second
at_max=0
fail_loc=0x80001420
file /mnt/lustre2/d31.replay-dual/mdtdir/f31.replay-dual is ready
Failover ost1 to oleg130-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1
Started lustre-OST0000
19:36:30 (1713483390) targets are mounted
19:36:30 (1713483390) facet_failover done
oleg130-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL IDLE state after 0 sec
pids: 18832 18833 18838 18839 18840 18841 18842 18843 18844
at_max=600
PASS 31 (23s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 32: gap in update llog shouldn't break recovery ========================================================== 19:36:39 (1713483399)
fail_loc=0x0000131d
fail_val=10
fail_loc=0x726
Stopping /mnt/lustre-mds2 (opts:) on oleg130-server
Stopping /mnt/lustre-mds1 (opts:) on oleg130-server
fail_loc=0
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1
Started lustre-MDT0000
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1
Started lustre-MDT0001
UUID                   1K-blocks        Used   Available Use% Mounted on
lustre-MDT0000_UUID      1414116        2452     1285208   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116        2028     1285660   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116        1548     3605472   1% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116        1524     3605496   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232        3072     7210968   1% /mnt/lustre
PASS 32 (18s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 33: Check for OBD_INCOMPAT_MULTI_RPCS in last_rcvd after abort_recovery ========================================================== 19:36:58 (1713483418)
at_min=60
Stopping /mnt/lustre-mds2 (opts:) on oleg130-server
last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1
Started lustre-MDT0001
oleg130-client.virtnet: executing wait_import_state_mount REPLAY_WAIT mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in REPLAY_WAIT state after 0 sec
oleg130-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
affected facets: mds2
oleg130-server: oleg130-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475
oleg130-server: *.lustre-MDT0001.recovery_status status: COMPLETE
Stopping /mnt/lustre-mds2 (opts:) on oleg130-server
last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg130-server: oleg130-server.virtnet: executing set_default_debug -1 all
pdsh@oleg130-client: oleg130-server: ssh exited with exit code 1
Started lustre-MDT0001
Starting client: oleg130-client.virtnet: -o user_xattr,flock oleg130-server@tcp:/lustre /mnt/lustre2
oleg130-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL DISCONN state after 3 sec
affected facets: mds2
oleg130-server: oleg130-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475
oleg130-server: *.lustre-MDT0001.recovery_status status: COMPLETE
at_min=5
PASS 33 (41s)
debug_raw_pointers=0
debug_raw_pointers=0
== replay-dual test complete, duration 2783 sec ========== 19:37:40 (1713483460)
=== replay-dual: start cleanup 19:37:40 (1713483460) ===
Stopping clients: oleg130-client.virtnet /mnt/lustre2 (opts:)
Stopping client oleg130-client.virtnet /mnt/lustre2 opts:
=== replay-dual: finish cleanup 19:37:42 (1713483462) ===