-----============= acceptance-small: replay-dual ============----- Wed Apr 17 10:29:57 EDT 2024
excepting tests: 14b 21b
skipping tests SLOW=no: 21b
=== replay-dual: start setup 10:30:01 (1713364201) ===
Starting client oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
Started clients oleg110-client.virtnet:
192.168.201.110@tcp:/lustre on /mnt/lustre2 type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project)
oleg110-client.virtnet: executing check_config_client /mnt/lustre
oleg110-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg110-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff8800b60f7000.idle_timeout=debug
osc.lustre-OST0000-osc-ffff88012c498800.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8800b60f7000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff88012c498800.idle_timeout=debug
disable quota as required
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
osd-ldiskfs.track_declares_assert=1
=== replay-dual: finish setup 10:30:11 (1713364211) ===
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 0a: expired recovery with lost client ========================================================== 10:30:12 (1713364212)
Check file is LU482_FAILED=/tmp/replay-dual.lu482.5Ijifc
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1768    1285920    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1612    1286076    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
total: 50 open/close in 0.69 seconds: 72.83 ops/second
fail_loc=0x80000514
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:30:17 (1713364217) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:30:33 (1713364233) targets are mounted
10:30:33 (1713364233) facet_failover done
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
- unlinked 0 (time 1713364312 ; total 0 ; last 0)
total: 50 unlinks in 1 seconds: 50.000000 unlinks/second
PASS 0a (102s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 0b: lost client during waiting for next transno ========================================================== 10:31:56 (1713364316)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1768    1285920    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1612    1286076    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:32:00 (1713364320) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:32:14 (1713364334) targets are mounted
10:32:14 (1713364334) facet_failover done
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
PASS 0b (95s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 1: |X| simple create ================= 10:33:32 (1713364412)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1768    1285920    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1612    1286076    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:33:36 (1713364416) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:33:51 (1713364431) targets are mounted
10:33:51 (1713364431) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 1 (26s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 2: |X| mkdir adir ==================== 10:34:00 (1713364440)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1768    1285920    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1612    1286076    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:34:04 (1713364444) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:34:18 (1713364458) targets are mounted
10:34:18 (1713364458) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 2 (25s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 3: |X| mkdir adir, mkdir adir/bdir === 10:34:27 (1713364467)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1768    1285920    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1612    1286076    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:34:30 (1713364470) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:34:44 (1713364484) targets are mounted
10:34:44 (1713364484) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 3 (24s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 4: |X| mkdir adir (-EEXIST), mkdir adir/bdir ========================================================== 10:34:53 (1713364493)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
mkdir: cannot create directory '/mnt/lustre/adir': File exists
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:35:02 (1713364502) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:35:16 (1713364516) targets are mounted
10:35:16 (1713364516) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 4 (30s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 5: open, unlink |X| close ============ 10:35:25 (1713364525)
multiop /mnt/lustre2/a vo_tSc
TMPPIPE=/tmp/multiop_open_wait_pipe.7504
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:35:28 (1713364528) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:35:43 (1713364543) targets are mounted
10:35:43 (1713364543) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 5 (25s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 6: open1, open2, unlink |X| close1 [fail mds1] close2 ========================================================== 10:35:52 (1713364552)
multiop /mnt/lustre2/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7504
multiop /mnt/lustre/a vo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7504
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:35:56 (1713364556) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:36:10 (1713364570) targets are mounted
10:36:10 (1713364570) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 6 (25s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 8: replay of resent request ========== 10:36:18 (1713364578)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
fail_loc=0x119
fail_loc=0
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:36:38 (1713364598) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:36:52 (1713364612) targets are mounted
10:36:52 (1713364612) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 8 (41s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 9: resending a replayed create ======= 10:37:01 (1713364621)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:37:04 (1713364624) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:37:18 (1713364638) targets are mounted
10:37:18 (1713364638) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 9 (36s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 10: resending a replayed unlink ====== 10:37:38 (1713364658)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
fail_loc=0x80000119
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:37:42 (1713364662) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:37:56 (1713364676) targets are mounted
10:37:56 (1713364676) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
fail_loc=0
PASS 10 (39s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 11: both clients timeout during replay ========================================================== 10:38:18 (1713364698)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
fail_loc=0x0119
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:38:22 (1713364702) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:38:36 (1713364716) targets are mounted
10:38:36 (1713364716) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 17 sec
fail_loc=0
PASS 11 (38s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 12: open resend timeout ============== 10:38:57 (1713364737)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
multiop /mnt/lustre/f12.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7504
fail_loc=0x80000302
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:39:01 (1713364741) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:39:15 (1713364755) targets are mounted
10:39:15 (1713364755) facet_failover done
fail_loc=0
/mnt/lustre/f12.replay-dual
/mnt/lustre/f12.replay-dual has type file OK
PASS 12 (23s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 13: close resend timeout ============= 10:39:21 (1713364761)
multiop /mnt/lustre/f13.replay-dual vmo_c
TMPPIPE=/tmp/multiop_open_wait_pipe.7504
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
fail_loc=0x80000115
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:39:25 (1713364765) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:39:38 (1713364778) targets are mounted
10:39:38 (1713364778) facet_failover done
fail_loc=0
/mnt/lustre/f13.replay-dual
/mnt/lustre/f13.replay-dual has type file OK
PASS 13 (22s)
debug_raw_pointers=0
debug_raw_pointers=0
SKIP: replay-dual test_14b skipping ALWAYS excluded test 14b
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 15a: timeout waiting for lost client during replay, 1 client completes ========================================================== 10:39:45 (1713364785)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
total: 25 open/close in 0.21 seconds: 118.13 ops/second
total: 1 open/close in 0.01 seconds: 103.65 ops/second
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:39:49 (1713364789) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:40:02 (1713364802) targets are mounted
10:40:02 (1713364802) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713364875 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
PASS 15a (92s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 15c: remove multiple OST orphans ===== 10:41:18 (1713364878)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:41:50 (1713364910) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:42:04 (1713364924) targets are mounted
10:42:04 (1713364924) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
PASS 15c (120s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 16: fail MDS during recovery (3571) == 10:43:20 (1713365000)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1912    1285776    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
total: 25 open/close in 0.17 seconds: 150.81 ops/second
total: 1 open/close in 0.01 seconds: 125.27 ops/second
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:43:24 (1713365004) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:43:37 (1713365017) targets are mounted
10:43:37 (1713365017) facet_failover done
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:43:59 (1713365039) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:44:12 (1713365052) targets are mounted
10:44:12 (1713365052) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713365125 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
PASS 16 (127s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 17: fail OST during recovery (3571) == 10:45:28 (1713365128)
total: 25 open/close in 0.18 seconds: 139.01 ops/second
total: 1 open/close in 0.01 seconds: 106.72 ops/second
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1984    1285704    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3048    7210992    1%  /mnt/lustre
Failing ost1 on oleg110-server
Stopping /mnt/lustre-ost1 (opts:) on oleg110-server
10:45:32 (1713365132) shut down
Failover ost1 to oleg110-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-OST0000
10:45:46 (1713365146) targets are mounted
10:45:46 (1713365146) facet_failover done
Failing ost1 on oleg110-server
Stopping /mnt/lustre-ost1 (opts:) on oleg110-server
10:46:07 (1713365167) shut down
Failover ost1 to oleg110-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-OST0000
10:46:21 (1713365181) targets are mounted
10:46:21 (1713365181) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
- unlinked 0 (time 1713365258 ; total 0 ; last 0)
total: 25 unlinks in 0 seconds: inf unlinks/second
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
PASS 17 (132s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 18: ldlm_handle_enqueue succeeds on evicted export (3822) ========================================================== 10:47:42 (1713365262)
debug=+dlmtrace
fail_loc=0x8000030b
using seed 151152081
running for 500 iterations
total: 500 stats in 0 seconds: inf stats/second
ldlm.namespaces.MGC192.168.201.110@tcp.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800a99d8800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800abb74800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800a99d8800.early_lock_cancel=0
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800abb74800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff8800a99d8800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0000-osc-ffff8800abb74800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff8800a99d8800.early_lock_cancel=0
ldlm.namespaces.lustre-OST0001-osc-ffff8800abb74800.early_lock_cancel=0
fail_loc=0x80000305
Error in opening file "/mnt/lustre2/d18.replay-dual/f18.replay-dual"(flags=O_RDONLY) 2: No such file or directory
ldlm.namespaces.MGC192.168.201.110@tcp.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800a99d8800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800abb74800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800a99d8800.early_lock_cancel=1
ldlm.namespaces.lustre-MDT0001-mdc-ffff8800abb74800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff8800a99d8800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0000-osc-ffff8800abb74800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff8800a99d8800.early_lock_cancel=1
ldlm.namespaces.lustre-OST0001-osc-ffff8800abb74800.early_lock_cancel=1
fail_loc=0
fail_loc=0
PASS 18 (46s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 19: resend of open request =========== 10:48:30 (1713365310)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1988    1285700    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1540    3605480    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3064    7210976    1%  /mnt/lustre
fail_loc=0x157
- open/close 0 (time 1713365399.26 total 86.04 last 0.00)
total: 1 open/close in 86.04 seconds: 0.01 ops/second
fail_loc=0
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:50:00 (1713365400) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:50:14 (1713365414) targets are mounted
10:50:14 (1713365414) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 19 (112s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 20: recovery time is not increasing == 10:50:24 (1713365424)
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1988    1285700    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1540    3605480    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3064    7210976    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:50:28 (1713365428) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:50:43 (1713365443) targets are mounted
10:50:43 (1713365443) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1988    1285700    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1796    1285892    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1540    3605480    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3064    7210976    1%  /mnt/lustre
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:53:09 (1713365589) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:53:23 (1713365603) targets are mounted
10:53:23 (1713365603) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
PASS 20 (325s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 21a: commit on sharing =============== 10:55:51 (1713365751)
mdt.lustre-MDT0000.commit_on_sharing=1
mdt.lustre-MDT0001.commit_on_sharing=1
Replay barrier on lustre-MDT0000
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:55:55 (1713365755) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:56:09 (1713365769) targets are mounted
10:56:09 (1713365769) facet_failover done
Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2
mdt.lustre-MDT0000.commit_on_sharing=0
mdt.lustre-MDT0001.commit_on_sharing=0
PASS 21a (163s)
debug_raw_pointers=0
debug_raw_pointers=0
SKIP: replay-dual test_21b skipping SLOW test 21b
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply & fail, c2 mkdir dir1/dir ========================================================== 10:58:36 (1713365916)
fail_loc=0x119
Failing mds2 on oleg110-server
Stopping /mnt/lustre-mds2 (opts:) on oleg110-server
10:58:38 (1713365918) shut down
Failover mds2 to oleg110-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0001
10:58:51 (1713365931) targets are mounted
10:58:51 (1713365931) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  2000    1285688    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1824    1285864    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1540    3605480    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3064    7210976    1%  /mnt/lustre
total: 2 open/close in 0.02 seconds: 88.39 ops/second
total: 2 open/close in 0.01 seconds: 189.97 ops/second
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
10:59:00 (1713365940) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
10:59:14 (1713365954) targets are mounted
10:59:14 (1713365954) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 22a (45s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22b: c1 lfs mkdir -i 1 d1, M1 drop reply & fail M0/M1, c2 mkdir d1/dir ========================================================== 10:59:23 (1713365963)
fail_loc=0x119
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
Failing mds2 on oleg110-server
Stopping /mnt/lustre-mds2 (opts:) on oleg110-server
10:59:27 (1713365967) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Failover mds2 to oleg110-server
mount facets: mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
Started lustre-MDT0001
10:59:53 (1713365993) targets are mounted
10:59:53 (1713365993) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID                 1K-blocks  Used  Available  Use%  Mounted on
lustre-MDT0000_UUID    1414116  1996    1285692    1%  /mnt/lustre[MDT:0]
lustre-MDT0001_UUID    1414116  1820    1285868    1%  /mnt/lustre[MDT:1]
lustre-OST0000_UUID    3833116  1540    3605480    1%  /mnt/lustre[OST:0]
lustre-OST0001_UUID    3833116  1524    3605496    1%  /mnt/lustre[OST:1]
filesystem_summary:    7666232  3064    7210976    1%  /mnt/lustre
total: 2 open/close in 0.03 seconds: 79.80 ops/second
total: 2 open/close in 0.01 seconds: 187.34 ops/second
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
11:00:03 (1713366003) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
11:00:19 (1713366019) targets are mounted
11:00:19 (1713366019) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 22b (63s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== replay-dual test 22c: c1 lfs mkdir -i 1 d1, M1 drop update & fail M1, c2 mkdir d1/dir ========================================================== 11:00:28 (1713366028)
fail_loc=0x1701
fail_loc=0
Failing mds1 on oleg110-server
Stopping /mnt/lustre-mds1 (opts:) on oleg110-server
11:00:32 (1713366032) shut down
Failover mds1 to oleg110-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all
pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1
Started lustre-MDT0000
11:00:45 (1713366045) targets are mounted
11:00:45 (1713366045) facet_failover done
oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
UUID
1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1996 1285692 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1820 1285868 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.03 seconds: 62.80 ops/second total: 2 open/close in 0.01 seconds: 159.29 ops/second Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:00:55 (1713366055) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:01:10 (1713366070) targets are mounted 11:01:10 (1713366070) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22c (50s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 22d: c1 lfs mkdir -i 1 d1, M1 drop update & fail M0/M1,c2 mkdir d1/dir ========================================================== 11:01:20 (1713366080) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server Failing mds2 on oleg110-server Stopping /mnt/lustre-mds2 (opts:) on oleg110-server 11:01:32 (1713366092) shut down Failover mds1 to oleg110-server mount facets: mds1 Failover mds2 to oleg110-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all oleg110-server: oleg110-server.virtnet: executing set_default_debug 
-1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 11:01:54 (1713366114) targets are mounted 11:01:54 (1713366114) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1960 1285728 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1784 1285904 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.04 seconds: 45.21 ops/second total: 2 open/close in 0.02 seconds: 99.86 ops/second Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:02:04 (1713366124) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:02:18 (1713366138) targets are mounted 11:02:18 (1713366138) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 22d (65s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23a: c1 rmdir d1, M1 drop reply and fail, client2 mkdir d1 ========================================================== 11:02:27 (1713366147) fail_loc=0x119 fail_loc=0 Failing mds2 on oleg110-server 
Stopping /mnt/lustre-mds2 (opts:) on oleg110-server 11:02:35 (1713366155) shut down Failover mds2 to oleg110-server mount facets: mds2 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 11:02:49 (1713366169) targets are mounted 11:02:49 (1713366169) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.01 seconds: 145.81 ops/second Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:02:58 (1713366178) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:03:12 (1713366192) targets are mounted 11:03:12 (1713366192) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23a (53s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23b: c1 rmdir d1, M1 drop reply and fail M0/M1, c2 mkdir d1 ========================================================== 11:03:22 (1713366202) fail_loc=0x119 fail_loc=0 Failing mds1 
on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server Failing mds2 on oleg110-server Stopping /mnt/lustre-mds2 (opts:) on oleg110-server 11:03:32 (1713366212) shut down Failover mds1 to oleg110-server mount facets: mds1 Failover mds2 to oleg110-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 Started lustre-MDT0001 11:03:53 (1713366233) targets are mounted 11:03:53 (1713366233) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.02 seconds: 113.66 ops/second Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:04:15 (1713366255) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:04:28 (1713366268) targets are mounted 11:04:28 (1713366268) 
facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23b (73s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23c: c1 rmdir d1, M0 drop update reply and fail M0, c2 mkdir d1 ========================================================== 11:04:37 (1713366277) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:04:40 (1713366280) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:04:53 (1713366293) targets are mounted 11:04:53 (1713366293) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1992 1285696 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1816 1285872 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% /mnt/lustre total: 2 open/close in 0.01 seconds: 139.33 ops/second Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:05:02 (1713366302) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:05:16 (1713366316) targets are mounted 
11:05:16 (1713366316) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23c (46s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 23d: c1 rmdir d1, M0 drop update reply and fail M0/M1, c2 mkdir d1 ========================================================== 11:05:25 (1713366325) fail_loc=0x1701 fail_loc=0 Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server Failing mds2 on oleg110-server Stopping /mnt/lustre-mds2 (opts:) on oleg110-server 11:05:36 (1713366336) shut down Failover mds1 to oleg110-server mount facets: mds1 Failover mds2 to oleg110-server mount facets: mds2 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 Started lustre-MDT0000 11:05:57 (1713366357) targets are mounted 11:05:57 (1713366357) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 1956 1285732 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 1780 1285908 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1540 3605480 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605496 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3064 7210976 1% 
/mnt/lustre total: 2 open/close in 0.01 seconds: 143.96 ops/second Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:06:06 (1713366366) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:06:20 (1713366380) targets are mounted 11:06:20 (1713366380) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 23d (62s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 24: reconstruct on non-existing object ========================================================== 11:06:28 (1713366388) fail_loc=0x119 fail_loc=0 truncate: cannot truncate '/mnt/lustre/f24.replay-dual' to length 100: No such file or directory PASS 24 (87s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 25: replay|resend ==================== 11:07:57 (1713366477) 1+0 records in 1+0 records out 512 bytes (512 B) copied, 0.00320429 s, 160 kB/s fail_loc=0x304 fail_loc=0x80000325 Failing ost1 on oleg110-server Stopping /mnt/lustre-ost1 (opts:) on oleg110-server 11:07:59 (1713366479) shut down Failover ost1 to oleg110-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-OST0000 11:08:12 (1713366492) targets are mounted 11:08:12 (1713366492) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) 
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec /home/green/git/lustre-release/lustre/tests/test-framework.sh: line 6528: 4686 Terminated LUSTRE="/home/green/git/lustre-release/lustre" bash -c "multiop /mnt/lustre2/f25.replay-dual Ow512" fail_loc=0 PASS 25 (20s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 26: dbench and tar with mds failover ========================================================== 11:08:19 (1713366499) Starting client oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre Started clients oleg110-client.virtnet: 192.168.201.110@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt,statfs_project) Started tar loop with pid 6307 Started dbench loop with 6308 striped dir -i0 -c2 -H all_char /mnt/lustre2/d26.replay-dual/run_dbench striped dir -i0 -c2 -H fnv_1a_64 /mnt/lustre/d26.replay-dual/run_tar looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Wed Apr 17 11:08:20 EDT 2024 waiting for dbench pid 6349 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs failed to create barrier semaphore 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients 1 221 8.56 MB/sec warmup 1 sec latency 25.363 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2288 1285400 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2108 1285580 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1544 3605476 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 26100 3555916 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 
27644 7161392 1% /mnt/lustre 1 485 8.47 MB/sec warmup 2 sec latency 24.162 ms 1 730 7.10 MB/sec warmup 3 sec latency 337.869 ms 1 1058 5.72 MB/sec warmup 4 sec latency 14.019 ms test_26 fail mds1 1 times 1 1426 5.20 MB/sec warmup 5 sec latency 16.098 ms Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 1 1701 4.37 MB/sec warmup 6 sec latency 119.344 ms 11:08:27 (1713366507) shut down 1 1701 3.75 MB/sec warmup 7 sec latency 1119.528 ms 1 1701 3.28 MB/sec warmup 8 sec latency 2119.763 ms 1 1701 2.91 MB/sec warmup 9 sec latency 3120.105 ms 1 1701 2.62 MB/sec warmup 10 sec latency 4120.376 ms 1 1701 2.38 MB/sec warmup 11 sec latency 5120.582 ms 1 1701 2.19 MB/sec warmup 12 sec latency 6120.803 ms 1 1701 2.02 MB/sec warmup 13 sec latency 7121.065 ms 1 1701 1.87 MB/sec warmup 14 sec latency 8121.432 ms 1 1701 1.75 MB/sec warmup 15 sec latency 9121.697 ms 1 1701 1.64 MB/sec warmup 16 sec latency 10121.970 ms Failover mds1 to oleg110-server mount facets: mds1 1 1701 1.54 MB/sec warmup 17 sec latency 11122.217 ms 1 1701 1.46 MB/sec warmup 18 sec latency 12122.374 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 1701 1.38 MB/sec warmup 19 sec latency 13122.596 ms oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:08:41 (1713366521) targets are mounted 11:08:41 (1713366521) facet_failover done 1 1701 0.00 MB/sec execute 1 sec latency 15122.813 ms 1 1701 0.00 MB/sec execute 2 sec latency 16122.967 ms 1 1701 0.00 MB/sec execute 3 sec latency 17123.257 ms 1 1701 0.00 MB/sec execute 4 sec latency 18123.461 ms 1 1771 0.01 MB/sec execute 5 sec latency 18986.397 ms oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 2093 0.07 MB/sec execute 6 sec latency 13.840 ms 1 2347 0.24 MB/sec 
execute 7 sec latency 47.878 ms 1 2594 0.35 MB/sec execute 8 sec latency 17.953 ms 1 3197 0.90 MB/sec execute 9 sec latency 18.945 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2944 1284744 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2588 1285100 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 5232 3592036 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 32528 3566164 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 37760 7158200 1% /mnt/lustre 1 3669 1.25 MB/sec execute 10 sec latency 13.905 ms 1 3889 1.23 MB/sec execute 11 sec latency 21.952 ms 1 4093 1.16 MB/sec execute 12 sec latency 16.039 ms 1 4306 1.09 MB/sec execute 13 sec latency 23.135 ms test_26 fail mds2 2 times Failing mds2 on oleg110-server Stopping /mnt/lustre-mds2 (opts:) on oleg110-server 1 4591 1.11 MB/sec execute 14 sec latency 20.662 ms 11:08:55 (1713366535) shut down 1 4632 1.05 MB/sec execute 15 sec latency 884.514 ms 1 4632 0.99 MB/sec execute 16 sec latency 1884.665 ms 1 4632 0.93 MB/sec execute 17 sec latency 2884.853 ms 1 4632 0.88 MB/sec execute 18 sec latency 3885.014 ms 1 4632 0.83 MB/sec execute 19 sec latency 4885.203 ms 1 4632 0.79 MB/sec execute 20 sec latency 5885.447 ms 1 4632 0.75 MB/sec execute 21 sec latency 6885.663 ms 1 4632 0.72 MB/sec execute 22 sec latency 7885.942 ms 1 4632 0.69 MB/sec execute 23 sec latency 8886.135 ms 1 4632 0.66 MB/sec execute 24 sec latency 9886.284 ms Failover mds2 to oleg110-server mount facets: mds2 1 4632 0.63 MB/sec execute 25 sec latency 10886.479 ms 1 4632 0.61 MB/sec execute 26 sec latency 11886.688 ms 1 4632 0.58 MB/sec execute 27 sec latency 12886.846 ms 1 4632 0.56 MB/sec execute 28 sec latency 13887.034 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 4632 0.54 MB/sec execute 29 sec latency 14887.274 ms oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all 1 4632 0.53 MB/sec execute 30 sec latency 15887.470 ms pdsh@oleg110-client: oleg110-server: ssh 
exited with exit code 1 Started lustre-MDT0001 11:09:11 (1713366551) targets are mounted 11:09:11 (1713366551) facet_failover done 1 4632 0.51 MB/sec execute 31 sec latency 16887.684 ms 1 4632 0.49 MB/sec execute 32 sec latency 17887.889 ms 1 4632 0.48 MB/sec execute 33 sec latency 18888.112 ms 1 4635 0.46 MB/sec execute 34 sec latency 19872.120 ms oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 4862 0.47 MB/sec execute 35 sec latency 21.102 ms 1 5098 0.53 MB/sec execute 36 sec latency 28.483 ms 1 5252 0.52 MB/sec execute 37 sec latency 28.594 ms 1 5413 0.51 MB/sec execute 38 sec latency 26.177 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 3932 1283756 1% /mnt/lustre[MDT:0] 1 5627 0.50 MB/sec execute 39 sec latency 35.673 ms lustre-MDT0001_UUID 1414116 2424 1285264 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 11344 3576928 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 38888 3556696 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 50232 7133624 1% /mnt/lustre 1 5905 0.52 MB/sec execute 40 sec latency 19.510 ms 1 5983 0.51 MB/sec execute 41 sec latency 554.574 ms 1 6167 0.52 MB/sec execute 42 sec latency 28.472 ms 1 6548 0.61 MB/sec execute 43 sec latency 25.659 ms test_26 fail mds1 3 times Failing mds1 on oleg110-server 1 6939 0.64 MB/sec execute 44 sec latency 26.730 ms Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 1 7137 0.70 MB/sec execute 45 sec latency 541.913 ms 1 7137 0.68 MB/sec execute 46 sec latency 1542.120 ms 1 7137 0.67 MB/sec execute 47 sec latency 2542.296 ms 1 7137 0.65 MB/sec execute 48 sec latency 3542.510 ms 1 7137 0.64 MB/sec execute 49 sec latency 4542.637 ms 1 7137 0.63 MB/sec execute 50 sec latency 5542.813 ms 1 7137 0.61 MB/sec execute 51 sec latency 6543.007 ms 11:09:32 (1713366572) shut down 1 7137 0.60 MB/sec execute 52 sec latency 7543.143 ms 1 7137 0.59 MB/sec 
execute 53 sec latency 8543.313 ms 1 7137 0.58 MB/sec execute 54 sec latency 9543.458 ms 1 7137 0.57 MB/sec execute 55 sec latency 10543.750 ms 1 7137 0.56 MB/sec execute 56 sec latency 11544.058 ms 1 7137 0.55 MB/sec execute 57 sec latency 12544.346 ms 1 7137 0.54 MB/sec execute 58 sec latency 13544.594 ms 1 7137 0.53 MB/sec execute 59 sec latency 14544.840 ms 1 7137 0.52 MB/sec execute 60 sec latency 15545.067 ms 1 7137 0.51 MB/sec execute 61 sec latency 16545.426 ms Failover mds1 to oleg110-server mount facets: mds1 1 7137 0.51 MB/sec execute 62 sec latency 17545.689 ms 1 7137 0.50 MB/sec execute 63 sec latency 18545.899 ms 1 7137 0.49 MB/sec execute 64 sec latency 19546.186 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 7137 0.48 MB/sec execute 65 sec latency 20546.431 ms oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all 1 7137 0.47 MB/sec execute 66 sec latency 21546.663 ms pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:09:47 (1713366587) targets are mounted 11:09:47 (1713366587) facet_failover done 1 7137 0.47 MB/sec execute 67 sec latency 22546.957 ms 1 7137 0.46 MB/sec execute 68 sec latency 23547.237 ms 1 7137 0.45 MB/sec execute 69 sec latency 24547.603 ms 1 7137 0.45 MB/sec execute 70 sec latency 25547.835 ms 1 7137 0.44 MB/sec execute 71 sec latency 26549.560 ms oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 7240 0.44 MB/sec execute 72 sec latency 27065.114 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 7471 0.45 MB/sec execute 73 sec latency 16.506 ms 1 7679 0.45 MB/sec execute 74 sec latency 18.713 ms 1 7985 0.45 MB/sec execute 75 sec latency 18.337 ms 1 8276 0.46 MB/sec execute 76 sec latency 15.147 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 4312 1283376 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2364 1285324 1% 
/mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 22316 3582532 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 43448 3561572 2% /mnt/lustre[OST:1] filesystem_summary: 7666232 65764 7144104 1% /mnt/lustre 1 8617 0.49 MB/sec execute 77 sec latency 25.083 ms 1 8799 0.49 MB/sec execute 78 sec latency 386.223 ms 1 9054 0.49 MB/sec execute 79 sec latency 14.877 ms 1 9364 0.48 MB/sec execute 80 sec latency 13.867 ms test_26 fail mds2 4 times Failing mds2 on oleg110-server Stopping /mnt/lustre-mds2 (opts:) on oleg110-server 1 9688 0.51 MB/sec execute 81 sec latency 14.392 ms 11:10:02 (1713366602) shut down 1 10308 0.56 MB/sec execute 82 sec latency 14.133 ms 1 10644 0.61 MB/sec execute 83 sec latency 16.797 ms 1 10895 0.62 MB/sec execute 84 sec latency 23.611 ms 1 11045 0.61 MB/sec execute 85 sec latency 20.293 ms 1 11173 0.61 MB/sec execute 86 sec latency 439.486 ms 1 11173 0.60 MB/sec execute 87 sec latency 1439.822 ms 1 11173 0.59 MB/sec execute 88 sec latency 2440.089 ms 1 11173 0.59 MB/sec execute 89 sec latency 3440.308 ms 1 11173 0.58 MB/sec execute 90 sec latency 4440.508 ms 1 11173 0.57 MB/sec execute 91 sec latency 5440.752 ms Failover mds2 to oleg110-server mount facets: mds2 1 11173 0.57 MB/sec execute 92 sec latency 6440.890 ms 1 11173 0.56 MB/sec execute 93 sec latency 7441.006 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 11173 0.55 MB/sec execute 94 sec latency 8441.106 ms oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all 1 11173 0.55 MB/sec execute 95 sec latency 9441.247 ms pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 11:10:16 (1713366616) targets are mounted 11:10:16 (1713366616) facet_failover done 1 11173 0.54 MB/sec execute 96 sec latency 10441.396 ms 1 11173 0.54 MB/sec execute 97 sec latency 11441.626 ms 1 11173 0.53 MB/sec execute 98 sec latency 12441.821 ms 1 11173 0.53 MB/sec execute 99 sec latency 13442.039 ms oleg110-client.virtnet: executing 
wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
1 cleanup 100 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
0 cleanup 100 sec

 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX       1653    26.661 27065.098
 Close           1192     2.106    14.865
 Rename            67    14.083    28.577
 Unlink           350     5.275    21.736
 Qpathinfo       1541    24.904 19872.105
 Qfileinfo        259     0.548     3.446
 Qfsinfo          278     0.999     6.263
 Sfileinfo        118     7.631    14.091
 Find             589    35.197 18986.385
 WriteX           796     2.584    19.607
 ReadX           2650     0.119    31.556
 LockX              6     2.108     3.219
 UnlockX            6     2.134     3.167
 Flush            102    18.253   554.554

Throughput 0.526174 MB/sec 1 clients 1 procs max_latency=27065.114 ms
stopping dbench on /mnt/lustre at Wed Apr 17 11:10:21 EDT 2024 with return code 0
clean dbench files on /mnt/lustre
/mnt/lustre /mnt/lustre
removed 'client.txt'
/mnt/lustre
dbench successfully finished
striped dir -i0 -c2 -H crush /mnt/lustre2/d26.replay-dual/run_dbench
looking for dbench program
/usr/bin/dbench
found dbench client file /usr/share/dbench/client.txt
'/usr/share/dbench/client.txt' -> 'client.txt'
running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Wed Apr 17 11:10:22 EDT 2024
waiting for dbench pid 10777
dbench version 4.00 - Copyright Andrew Tridgell 1999-2004
Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs
0 of 1 processes prepared for launch 0 sec
1 of 1 processes prepared for launch 0 sec
releasing clients
1 255 9.66 MB/sec warmup 1 sec latency 21.703 ms
UUID                   1K-blocks  Used  Available Use% Mounted on
lustre-MDT0000_UUID      1414116  5020    1282668   1% /mnt/lustre[MDT:0]
lustre-MDT0001_UUID      1414116  3080    1284608   1% /mnt/lustre[MDT:1]
lustre-OST0000_UUID      3833116 36188    3539732   2% /mnt/lustre[OST:0]
lustre-OST0001_UUID      3833116 13568    3585176   1% /mnt/lustre[OST:1]
filesystem_summary:      7666232 49756    7124908   1% /mnt/lustre
1 438 7.69 MB/sec warmup 2 sec latency 27.927 ms
1 735 7.10 MB/sec warmup 3 sec latency 24.814 ms
1 878 5.42 MB/sec warmup
4 sec latency 412.339 ms 1 1159 4.60 MB/sec warmup 5 sec latency 18.717 ms test_26 fail mds1 5 times 1 1462 4.34 MB/sec warmup 6 sec latency 25.537 ms Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 1 1654 3.74 MB/sec warmup 7 sec latency 339.201 ms 11:10:30 (1713366630) shut down 1 1654 3.28 MB/sec warmup 8 sec latency 1339.422 ms 1 1654 2.91 MB/sec warmup 9 sec latency 2339.670 ms 1 1654 2.62 MB/sec warmup 10 sec latency 3339.922 ms 1 1654 2.38 MB/sec warmup 11 sec latency 4340.235 ms 1 1654 2.18 MB/sec warmup 12 sec latency 5340.500 ms 1 1654 2.02 MB/sec warmup 13 sec latency 6340.722 ms 1 1654 1.87 MB/sec warmup 14 sec latency 7340.961 ms 1 1654 1.75 MB/sec warmup 15 sec latency 8341.238 ms 1 1654 1.64 MB/sec warmup 16 sec latency 9341.546 ms 1 1654 1.54 MB/sec warmup 17 sec latency 10341.842 ms Failover mds1 to oleg110-server mount facets: mds1 1 1654 1.46 MB/sec warmup 18 sec latency 11342.037 ms 1 1654 1.38 MB/sec warmup 19 sec latency 12342.255 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 1 1654 0.00 MB/sec execute 1 sec latency 14342.514 ms Started lustre-MDT0000 11:10:44 (1713366644) targets are mounted 11:10:44 (1713366644) facet_failover done 1 1654 0.00 MB/sec execute 2 sec latency 15342.645 ms 1 1654 0.00 MB/sec execute 3 sec latency 16342.927 ms 1 1654 0.00 MB/sec execute 4 sec latency 17343.215 ms 1 1654 0.00 MB/sec execute 5 sec latency 18343.391 ms 1 1689 0.00 MB/sec execute 6 sec latency 19201.542 ms oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 2016 0.05 MB/sec execute 7 sec latency 15.904 ms 1 2394 0.22 MB/sec execute 8 sec latency 13.203 ms 1 2881 0.67 MB/sec execute 9 sec latency 16.324 ms 1 3365 0.92 MB/sec 
execute 10 sec latency 18.928 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 5440 1282248 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 3488 1284200 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 37612 3551576 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 18848 3571600 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 56460 7123176 1% /mnt/lustre 1 3783 1.23 MB/sec execute 11 sec latency 25.521 ms 1 3987 1.15 MB/sec execute 12 sec latency 19.634 ms 1 4282 1.09 MB/sec execute 13 sec latency 20.068 ms 1 4550 1.06 MB/sec execute 14 sec latency 18.812 ms test_26 fail mds2 6 times Failing mds2 on oleg110-server Stopping /mnt/lustre-mds2 (opts:) on oleg110-server 1 4957 1.26 MB/sec execute 15 sec latency 19.062 ms 1 4999 1.18 MB/sec execute 16 sec latency 808.975 ms 1 4999 1.11 MB/sec execute 17 sec latency 1809.096 ms 1 4999 1.05 MB/sec execute 18 sec latency 2809.185 ms 1 4999 1.00 MB/sec execute 19 sec latency 3809.305 ms 1 4999 0.95 MB/sec execute 20 sec latency 4809.479 ms 11:11:03 (1713366663) shut down 1 4999 0.90 MB/sec execute 21 sec latency 5809.610 ms 1 4999 0.86 MB/sec execute 22 sec latency 6809.699 ms 1 4999 0.82 MB/sec execute 23 sec latency 7809.805 ms 1 4999 0.79 MB/sec execute 24 sec latency 8809.919 ms 1 4999 0.76 MB/sec execute 25 sec latency 9810.028 ms 1 4999 0.73 MB/sec execute 26 sec latency 10810.153 ms 1 4999 0.70 MB/sec execute 27 sec latency 11810.257 ms 1 4999 0.68 MB/sec execute 28 sec latency 12810.350 ms 1 4999 0.65 MB/sec execute 29 sec latency 13810.527 ms 1 4999 0.63 MB/sec execute 30 sec latency 14810.710 ms Failover mds2 to oleg110-server mount facets: mds2 1 4999 0.61 MB/sec execute 31 sec latency 15810.839 ms 1 4999 0.59 MB/sec execute 32 sec latency 16810.981 ms Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 1 4999 0.57 MB/sec execute 33 sec latency 17811.144 ms oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all 1 4999 0.56 MB/sec execute 34 sec 
latency 18811.315 ms pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 11:11:17 (1713366677) targets are mounted 11:11:17 (1713366677) facet_failover done 1 4999 0.54 MB/sec execute 35 sec latency 19811.486 ms 1 4999 0.53 MB/sec execute 36 sec latency 20811.704 ms 1 4999 0.51 MB/sec execute 37 sec latency 21811.880 ms 1 4999 0.50 MB/sec execute 38 sec latency 22812.073 ms oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid 1 5089 0.49 MB/sec execute 39 sec latency 23496.472 ms mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec 1 5385 0.48 MB/sec execute 40 sec latency 15.696 ms 1 5720 0.48 MB/sec execute 41 sec latency 26.094 ms 1 6033 0.51 MB/sec execute 42 sec latency 42.133 ms 1 6468 0.61 MB/sec execute 43 sec latency 18.316 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2924 1284764 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2600 1285088 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 40124 3564852 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 21236 3585036 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 61360 7149888 1% /mnt/lustre 1 6839 0.64 MB/sec execute 44 sec latency 62.400 ms 1 7197 0.70 MB/sec execute 45 sec latency 15.353 ms 1 7426 0.71 MB/sec execute 46 sec latency 18.701 ms 1 7593 0.70 MB/sec execute 47 sec latency 21.285 ms test_26 fail mds1 7 times Failing mds1 on oleg110-server striped dir -i0 -c2 -H crush /mnt/lustre/d26.replay-dual/run_tar Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 1 7844 0.69 MB/sec execute 48 sec latency 65.383 ms 11:11:31 (1713366691) shut down 1 7844 0.68 MB/sec execute 49 sec latency 1065.579 ms 1 7844 0.67 MB/sec execute 50 sec latency 2065.802 ms 1 7844 0.65 MB/sec execute 51 sec latency 3066.055 ms 1 7844 0.64 MB/sec execute 52 sec latency 4066.282 ms 1 7844 0.63 MB/sec execute 53 sec latency 5066.497 ms 1 7844 0.62 MB/sec execute 54 sec latency 6066.762 
ms 1 7844 0.61 MB/sec execute 55 sec latency 7067.008 ms 1 7844 0.60 MB/sec execute 56 sec latency 8067.179 ms 1 7844 0.58 MB/sec execute 57 sec latency 9067.353 ms 1 7844 0.57 MB/sec execute 58 sec latency 10067.548 ms Failover mds1 to oleg110-server mount facets: mds1 1 7844 0.57 MB/sec execute 59 sec latency 11067.868 ms 1 7844 0.56 MB/sec execute 60 sec latency 12068.033 ms Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 1 7844 0.55 MB/sec execute 61 sec latency 13068.213 ms oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all 1 7844 0.54 MB/sec execute 62 sec latency 14068.354 ms pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:11:45 (1713366705) targets are mounted 11:11:45 (1713366705) facet_failover done 1 7844 0.53 MB/sec execute 63 sec latency 15068.495 ms 1 7844 0.52 MB/sec execute 64 sec latency 16068.684 ms 1 7844 0.51 MB/sec execute 65 sec latency 17068.823 ms 1 7844 0.51 MB/sec execute 66 sec latency 18068.974 ms oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 1 7928 0.50 MB/sec execute 67 sec latency 18739.253 ms mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 8196 0.51 MB/sec execute 68 sec latency 18.339 ms 1 8566 0.55 MB/sec execute 69 sec latency 18.161 ms 1 8858 0.55 MB/sec execute 70 sec latency 14.474 ms tar: Unexpected EOF in archive tar: Unexpected EOF in archive tar: Error is not recoverable: exiting now dbench killed by signal 15 stopping dbench on /mnt/lustre at Wed Apr 17 11:11:53 EDT 2024 with return code 0 10777 pts/0 S+ 0:00 dbench -c client.txt 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100 killed dbench main pid 10777 clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished PASS 26 (218s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 28: 
lock replay should be ordered: waiting after granted ========================================================== 11:11:59 (1713366719) 1+0 records in 1+0 records out 4096 bytes (4.1 kB) copied, 0.00259779 s, 1.6 MB/s fail_loc=0x80000324 fail_loc=0x32a Failing ost1 on oleg110-server Stopping /mnt/lustre-ost1 (opts:) on oleg110-server 11:12:02 (1713366722) shut down Failover ost1 to oleg110-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-OST0000 11:12:16 (1713366736) targets are mounted 11:12:16 (1713366736) facet_failover done 1+0 records in 1+0 records out 4096 bytes (4.1 kB) copied, 0.0028271 s, 1.4 MB/s oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec PASS 28 (23s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 29: replay vs update with the same xid ========================================================== 11:12:23 (1713366743) SKIP: replay-dual test_29 needs >= 2 clients SKIP 29 (1s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 30: layout lock replay is not blocked on IO ========================================================== 11:12:26 (1713366746) 10+0 records in 10+0 records out 40960 bytes (41 kB) copied, 0.00855599 s, 4.8 MB/s 10+0 records in 10+0 records out 40960 bytes (41 kB) copied, 0.00895462 s, 4.6 MB/s fail_loc=0x32e fail_val=4 Failing mds1 on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server 11:12:28 (1713366748) shut down Failover mds1 to oleg110-server mount facets: mds1 Starting mds1: -o localrecov /dev/mapper/mds1_flakey 
/mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 11:12:41 (1713366761) targets are mounted 11:12:41 (1713366761) facet_failover done 160+0 records in 160+0 records out 81920 bytes (82 kB) copied, 20.6275 s, 4.0 kB/s oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 30 (24s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 31: deadlock on file_remove_privs and occupied mod rpc slots ========================================================== 11:12:52 (1713366772) Failing ost1 on oleg110-server Stopping /mnt/lustre-ost1 (opts:) on oleg110-server 11:12:54 (1713366774) shut down Creating to objid 3137 on ost lustre-OST0000... total: 32 open/close in 0.16 seconds: 200.54 ops/second at_max=0 fail_loc=0x80001420 file /mnt/lustre2/d31.replay-dual/mdtdir/f31.replay-dual is ready Failover ost1 to oleg110-server mount facets: ost1 Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1 seq.cli-lustre-OST0000-super.width=65536 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-OST0000 11:13:07 (1713366787) targets are mounted 11:13:07 (1713366787) facet_failover done oleg110-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL IDLE state after 0 sec pids: 18732 18733 18738 18739 18740 18741 18742 18743 18744 at_max=600 PASS 31 (19s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 32: gap in update llog shouldn't break recovery 
========================================================== 11:13:13 (1713366793) fail_loc=0x0000131d fail_val=10 fail_loc=0x726 Stopping /mnt/lustre-mds2 (opts:) on oleg110-server Stopping /mnt/lustre-mds1 (opts:) on oleg110-server fail_loc=0 Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0000 Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 1414116 2484 1285204 1% /mnt/lustre[MDT:0] lustre-MDT0001_UUID 1414116 2056 1285632 1% /mnt/lustre[MDT:1] lustre-OST0000_UUID 3833116 1548 3605472 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3833116 1524 3605492 1% /mnt/lustre[OST:1] filesystem_summary: 7666232 3072 7210964 1% /mnt/lustre PASS 32 (13s) debug_raw_pointers=0 debug_raw_pointers=0 debug_raw_pointers=Y debug_raw_pointers=Y == replay-dual test 33: Check for OBD_INCOMPAT_MULTI_RPCS in last_rcvd after abort_recovery ========================================================== 11:13:28 (1713366808) at_min=60 Stopping /mnt/lustre-mds2 (opts:) on oleg110-server last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 oleg110-client.virtnet: executing wait_import_state_mount REPLAY_WAIT mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in REPLAY_WAIT state after 0 sec oleg110-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid 
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec affected facets: mds2 oleg110-server: oleg110-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475 oleg110-server: *.lustre-MDT0001.recovery_status status: COMPLETE Stopping /mnt/lustre-mds2 (opts:) on oleg110-server last_rcvd in /dev/mapper/mds2_flakey: incompat = 0x61c Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2 oleg110-server: oleg110-server.virtnet: executing set_default_debug -1 all pdsh@oleg110-client: oleg110-server: ssh exited with exit code 1 Started lustre-MDT0001 Starting client: oleg110-client.virtnet: -o user_xattr,flock oleg110-server@tcp:/lustre /mnt/lustre2 oleg110-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL DISCONN state after 4 sec affected facets: mds2 oleg110-server: oleg110-server.virtnet: executing _wait_recovery_complete *.lustre-MDT0001.recovery_status 1475 oleg110-server: *.lustre-MDT0001.recovery_status status: COMPLETE at_min=5 PASS 33 (35s) debug_raw_pointers=0 debug_raw_pointers=0 == replay-dual test complete, duration 2645 sec ========== 11:14:03 (1713366843) === replay-dual: start cleanup 11:14:04 (1713366844) === Stopping clients: oleg110-client.virtnet /mnt/lustre2 (opts:) Stopping client oleg110-client.virtnet /mnt/lustre2 opts: === replay-dual: finish cleanup 11:14:05 (1713366845) ===