-----============= acceptance-small: replay-dual ============----- Wed Apr 17 04:43:32 EDT 2024 excepting tests: 14b 21b 21b skipping tests SLOW=no: 21b Starting client oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 Started clients oleg150-client.virtnet: 192.168.201.150@tcp:/lustre on /mnt/lustre2 type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt) oleg150-client.virtnet: executing check_config_client /mnt/lustre oleg150-client.virtnet: Checking config lustre mounted on /mnt/lustre Checking servers environments Checking clients oleg150-client.virtnet environments Using TIMEOUT=20 osc.lustre-OST0000-osc-ffff8800aa8e4000.idle_timeout=debug osc.lustre-OST0000-osc-ffff8800b636e800.idle_timeout=debug osc.lustre-OST0001-osc-ffff8800aa8e4000.idle_timeout=debug osc.lustre-OST0001-osc-ffff8800b636e800.idle_timeout=debug disable quota as required oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 == replay-dual test 0a: expired recovery with lost client ========================================================== 04:43:41 (1713343421) Check file is LU482_FAILED=/tmp/replay-dual.lu482.btlbrx UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre total: 50 open/close in 0.30 seconds: 168.57 ops/second fail_loc=0x80000514 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 Starting client: oleg150-client.virtnet: -o 
user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 - unlinked 0 (time 1713343525 ; total 0 ; last 0) total: 50 unlinks in 0 seconds: inf unlinks/second PASS 0a (106s) == replay-dual test 0b: lost client during waiting for next transno ========================================================== 04:45:27 (1713343527) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 PASS 0b (120s) == replay-dual test 1: |X| simple create ================= 04:47:27 (1713343647) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing 
wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 1 (22s) == replay-dual test 2: |X| mkdir adir ==================== 04:47:49 (1713343669) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 2 (23s) == replay-dual test 3: |X| mkdir adir, mkdir adir/bdir === 04:48:12 (1713343692) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid 
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 3 (22s) == replay-dual test 4: |X| mkdir adir (-EEXIST), mkdir adir/bdir ========================================================== 04:48:34 (1713343714) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre mkdir: cannot create directory '/mnt/lustre/adir': File exists Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 4 (24s) == replay-dual test 5: open, unlink |X| close ============ 04:48:58 (1713343738) multiop /mnt/lustre2/a vo_tSc TMPPIPE=/tmp/multiop_open_wait_pipe.6843 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 
oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 5 (22s) == replay-dual test 6: open1, open2, unlink |X| close1 [fail mds1] close2 ========================================================== 04:49:20 (1713343760) multiop /mnt/lustre2/a vo_c TMPPIPE=/tmp/multiop_open_wait_pipe.6843 multiop /mnt/lustre/a vo_c TMPPIPE=/tmp/multiop_open_wait_pipe.6843 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 6 (23s) == replay-dual test 8: replay of resent request ========== 04:49:43 (1713343783) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre fail_loc=0x119 fail_loc=0 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: 
oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 8 (33s) == replay-dual test 9: resending a replayed create ======= 04:50:16 (1713343816) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre fail_loc=0x80000119 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec fail_loc=0 PASS 9 (28s) == replay-dual test 10: resending a replayed unlink ====== 04:50:44 (1713343844) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210560 3200 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre fail_loc=0x80000119 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 
8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec fail_loc=0 PASS 10 (28s) == replay-dual test 11: both clients timeout during replay ========================================================== 04:51:12 (1713343872) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre fail_loc=0x0119 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 7 sec fail_loc=0 PASS 11 (26s) == replay-dual test 12: open resend timeout ============== 04:51:38 (1713343898) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre multiop /mnt/lustre/f12.replay-dual vmo_c TMPPIPE=/tmp/multiop_open_wait_pipe.6843 fail_loc=0x80000302 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 
/mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 fail_loc=0 /mnt/lustre/f12.replay-dual /mnt/lustre/f12.replay-dual has type file OK PASS 12 (27s) == replay-dual test 13: close resend timeout ============= 04:52:05 (1713343925) multiop /mnt/lustre/f13.replay-dual vmo_c TMPPIPE=/tmp/multiop_open_wait_pipe.6843 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre fail_loc=0x80000115 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 fail_loc=0 /mnt/lustre/f13.replay-dual /mnt/lustre/f13.replay-dual has type file OK PASS 13 (21s) SKIP: replay-dual test_14b skipping ALWAYS excluded test 14b == replay-dual test 15a: timeout waiting for lost client during replay, 1 client completes ========================================================== 04:52:26 (1713343946) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre total: 25 open/close in 0.12 seconds: 202.19 ops/second total: 1 open/close in 0.01 seconds: 176.02 ops/second Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to 
oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec - unlinked 0 (time 1713344041 ; total 0 ; last 0) total: 25 unlinks in 0 seconds: inf unlinks/second Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 PASS 15a (96s) == replay-dual test 15c: remove multiple OST orphans ===== 04:54:02 (1713344042) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 PASS 15c (107s) == replay-dual test 16: fail MDS during recovery (3571) == 04:55:49 (1713344149) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 
1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre total: 25 open/close in 0.22 seconds: 114.74 ops/second total: 1 open/close in 0.01 seconds: 110.49 ops/second Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec - unlinked 0 (time 1713344294 ; total 0 ; last 0) total: 25 unlinks in 0 seconds: inf unlinks/second Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 PASS 16 (146s) == replay-dual test 17: fail OST during recovery (3571) == 04:58:15 (1713344295) total: 25 open/close in 0.21 seconds: 117.80 ops/second total: 1 open/close in 0.03 seconds: 39.46 ops/second UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing ost1 on oleg150-server Stopping /mnt/lustre-ost1 (opts:) on oleg150-server reboot facets: ost1 Failover ost1 to oleg150-server mount facets: ost1 Starting ost1: 
-o localrecov lustre-ost1/ost1 /mnt/lustre-ost1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-OST0000 Failing ost1 on oleg150-server Stopping /mnt/lustre-ost1 (opts:) on oleg150-server reboot facets: ost1 Failover ost1 to oleg150-server mount facets: ost1 Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-OST0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec - unlinked 0 (time 1713344417 ; total 0 ; last 0) total: 25 unlinks in 1 seconds: 25.000000 unlinks/second Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 PASS 17 (125s) == replay-dual test 18: ldlm_handle_enqueue succeeds on evicted export (3822) ========================================================== 05:00:20 (1713344420) debug=+dlmtrace using seed 2341973653 running for 500 iterations total: 500 stats in 0 seconds: inf stats/second fail_loc=0x8000030b ldlm.namespaces.MGC192.168.201.150@tcp.early_lock_cancel=0 ldlm.namespaces.lustre-MDT0000-mdc-ffff8800aa954800.early_lock_cancel=0 ldlm.namespaces.lustre-MDT0000-mdc-ffff8800ab983800.early_lock_cancel=0 ldlm.namespaces.lustre-OST0000-osc-ffff8800aa954800.early_lock_cancel=0 ldlm.namespaces.lustre-OST0000-osc-ffff8800ab983800.early_lock_cancel=0 ldlm.namespaces.lustre-OST0001-osc-ffff8800aa954800.early_lock_cancel=0 ldlm.namespaces.lustre-OST0001-osc-ffff8800ab983800.early_lock_cancel=0 fail_loc=0x80000305 Error in opening file "/mnt/lustre2/d18.replay-dual/f18.replay-dual"(flags=O_RDONLY) 2: No such file or directory ldlm.namespaces.MGC192.168.201.150@tcp.early_lock_cancel=1 
ldlm.namespaces.lustre-MDT0000-mdc-ffff8800aa954800.early_lock_cancel=1 ldlm.namespaces.lustre-MDT0000-mdc-ffff8800ab983800.early_lock_cancel=1 ldlm.namespaces.lustre-OST0000-osc-ffff8800aa954800.early_lock_cancel=1 ldlm.namespaces.lustre-OST0000-osc-ffff8800ab983800.early_lock_cancel=1 ldlm.namespaces.lustre-OST0001-osc-ffff8800aa954800.early_lock_cancel=1 ldlm.namespaces.lustre-OST0001-osc-ffff8800ab983800.early_lock_cancel=1 fail_loc=0 fail_loc=0 PASS 18 (46s) == replay-dual test 19: resend of open request =========== 05:01:06 (1713344466) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3328 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre fail_loc=0x157 - open/close 0 (time 1713344545.52 total 77.05 last 0.00) total: 1 open/close in 77.05 seconds: 0.01 ops/second fail_loc=0 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 19 (102s) == replay-dual test 20: recovery time is not increasing == 05:02:48 (1713344568) UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210560 3200 2205312 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on 
oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3766272 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7532544 1% /mnt/lustre Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 PASS 20 (318s) == replay-dual test 21a: commit on sharing =============== 05:08:06 (1713344886) mdt.lustre-MDT0000.commit_on_sharing=1 Replay barrier on lustre-MDT0000 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug 
-1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 Starting client: oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre2 mdt.lustre-MDT0000.commit_on_sharing=0 PASS 21a (155s) SKIP: replay-dual test_21b skipping SLOW test 21b == replay-dual test 22a: c1 lfs mkdir -i 1 dir1, M1 drop reply & fail, c2 mkdir dir1/dir ========================================================== 05:10:41 (1713345041) SKIP: replay-dual test_22a needs >= 2 MDTs SKIP 22a (1s) == replay-dual test 22b: c1 lfs mkdir -i 1 d1, M1 drop reply & fail M0/M1, c2 mkdir d1/dir ========================================================== 05:10:42 (1713345042) SKIP: replay-dual test_22b needs >= 2 MDTs SKIP 22b (1s) == replay-dual test 22c: c1 lfs mkdir -i 1 d1, M1 drop update & fail M1, c2 mkdir d1/dir ========================================================== 05:10:43 (1713345043) SKIP: replay-dual test_22c needs >= 2 MDTs SKIP 22c (1s) == replay-dual test 22d: c1 lfs mkdir -i 1 d1, M1 drop update & fail M0/M1,c2 mkdir d1/dir ========================================================== 05:10:44 (1713345044) SKIP: replay-dual test_22d needs >= 2 MDTs SKIP 22d (0s) == replay-dual test 23a: c1 rmdir d1, M1 drop reply and fail, client2 mkdir d1 ========================================================== 05:10:45 (1713345045) SKIP: replay-dual test_23a needs >= 2 MDTs SKIP 23a (0s) == replay-dual test 23b: c1 rmdir d1, M1 drop reply and fail M0/M1, c2 mkdir d1 ========================================================== 05:10:45 (1713345045) SKIP: replay-dual test_23b needs >= 2 MDTs SKIP 23b (1s) == replay-dual test 23c: c1 rmdir d1, M0 drop update reply and fail M0, c2 mkdir d1 ========================================================== 05:10:46 (1713345046) SKIP: replay-dual test_23c needs >= 2 MDTs SKIP 23c (1s) == replay-dual test 23d: c1 rmdir d1, M0 drop update reply and fail M0/M1, c2 mkdir d1 
========================================================== 05:10:47 (1713345047) SKIP: replay-dual test_23d needs >= 2 MDTs SKIP 23d (1s) == replay-dual test 24: reconstruct on non-existing object ========================================================== 05:10:48 (1713345048) fail_loc=0x119 fail_loc=0 truncate: cannot truncate '/mnt/lustre/f24.replay-dual' to length 100: No such file or directory PASS 24 (79s) == replay-dual test 25: replay|resend ==================== 05:12:07 (1713345127) 1+0 records in 1+0 records out 512 bytes (512 B) copied, 0.00272452 s, 188 kB/s fail_loc=0x304 fail_loc=0x80000325 Failing ost1 on oleg150-server Stopping /mnt/lustre-ost1 (opts:) on oleg150-server reboot facets: ost1 Failover ost1 to oleg150-server mount facets: ost1 Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-OST0000 oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec /home/green/git/lustre-release/lustre/tests/test-framework.sh: line 5977: 14164 Terminated LUSTRE="/home/green/git/lustre-release/lustre" bash -c "multiop /mnt/lustre2/f25.replay-dual Ow512" fail_loc=0 PASS 25 (21s) == replay-dual test 26: dbench and tar with mds failover ========================================================== 05:12:28 (1713345148) Starting client oleg150-client.virtnet: -o user_xattr,flock oleg150-server@tcp:/lustre /mnt/lustre Started clients oleg150-client.virtnet: 192.168.201.150@tcp:/lustre on /mnt/lustre type lustre (rw,checksum,flock,user_xattr,lruresize,lazystatfs,nouser_fid2path,verbose,noencrypt) Started tar loop with pid 15749 Started dbench loop with 15750 looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt 
'/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Wed Apr 17 05:12:30 EDT 2024 waiting for dbench pid 15775 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs failed to create barrier semaphore 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients 1 232 8.88 MB/sec warmup 1 sec latency 27.299 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 3200 2205440 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 3072 3690496 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 3072 3659776 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 6144 7350272 1% /mnt/lustre 1 436 7.69 MB/sec warmup 2 sec latency 36.594 ms 1 695 7.06 MB/sec warmup 3 sec latency 70.735 ms 1 893 5.42 MB/sec warmup 4 sec latency 64.570 ms test_26 fail mds1 1 times Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server 1 1200 4.61 MB/sec warmup 5 sec latency 45.613 ms reboot facets: mds1 1 1225 3.85 MB/sec warmup 6 sec latency 918.495 ms 1 1225 3.30 MB/sec warmup 7 sec latency 1918.632 ms 1 1225 2.88 MB/sec warmup 8 sec latency 2918.822 ms 1 1225 2.56 MB/sec warmup 9 sec latency 3919.023 ms 1 1225 2.31 MB/sec warmup 10 sec latency 4919.177 ms 1 1225 2.10 MB/sec warmup 11 sec latency 5919.347 ms 1 1225 1.92 MB/sec warmup 12 sec latency 6919.501 ms 1 1225 1.77 MB/sec warmup 13 sec latency 7919.643 ms 1 1225 1.65 MB/sec warmup 14 sec latency 8919.820 ms 1 1225 1.54 MB/sec warmup 15 sec latency 9919.975 ms Failover mds1 to oleg150-server 1 1225 1.44 MB/sec warmup 16 sec latency 10920.142 ms mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 1 1225 1.36 MB/sec warmup 17 sec latency 11920.380 ms 1 1225 1.28 MB/sec warmup 18 sec latency 12921.456 ms 1 1225 1.21 MB/sec warmup 19 sec latency 13921.613 ms oleg150-server: 
oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 1 1225 0.00 MB/sec execute 1 sec latency 15921.940 ms 1 1225 0.00 MB/sec execute 2 sec latency 16922.118 ms 1 1225 0.00 MB/sec execute 3 sec latency 17922.284 ms 1 1225 0.00 MB/sec execute 4 sec latency 18922.524 ms 1 1225 0.00 MB/sec execute 5 sec latency 19922.710 ms 1 1225 0.00 MB/sec execute 6 sec latency 20922.992 ms 1 1225 0.00 MB/sec execute 7 sec latency 21926.674 ms 1 1225 0.00 MB/sec execute 8 sec latency 22926.871 ms 1 1225 0.00 MB/sec execute 9 sec latency 23927.027 ms 1 1225 0.00 MB/sec execute 10 sec latency 24927.176 ms 1 1225 0.00 MB/sec execute 11 sec latency 25927.329 ms 1 1225 0.00 MB/sec execute 12 sec latency 26927.530 ms 1 1225 0.00 MB/sec execute 13 sec latency 27927.673 ms 1 1225 0.00 MB/sec execute 14 sec latency 28927.788 ms 1 1462 0.20 MB/sec execute 15 sec latency 29225.640 ms oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 1705 0.20 MB/sec execute 16 sec latency 51.234 ms 1 1984 0.20 MB/sec execute 17 sec latency 69.614 ms 1 2296 0.26 MB/sec execute 18 sec latency 17.973 ms 1 2478 0.30 MB/sec execute 19 sec latency 70.798 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 4736 2203904 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 15360 3704832 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 40960 3607552 2% /mnt/lustre[OST:1] filesystem_summary: 7542784 56320 7312384 1% /mnt/lustre 1 2919 0.52 MB/sec execute 20 sec latency 59.076 ms 1 3339 0.59 MB/sec execute 21 sec latency 61.946 ms 1 3736 0.73 MB/sec execute 22 sec latency 17.883 ms test_26 fail mds1 2 times 1 3956 0.74 MB/sec execute 23 sec latency 23.535 ms Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server 1 4111 0.71 MB/sec execute 24 sec 
latency 356.609 ms reboot facets: mds1 1 4111 0.68 MB/sec execute 25 sec latency 1356.763 ms 1 4111 0.65 MB/sec execute 26 sec latency 2356.935 ms 1 4111 0.63 MB/sec execute 27 sec latency 3357.113 ms 1 4111 0.61 MB/sec execute 28 sec latency 4357.319 ms 1 4111 0.59 MB/sec execute 29 sec latency 5357.459 ms 1 4111 0.57 MB/sec execute 30 sec latency 6357.630 ms 1 4111 0.55 MB/sec execute 31 sec latency 7357.824 ms 1 4111 0.53 MB/sec execute 32 sec latency 8358.030 ms 1 4111 0.52 MB/sec execute 33 sec latency 9358.196 ms 1 4111 0.50 MB/sec execute 34 sec latency 10358.364 ms Failover mds1 to oleg150-server mount facets: mds1 1 4111 0.49 MB/sec execute 35 sec latency 11358.580 ms Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 1 4111 0.47 MB/sec execute 36 sec latency 12358.767 ms 1 4111 0.46 MB/sec execute 37 sec latency 13358.940 ms 1 4111 0.45 MB/sec execute 38 sec latency 14359.154 ms oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 1 4111 0.44 MB/sec execute 39 sec latency 15359.291 ms Started lustre-MDT0000 1 4111 0.43 MB/sec execute 40 sec latency 16359.536 ms 1 4270 0.42 MB/sec execute 41 sec latency 16658.414 ms oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 4452 0.42 MB/sec execute 42 sec latency 89.561 ms 1 4710 0.44 MB/sec execute 43 sec latency 38.732 ms 1 5029 0.50 MB/sec execute 44 sec latency 19.940 ms 1 5211 0.49 MB/sec execute 45 sec latency 76.563 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210560 6016 2202496 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 17408 3729408 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 48128 3690496 2% /mnt/lustre[OST:1] filesystem_summary: 7542784 65536 7419904 1% /mnt/lustre 1 5442 0.49 MB/sec execute 46 sec latency 20.113 ms 1 5660 0.48 MB/sec 
execute 47 sec latency 64.195 ms 1 5949 0.50 MB/sec execute 48 sec latency 43.660 ms test_26 fail mds1 3 times 1 6320 0.57 MB/sec execute 49 sec latency 54.588 ms Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server 1 6731 0.60 MB/sec execute 50 sec latency 170.343 ms reboot facets: mds1 1 6731 0.59 MB/sec execute 51 sec latency 1170.630 ms 1 6731 0.58 MB/sec execute 52 sec latency 2170.804 ms 1 6731 0.57 MB/sec execute 53 sec latency 3171.063 ms 1 6731 0.56 MB/sec execute 54 sec latency 4171.225 ms 1 6731 0.55 MB/sec execute 55 sec latency 5171.407 ms 1 6731 0.54 MB/sec execute 56 sec latency 6171.605 ms 1 6731 0.53 MB/sec execute 57 sec latency 7171.831 ms 1 6731 0.52 MB/sec execute 58 sec latency 8172.042 ms 1 6731 0.51 MB/sec execute 59 sec latency 9172.206 ms 1 6731 0.50 MB/sec execute 60 sec latency 10172.371 ms Failover mds1 to oleg150-server mount facets: mds1 1 6731 0.49 MB/sec execute 61 sec latency 11172.578 ms Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 1 6731 0.49 MB/sec execute 62 sec latency 12172.753 ms 1 6731 0.48 MB/sec execute 63 sec latency 13172.941 ms 1 6731 0.47 MB/sec execute 64 sec latency 14173.102 ms 1 6731 0.46 MB/sec execute 65 sec latency 15173.267 ms 1 6731 0.46 MB/sec execute 66 sec latency 16173.389 ms oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 1 6731 0.45 MB/sec execute 67 sec latency 17173.519 ms Started lustre-MDT0000 1 6731 0.44 MB/sec execute 68 sec latency 18173.778 ms 1 6731 0.44 MB/sec execute 69 sec latency 19173.943 ms 1 6731 0.43 MB/sec execute 70 sec latency 20174.407 ms 1 6731 0.42 MB/sec execute 71 sec latency 21174.704 ms 1 6731 0.42 MB/sec execute 72 sec latency 22174.927 ms 1 6731 0.41 MB/sec execute 73 sec latency 23175.134 ms 1 6731 0.41 MB/sec execute 74 sec latency 24175.305 ms 1 6731 0.40 MB/sec execute 75 sec latency 25175.529 ms 1 6731 0.40 MB/sec execute 76 sec 
latency 26175.698 ms 1 6731 0.39 MB/sec execute 77 sec latency 27175.940 ms 1 6731 0.39 MB/sec execute 78 sec latency 28176.071 ms 1 6731 0.38 MB/sec execute 79 sec latency 29176.230 ms 1 7077 0.43 MB/sec execute 80 sec latency 29344.634 ms oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 7352 0.44 MB/sec execute 81 sec latency 21.682 ms 1 7531 0.44 MB/sec execute 82 sec latency 27.034 ms 1 7778 0.44 MB/sec execute 83 sec latency 20.418 ms 1 7964 0.44 MB/sec execute 84 sec latency 88.096 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 7552 2201088 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 20480 3719168 1% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 52224 3700736 2% /mnt/lustre[OST:1] filesystem_summary: 7542784 72704 7419904 1% /mnt/lustre 1 8209 0.45 MB/sec execute 85 sec latency 53.154 ms 1 8544 0.48 MB/sec execute 86 sec latency 18.443 ms 1 8803 0.48 MB/sec execute 87 sec latency 49.784 ms test_26 fail mds1 4 times Failing mds1 on oleg150-server 1 9135 0.47 MB/sec execute 88 sec latency 35.553 ms Stopping /mnt/lustre-mds1 (opts:) on oleg150-server 1 9307 0.47 MB/sec execute 89 sec latency 551.816 ms reboot facets: mds1 1 9307 0.47 MB/sec execute 90 sec latency 1552.009 ms 1 9307 0.46 MB/sec execute 91 sec latency 2552.217 ms 1 9307 0.46 MB/sec execute 92 sec latency 3552.370 ms 1 9307 0.45 MB/sec execute 93 sec latency 4552.581 ms 1 9307 0.45 MB/sec execute 94 sec latency 5552.755 ms 1 9307 0.44 MB/sec execute 95 sec latency 6552.907 ms 1 9307 0.44 MB/sec execute 96 sec latency 7553.068 ms 1 9307 0.43 MB/sec execute 97 sec latency 8553.218 ms 1 9307 0.43 MB/sec execute 98 sec latency 9553.385 ms 1 9307 0.42 MB/sec execute 99 sec latency 10553.547 ms Failover mds1 to oleg150-server mount facets: mds1 1 cleanup 100 sec Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 1 cleanup 101 
sec 1 cleanup 102 sec 1 cleanup 103 sec 1 cleanup 104 sec oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 1 cleanup 105 sec pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 0 cleanup 106 sec Operation Count AvgLat MaxLat ---------------------------------------- NTCreateX 1436 42.138 29344.618 Close 1063 1.962 6.312 Rename 55 13.204 21.728 Unlink 262 4.593 9.833 Qpathinfo 1323 37.269 29225.628 Qfileinfo 227 0.427 2.895 Qfsinfo 209 0.449 2.510 Sfileinfo 105 6.582 13.629 Find 484 1.237 35.968 WriteX 676 2.043 10.000 ReadX 2143 0.112 33.596 LockX 4 2.078 2.672 UnlockX 4 1.608 1.960 Flush 92 34.399 89.543 Throughput 0.423404 MB/sec 1 clients 1 procs max_latency=29344.634 ms stopping dbench on /mnt/lustre at Wed Apr 17 05:14:36 EDT 2024 with return code 0 clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished looking for dbench program /usr/bin/dbench found dbench client file /usr/share/dbench/client.txt '/usr/share/dbench/client.txt' -> 'client.txt' running 'dbench 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100' on /mnt/lustre at Wed Apr 17 05:14:37 EDT 2024 waiting for dbench pid 19846 dbench version 4.00 - Copyright Andrew Tridgell 1999-2004 Running for 100 seconds with load 'client.txt' and minimum warmup 20 secs 0 of 1 processes prepared for launch 0 sec 1 of 1 processes prepared for launch 0 sec releasing clients oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 255 9.66 MB/sec warmup 1 sec latency 22.859 ms 1 459 8.05 MB/sec warmup 2 sec latency 40.989 ms 1 728 7.10 MB/sec warmup 3 sec latency 103.509 ms 1 983 5.48 MB/sec warmup 4 sec latency 55.403 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 9472 2199168 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 36864 3443712 
2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 17408 3698688 1% /mnt/lustre[OST:1] filesystem_summary: 7542784 54272 7142400 1% /mnt/lustre 1 1410 5.20 MB/sec warmup 5 sec latency 21.264 ms 1 1695 4.37 MB/sec warmup 6 sec latency 47.895 ms 1 2012 3.79 MB/sec warmup 7 sec latency 40.925 ms test_26 fail mds1 5 times 1 2347 3.49 MB/sec warmup 8 sec latency 66.763 ms Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server 1 2490 3.19 MB/sec warmup 9 sec latency 269.661 ms 1 2490 2.87 MB/sec warmup 10 sec latency 1269.865 ms 1 2490 2.61 MB/sec warmup 11 sec latency 2270.041 ms 1 2490 2.39 MB/sec warmup 12 sec latency 3270.196 ms 1 2490 2.21 MB/sec warmup 13 sec latency 4270.544 ms 1 2490 2.05 MB/sec warmup 14 sec latency 5270.708 ms 1 2490 1.91 MB/sec warmup 15 sec latency 6270.851 ms reboot facets: mds1 1 2490 1.79 MB/sec warmup 16 sec latency 7271.021 ms 1 2490 1.69 MB/sec warmup 17 sec latency 8271.168 ms 1 2490 1.60 MB/sec warmup 18 sec latency 9271.329 ms 1 2490 1.51 MB/sec warmup 19 sec latency 10271.488 ms 1 2490 0.00 MB/sec execute 1 sec latency 12271.845 ms 1 2490 0.00 MB/sec execute 2 sec latency 13272.035 ms 1 2490 0.00 MB/sec execute 3 sec latency 14272.198 ms 1 2490 0.00 MB/sec execute 4 sec latency 15272.355 ms 1 2490 0.00 MB/sec execute 5 sec latency 16273.169 ms Failover mds1 to oleg150-server mount facets: mds1 1 2490 0.00 MB/sec execute 6 sec latency 17273.320 ms Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 1 2490 0.00 MB/sec execute 7 sec latency 18273.489 ms 1 2490 0.00 MB/sec execute 8 sec latency 19273.639 ms 1 2490 0.00 MB/sec execute 9 sec latency 20274.283 ms 1 2490 0.00 MB/sec execute 10 sec latency 21274.450 ms oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 1 2490 0.00 MB/sec execute 11 sec latency 22275.310 ms pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 1 2490 0.00 MB/sec execute 12 sec latency 23275.456 ms 1 2490 0.00 
MB/sec execute 13 sec latency 24275.730 ms 1 2490 0.00 MB/sec execute 14 sec latency 25275.886 ms 1 2490 0.00 MB/sec execute 15 sec latency 26276.034 ms 1 2490 0.00 MB/sec execute 16 sec latency 27276.172 ms 1 2979 0.28 MB/sec execute 17 sec latency 27326.979 ms oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 3528 0.54 MB/sec execute 18 sec latency 57.313 ms 1 3856 0.58 MB/sec execute 19 sec latency 18.134 ms 1 4097 0.57 MB/sec execute 20 sec latency 19.333 ms 1 4360 0.56 MB/sec execute 21 sec latency 78.131 ms UUID 1K-blocks Used Available Use% Mounted on lustre-MDT0000_UUID 2210688 10880 2197760 1% /mnt/lustre[MDT:0] lustre-OST0000_UUID 3771392 47104 3571712 2% /mnt/lustre[OST:0] lustre-OST0001_UUID 3771392 39936 3662848 2% /mnt/lustre[OST:1] filesystem_summary: 7542784 87040 7234560 2% /mnt/lustre 1 4618 0.60 MB/sec execute 22 sec latency 51.878 ms 1 5023 0.71 MB/sec execute 23 sec latency 14.293 ms 1 5346 0.70 MB/sec execute 24 sec latency 42.207 ms test_26 fail mds1 6 times Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server 1 5712 0.68 MB/sec execute 25 sec latency 38.892 ms reboot facets: mds1 1 5771 0.66 MB/sec execute 26 sec latency 809.853 ms 1 5771 0.64 MB/sec execute 27 sec latency 1810.602 ms 1 5771 0.61 MB/sec execute 28 sec latency 2810.788 ms 1 5771 0.59 MB/sec execute 29 sec latency 3810.955 ms 1 5771 0.57 MB/sec execute 30 sec latency 4811.112 ms 1 5771 0.55 MB/sec execute 31 sec latency 5811.397 ms 1 5771 0.54 MB/sec execute 32 sec latency 6811.560 ms 1 5771 0.52 MB/sec execute 33 sec latency 7811.846 ms 1 5771 0.50 MB/sec execute 34 sec latency 8812.171 ms 1 5771 0.49 MB/sec execute 35 sec latency 9812.390 ms Failover mds1 to oleg150-server 1 5771 0.48 MB/sec execute 36 sec latency 10813.110 ms mount facets: mds1 1 5771 0.46 MB/sec execute 37 sec latency 11813.386 ms Starting mds1: -o 
localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 1 5771 0.45 MB/sec execute 38 sec latency 12813.562 ms 1 5771 0.44 MB/sec execute 39 sec latency 13813.796 ms 1 5771 0.43 MB/sec execute 40 sec latency 14814.331 ms 1 5771 0.42 MB/sec execute 41 sec latency 15814.493 ms 1 5771 0.41 MB/sec execute 42 sec latency 16814.670 ms 1 5771 0.40 MB/sec execute 43 sec latency 17814.902 ms oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 1 5771 0.39 MB/sec execute 44 sec latency 18815.089 ms 1 5771 0.38 MB/sec execute 45 sec latency 19815.269 ms 1 5771 0.37 MB/sec execute 46 sec latency 20815.479 ms 1 5771 0.37 MB/sec execute 47 sec latency 21815.832 ms 1 5771 0.36 MB/sec execute 48 sec latency 22816.584 ms 1 5771 0.35 MB/sec execute 49 sec latency 23816.795 ms 1 5771 0.34 MB/sec execute 50 sec latency 24817.260 ms 1 5771 0.34 MB/sec execute 51 sec latency 25817.615 ms 1 5771 0.33 MB/sec execute 52 sec latency 26817.810 ms 1 5771 0.32 MB/sec execute 53 sec latency 27818.214 ms 1 5771 0.32 MB/sec execute 54 sec latency 28818.404 ms 1 5771 0.31 MB/sec execute 55 sec latency 29818.600 ms 1 5771 0.31 MB/sec execute 56 sec latency 30818.773 ms 1 5771 0.30 MB/sec execute 57 sec latency 31819.063 ms 1 5771 0.30 MB/sec execute 58 sec latency 32819.207 ms 1 5771 0.29 MB/sec execute 59 sec latency 33819.408 ms 1 5773 0.29 MB/sec execute 60 sec latency 34794.053 ms 1 5995 0.30 MB/sec execute 61 sec latency 79.076 ms oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec 1 6436 0.36 MB/sec execute 62 sec latency 53.691 ms 1 6831 0.41 MB/sec execute 63 sec latency 73.974 ms 1 7228 0.46 MB/sec execute 64 sec latency 31.789 ms 1 7451 0.47 MB/sec execute 65 sec latency 18.479 ms 1 7630 0.46 MB/sec execute 66 sec latency 25.502 ms 1 7824 0.46 MB/sec execute 67 sec 
latency 104.272 ms 1 7985 0.46 MB/sec execute 68 sec latency 82.536 ms 1 8245 0.47 MB/sec execute 69 sec latency 65.406 ms 1 8574 0.51 MB/sec execute 70 sec latency 16.862 ms 1 8794 0.50 MB/sec execute 71 sec latency 93.739 ms 1 9071 0.50 MB/sec execute 72 sec latency 63.408 ms 1 9449 0.51 MB/sec execute 73 sec latency 14.967 ms 1 9696 0.52 MB/sec execute 74 sec latency 59.616 ms 1 10112 0.57 MB/sec execute 75 sec latency 67.156 ms 1 10644 0.63 MB/sec execute 76 sec latency 57.015 ms 1 10922 0.64 MB/sec execute 77 sec latency 16.642 ms 1 11181 0.64 MB/sec execute 78 sec latency 16.082 ms 1 11454 0.63 MB/sec execute 79 sec latency 67.451 ms 1 11731 0.64 MB/sec execute 80 sec latency 54.922 ms 1 12179 0.68 MB/sec execute 81 sec latency 57.536 ms 1 12514 0.67 MB/sec execute 82 sec latency 43.269 ms 1 12905 0.67 MB/sec execute 83 sec latency 51.539 ms 1 13248 0.69 MB/sec execute 84 sec latency 54.140 ms 1 13846 0.74 MB/sec execute 85 sec latency 54.115 ms 1 14304 0.78 MB/sec execute 86 sec latency 29.242 ms 1 14546 0.79 MB/sec execute 87 sec latency 21.073 ms dbench killed by signal 15 stopping dbench on /mnt/lustre at Wed Apr 17 05:16:24 EDT 2024 with return code 0 19846 pts/0 S+ 0:00 dbench -c client.txt 1 -D /mnt/lustre2/d26.replay-dual/run_dbench -t 100 killed dbench main pid 19846 clean dbench files on /mnt/lustre /mnt/lustre /mnt/lustre removed 'client.txt' /mnt/lustre dbench successfully finished PASS 26 (238s) == replay-dual test 28: lock replay should be ordered: waiting after granted ========================================================== 05:16:26 (1713345386) 1+0 records in 1+0 records out 4096 bytes (4.1 kB) copied, 0.00234759 s, 1.7 MB/s fail_loc=0x80000324 fail_loc=0x32a Failing ost1 on oleg150-server Stopping /mnt/lustre-ost1 (opts:) on oleg150-server reboot facets: ost1 Failover ost1 to oleg150-server mount facets: ost1 Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1 oleg150-server: oleg150-server.virtnet: executing set_default_debug 
-1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-OST0000 1+0 records in 1+0 records out 4096 bytes (4.1 kB) copied, 0.00440312 s, 930 kB/s oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec PASS 28 (30s) == replay-dual test 29: replay vs update with the same xid ========================================================== 05:16:56 (1713345416) SKIP: replay-dual test_29 needs >= 2 MDTs SKIP 29 (1s) == replay-dual test 30: layout lock replay is not blocked on IO ========================================================== 05:16:57 (1713345417) 10+0 records in 10+0 records out 40960 bytes (41 kB) copied, 0.00650062 s, 6.3 MB/s 10+0 records in 10+0 records out 40960 bytes (41 kB) copied, 0.0053643 s, 7.6 MB/s fail_loc=0x32e fail_val=4 Failing mds1 on oleg150-server Stopping /mnt/lustre-mds1 (opts:) on oleg150-server reboot facets: mds1 Failover mds1 to oleg150-server mount facets: mds1 Starting mds1: -o localrecov lustre-mdt1/mdt1 /mnt/lustre-mds1 oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8 pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1 Started lustre-MDT0000 160+0 records in 160+0 records out 81920 bytes (82 kB) copied, 32.3791 s, 2.5 kB/s oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec PASS 30 (36s) == replay-dual test 31: deadlock on file_remove_privs and occupied mod rpc slots ========================================================== 05:17:33 (1713345453) Failing ost1 on oleg150-server Stopping /mnt/lustre-ost1 (opts:) on oleg150-server reboot facets: ost1 Creating to objid 3201 on ost lustre-OST0000... 
total: 32 open/close in 0.13 seconds: 252.34 ops/second
at_max=0
fail_loc=0x80001420
Failover ost1 to oleg150-server
mount facets: ost1
Starting ost1: -o localrecov lustre-ost1/ost1 /mnt/lustre-ost1
oleg150-server: oleg150-server.virtnet: executing set_default_debug -1 all 8
pdsh@oleg150-client: oleg150-server: ssh exited with exit code 1
Started lustre-OST0000
oleg150-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in IDLE FULL state after 0 sec
at_max=600
PASS 31 (25s)
== replay-dual test 32: gap in update llog shouldn't break recovery ========================================================== 05:17:58 (1713345478)
SKIP: replay-dual test_32 needs >= 2 MDTs
SKIP 32 (1s)
== replay-dual test 33: Check for OBD_INCOMPAT_MULTI_RPCS in last_rcvd after abort_recovery ========================================================== 05:17:59 (1713345479)
SKIP: replay-dual test_33 ldiskfs only test
SKIP 33 (1s)
== replay-dual test complete, duration 2068 sec ========== 05:18:00 (1713345480)
Stopping clients: oleg150-client.virtnet /mnt/lustre2 (opts:)
Stopping client oleg150-client.virtnet /mnt/lustre2 opts: