-----============= acceptance-small: insanity ============----- Fri Apr 19 08:49:55 EDT 2024
excepting tests:
=== insanity: start setup 08:49:57 (1713530997) ===
oleg455-client.virtnet: executing check_config_client /mnt/lustre
oleg455-client.virtnet: Checking config lustre mounted on /mnt/lustre
Checking servers environments
Checking clients oleg455-client.virtnet environments
Using TIMEOUT=20
osc.lustre-OST0000-osc-ffff8800b6703000.idle_timeout=debug
osc.lustre-OST0001-osc-ffff8800b6703000.idle_timeout=debug
disable quota as required
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
osd-ldiskfs.track_declares_assert=1
=== insanity: finish setup 08:50:04 (1713531004) ===
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 0: Fail all nodes, independently ======== 08:50:05 (1713531005)
Failing mds1 on oleg455-server
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
08:50:07 (1713531007) shut down
Failover mds1 to oleg455-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
08:50:20 (1713531020) targets are mounted
08:50:20 (1713531020) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing mds2 on oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
08:50:28 (1713531028) shut down
Failover mds2 to oleg455-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
08:50:42 (1713531042) targets are mounted
08:50:42 (1713531042) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing ost1 on oleg455-server
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
08:50:50 (1713531050) shut down
Failover ost1 to oleg455-server
mount facets: ost1
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
08:51:04 (1713531064) targets are mounted
08:51:04 (1713531064) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Failing ost2 on oleg455-server
Stopping /mnt/lustre-ost2 (opts:) on oleg455-server
08:51:10 (1713531070) shut down
Failover ost2 to oleg455-server
mount facets: ost2
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0001
08:51:24 (1713531084) targets are mounted
08:51:24 (1713531084) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 0 (85s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 1: MDS/MDS failure ====================== 08:51:32 (1713531092)
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
Failover mds1 to oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
Reintegrating MDS2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
Verify reintegration
PASS 1 (174s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 2: Second Failure Mode: MDS/OST Fri Apr 19 08:54:27 EDT 2024 ========================================================== 08:54:28 (1713531268)
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
Failover mds1 to oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
Failover mds2 to oleg455-server
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Reintegrating OST
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
Verify reintegration
PASS 2 (165s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 3: Third Failure Mode: MDS/CLIENT Fri Apr 19 08:57:14 EDT 2024 ========================================================== 08:57:16 (1713531436)
Verify Lustre filesystem is up and running
Failing mds1 on oleg455-server
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
08:57:18 (1713531438) shut down
Failover mds1 to oleg455-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
08:57:32 (1713531452) targets are mounted
08:57:32 (1713531452) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing mds2 on oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
08:57:39 (1713531459) shut down
Failover mds2 to oleg455-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
08:57:53 (1713531473) targets are mounted
08:57:53 (1713531473) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Test Lustre stability after MDS failover
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
PASS 3 (52s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 4: Fourth Failure Mode: OST/MDS Fri Apr 19 08:58:08 EDT 2024 ========================================================== 08:58:10 (1713531490)
Fourth Failure Mode: OST/MDS Fri Apr 19 08:58:10 EDT 2024
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Test Lustre stability after OST failure
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
Failover mds1 to oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
Failover mds2 to oleg455-server
Reintegrating OST
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
Test Lustre stability after MDS failover
PASS 4 (168s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 5: Fifth Failure Mode: OST/OST Fri Apr 19 09:00:59 EDT 2024 ========================================================== 09:01:01 (1713531661)
Fifth Failure Mode: OST/OST Fri Apr 19 09:01:01 EDT 2024
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Test Lustre stability after OST failure
Stopping /mnt/lustre-ost2 (opts:) on oleg455-server
Test Lustre stability after OST failure
Reintegrating OSTs
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0001
PASS 5 (63s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 6: Sixth Failure Mode: OST/CLIENT Fri Apr 19 09:02:05 EDT 2024 ========================================================== 09:02:06 (1713531726)
Sixth Failure Mode: OST/CLIENT Fri Apr 19 09:02:06 EDT 2024
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Test Lustre stability after OST failure
DFPIDA=23349
Failing CLIENTs
Request fail clients: , to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
DFPIDB=23702
Reintegrating OST/CLIENTs
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Verifying mount
PASS 6 (35s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 7: Seventh Failure Mode: CLIENT/MDS Fri Apr 19 09:02:42 EDT 2024 ========================================================== 09:02:43 (1713531763)
Seventh Failure Mode: CLIENT/MDS Fri Apr 19 09:02:44 EDT 2024
Verify Lustre filesystem is up and running
Part 1: Failing CLIENT
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg455-client: total 0
oleg455-client: -rw-r--r-- 1 root root 0 Apr 19 09:02 oleg455-client.virtnet_testfile
Wait 1 minutes
Verify Lustre filesystem is up and running
oleg455-client: rm: cannot remove '/mnt/lustre/d0.insanity/oleg455-client.virtnet_testfile': No such file or directory
pdsh@oleg455-client: oleg455-client: ssh exited with exit code 1
Failing mds1 on oleg455-server
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
09:03:51 (1713531831) shut down
Failover mds1 to oleg455-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
09:04:05 (1713531845) targets are mounted
09:04:05 (1713531845) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing mds2 on oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
09:04:12 (1713531852) shut down
Failover mds2 to oleg455-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
09:04:26 (1713531866) targets are mounted
09:04:26 (1713531866) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
oleg455-client: total 0
Reintegrating CLIENTs
wait 1 minutes
PASS 7 (172s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 8: Eighth Failure Mode: CLIENT/OST Fri Apr 19 09:05:36 EDT 2024 ========================================================== 09:05:38 (1713531938)
Eighth Failure Mode: CLIENT/OST Fri Apr 19 09:05:38 EDT 2024
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg455-client: total 0
oleg455-client: -rw-r--r-- 1 root root 0 Apr 19 09:05 oleg455-client.virtnet_testfile
Wait 1 minutes
Verify Lustre filesystem is up and running
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Test Lustre stability after OST failure
Reintegrating CLIENTs/OST
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Wait 1 minutes
PASS 8 (151s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 9: Ninth Failure Mode: CLIENT/CLIENT Fri Apr 19 09:08:10 EDT 2024 ========================================================== 09:08:11 (1713532091)
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg455-client: total 0
oleg455-client: -rw-r--r-- 1 root root 0 Apr 19 09:08 oleg455-client.virtnet_testfile
oleg455-client: -rw-r--r-- 1 root root 0 Apr 19 09:07 oleg455-client.virtnet_testfile2
Wait 1 minutes
Verify Lustre filesystem is up and running
Failing CLIENTs
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENTs failure
oleg455-client: total 0
oleg455-client: -rw-r--r-- 1 root root 0 Apr 19 09:09 oleg455-client.virtnet_testfile
oleg455-client: -rw-r--r-- 1 root root 0 Apr 19 09:07 oleg455-client.virtnet_testfile2
Reintegrating CLIENTs/CLIENTs
Wait 1 minutes
PASS 9 (132s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 10: Tenth Failure Mode: MDT0/OST/MDT1 Fri Apr 19 09:10:24 EDT 2024 ========================================================== 09:10:26 (1713532226)
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
Failover mds1 to oleg455-server
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Reintegrating OST
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
Failover mds2 to oleg455-server
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
Verify reintegration
PASS 10 (160s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 11: Eleventh Failure Mode: MDS0/CLIENT/MDS1 Fri Apr 19 09:13:07 EDT 2024 ========================================================== 09:13:08 (1713532388)
Verify Lustre filesystem is up and running
Failing mds1 on oleg455-server
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
09:13:10 (1713532390) shut down
Failover mds1 to oleg455-server
mount facets: mds1
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
09:13:24 (1713532404) targets are mounted
09:13:24 (1713532404) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
Test Lustre stability after MDS failover
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
Failing mds2 on oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
09:13:35 (1713532415) shut down
Failover mds2 to oleg455-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
09:13:49 (1713532429) targets are mounted
09:13:49 (1713532429) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 11 (52s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 12: Twelve Failure Mode: MDS0,MDS1/OST0, OST1/CLIENTS Fri Apr 19 09:14:01 EDT 2024 ========================================================== 09:14:02 (1713532442)
Verify Lustre filesystem is up and running
Failing mds1 on oleg455-server
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
Failing mds2 on oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
09:14:05 (1713532445) shut down
Failover mds1 to oleg455-server
mount facets: mds1
Failover mds2 to oleg455-server
mount facets: mds2
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
09:14:31 (1713532471) targets are mounted
09:14:31 (1713532471) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing ost1 on oleg455-server
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Failing ost2 on oleg455-server
Stopping /mnt/lustre-ost2 (opts:) on oleg455-server
09:14:42 (1713532482) shut down
Failover ost1 to oleg455-server
mount facets: ost1
Failover ost2 to oleg455-server
mount facets: ost2
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
seq.cli-lustre-OST0001-super.width=65536
seq.cli-lustre-OST0000-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Started lustre-OST0001
09:14:56 (1713532496) targets are mounted
09:14:56 (1713532496) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid,osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
PASS 12 (71s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 13: Thirteen Failure Mode: MDS0,MDS1/CLIENTS/OST0,OST1 Fri Apr 19 09:15:14 EDT 2024 ========================================================== 09:15:15 (1713532515)
Verify Lustre filesystem is up and running
Failing mds1 on oleg455-server
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
Failing mds2 on oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
09:15:18 (1713532518) shut down
Failover mds1 to oleg455-server
mount facets: mds1
Failover mds2 to oleg455-server
mount facets: mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0000
Started lustre-MDT0001
09:15:44 (1713532544) targets are mounted
09:15:44 (1713532544) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
Failing ost1 on oleg455-server
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Failing ost2 on oleg455-server
Stopping /mnt/lustre-ost2 (opts:) on oleg455-server
09:16:09 (1713532569) shut down
Failover ost1 to oleg455-server
mount facets: ost1
Failover ost2 to oleg455-server
mount facets: ost2
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
seq.cli-lustre-OST0001-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Started lustre-OST0001
09:16:24 (1713532584) targets are mounted
09:16:24 (1713532584) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid,osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
PASS 13 (80s)
debug_raw_pointers=0
debug_raw_pointers=0
debug_raw_pointers=Y
debug_raw_pointers=Y
== insanity test 14: Fourteen Failure Mode: OST0,OST1/CLIENTS/MDS0,MDS1 Fri Apr 19 09:16:35 EDT 2024 ========================================================== 09:16:37 (1713532597)
Verify Lustre filesystem is up and running
Failing ost1 on oleg455-server
Stopping /mnt/lustre-ost1 (opts:) on oleg455-server
Failing ost2 on oleg455-server
Stopping /mnt/lustre-ost2 (opts:) on oleg455-server
09:16:40 (1713532600) shut down
Failover ost1 to oleg455-server
mount facets: ost1
Failover ost2 to oleg455-server
mount facets: ost2
Starting ost2: -o localrecov /dev/mapper/ost2_flakey /mnt/lustre-ost2
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
seq.cli-lustre-OST0000-super.width=65536
seq.cli-lustre-OST0001-super.width=65536
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-OST0000
Started lustre-OST0001
09:16:54 (1713532614) targets are mounted
09:16:54 (1713532614) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid,osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid
osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
osc.lustre-OST0001-osc-[-0-9a-f]*.ost_server_uuid in FULL state after 0 sec
Failing 2 CLIENTS
Request fail clients: 2, to fail: 0, failed: 0
No clients failed!
Test Lustre stability after CLIENT failure
Reintegrating CLIENTS
Failing mds1 on oleg455-server
Stopping /mnt/lustre-mds1 (opts:) on oleg455-server
Failing mds2 on oleg455-server
Stopping /mnt/lustre-mds2 (opts:) on oleg455-server
09:17:12 (1713532632) shut down
Failover mds1 to oleg455-server
mount facets: mds1
Failover mds2 to oleg455-server
mount facets: mds2
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
oleg455-server: oleg455-server.virtnet: executing set_default_debug vfstrace rpctrace dlmtrace neterror ha config ioctl super lfsck all
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
pdsh@oleg455-client: oleg455-server: ssh exited with exit code 1
Started lustre-MDT0001
Started lustre-MDT0000
09:17:37 (1713532657) targets are mounted
09:17:37 (1713532657) facet_failover done
oleg455-client.virtnet: executing wait_import_state_mount (FULL|IDLE) mdc.lustre-MDT0000-mdc-*.mds_server_uuid,mdc.lustre-MDT0001-mdc-*.mds_server_uuid
mdc.lustre-MDT0000-mdc-*.mds_server_uuid in FULL state after 0 sec
mdc.lustre-MDT0001-mdc-*.mds_server_uuid in FULL state after 0 sec
PASS 14 (70s)
debug_raw_pointers=0
debug_raw_pointers=0
== insanity test complete, duration 1672 sec ============= 09:17:48 (1713532668)
=== insanity: start cleanup 09:17:48 (1713532668) ===
=== insanity: finish cleanup 09:17:48 (1713532668) ===