== conf-sanity test 35b: Continue reconnection retries, if the active server is busy ========================================================== 04:51:38 (1713343898)
start mds service on oleg315-server
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg315-server: oleg315-server.virtnet: executing set_default_debug -1 all 8
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 1
Started lustre-MDT0000
start mds service on oleg315-server
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg315-server: oleg315-server.virtnet: executing set_default_debug -1 all 8
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 1
Started lustre-MDT0001
oleg315-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
oleg315-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
start ost1 service on oleg315-server
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
oleg315-server: oleg315-server.virtnet: executing set_default_debug -1 all 8
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 1
Started lustre-OST0000
oleg315-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
mount lustre on /mnt/lustre.....
Starting client: oleg315-client.virtnet: -o user_xattr,flock oleg315-server@tcp:/lustre /mnt/lustre
debug=ha
conf-sanity.sh test_35b 2024-04-17 4h52m03s
Set up a fake failnode for the MDS
at_max=0
at_max=0
Injecting EBUSY on MDS
fail_loc=0x80000136
mdc.lustre-MDT0000-mdc-ffff8800b59e6800.stats=clear
mdc.lustre-MDT0001-mdc-ffff8800b59e6800.stats=clear
Creating a test file and stat it
File: '/mnt/lustre/d35b.conf-sanity/f35b.conf-sanity'
Size: 0 Blocks: 0 IO Block: 4194304 regular empty file
Device: 2c54f966h/743766374d Inode: 144115238826934274 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-04-17 04:52:46.000000000 -0400
Modify: 2024-04-17 04:52:46.000000000 -0400
Change: 2024-04-17 04:52:46.000000000 -0400
Birth: -
Stop injecting EBUSY on MDS
fail_loc=0
done
at_max=600
at_max=600
Debug log: 67 lines, 67 kept, 0 dropped, 0 bad.
umount lustre on /mnt/lustre.....
Stopping client oleg315-client.virtnet /mnt/lustre (opts:)
stop ost1 service on oleg315-server
Stopping /mnt/lustre-ost1 (opts:-f) on oleg315-server
stop mds service on oleg315-server
Stopping /mnt/lustre-mds1 (opts:-f) on oleg315-server
stop mds service on oleg315-server
Stopping /mnt/lustre-mds2 (opts:-f) on oleg315-server
LNET ready to unload
unloading modules on: 'oleg315-server'
oleg315-server: oleg315-server.virtnet: executing unload_modules_local
oleg315-server: LNET ready to unload
modules unloaded.
oleg315-server: tunefs.lustre: Unable to mount /dev/mapper/mds1_flakey: No such device
oleg315-server: Is the ldiskfs module available?
oleg315-server:
oleg315-server: tunefs.lustre FATAL: failed to write local files
oleg315-server: tunefs.lustre: exiting with 19 (No such device)
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 19
checking for existing Lustre data: found
Read previous values:
Target: lustre-MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x5 (MDT MGS )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: sys.timeout=20 mdt.identity_upcall=/home/green/git/lustre-release/lustre/utils/l_getidentity
Permanent disk data:
Target: lustre=MDT0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x105 (MDT MGS writeconf )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: sys.timeout=20 mdt.identity_upcall=/home/green/git/lustre-release/lustre/utils/l_getidentity
oleg315-server: tunefs.lustre: Unable to mount /dev/mapper/mds2_flakey: No such device
oleg315-server: Is the ldiskfs module available?
oleg315-server:
oleg315-server: tunefs.lustre FATAL: failed to write local files
oleg315-server: tunefs.lustre: exiting with 19 (No such device)
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 19
checking for existing Lustre data: found
Read previous values:
Target: lustre-MDT0001
Index: 1
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x1 (MDT )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=192.168.203.115@tcp sys.timeout=20 mdt.identity_upcall=/home/green/git/lustre-release/lustre/utils/l_getidentity
Permanent disk data:
Target: lustre=MDT0001
Index: 1
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x101 (MDT writeconf )
Persistent mount opts: user_xattr,errors=remount-ro
Parameters: mgsnode=192.168.203.115@tcp sys.timeout=20 mdt.identity_upcall=/home/green/git/lustre-release/lustre/utils/l_getidentity
oleg315-server: tunefs.lustre: Unable to mount /dev/mapper/ost1_flakey: No such device
oleg315-server: Is the ldiskfs module available?
oleg315-server:
oleg315-server: tunefs.lustre FATAL: failed to write local files
oleg315-server: tunefs.lustre: exiting with 19 (No such device)
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 19
checking for existing Lustre data: found
Read previous values:
Target: lustre-OST0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x2 (OST )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=192.168.203.115@tcp sys.timeout=20
Permanent disk data:
Target: lustre=OST0000
Index: 0
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x102 (OST writeconf )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=192.168.203.115@tcp sys.timeout=20
oleg315-server: tunefs.lustre: Unable to mount /dev/mapper/ost2_flakey: No such device
oleg315-server: Is the ldiskfs module available?
oleg315-server:
oleg315-server: tunefs.lustre FATAL: failed to write local files
oleg315-server: tunefs.lustre: exiting with 19 (No such device)
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 19
checking for existing Lustre data: found
Read previous values:
Target: lustre-OST0001
Index: 1
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x62 (OST first_time update )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=192.168.203.115@tcp sys.timeout=20
Permanent disk data:
Target: lustre=OST0001
Index: 1
Lustre FS: lustre
Mount type: ldiskfs
Flags: 0x162 (OST first_time update writeconf )
Persistent mount opts: ,errors=remount-ro
Parameters: mgsnode=192.168.203.115@tcp sys.timeout=20
tunefs failed, reformatting instead
Stopping clients: oleg315-client.virtnet /mnt/lustre (opts:-f)
Stopping clients: oleg315-client.virtnet /mnt/lustre2 (opts:-f)
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 2
oleg315-server: oleg315-server.virtnet: executing set_hostid
Loading modules from /home/green/git/lustre-release/lustre
detected 4 online CPUs by sysfs
Force libcfs to create 2 CPU partitions
../libcfs/libcfs/libcfs options: 'cpu_npartitions=2'
ptlrpc/ptlrpc options: 'lbug_on_grant_miscount=1'
quota/lquota options: 'hash_lqs_cur_bits=3'
loading modules on: 'oleg315-server'
oleg315-server: oleg315-server.virtnet: executing load_modules_local
oleg315-server: Loading modules from /home/green/git/lustre-release/lustre
oleg315-server: detected 4 online CPUs by sysfs
oleg315-server: Force libcfs to create 2 CPU partitions
oleg315-server: libkmod: kmod_module_get_holders: could not open '/sys/module/intel_rapl/holders': No such file or directory
oleg315-server: ptlrpc/ptlrpc options: 'lbug_on_grant_miscount=1'
oleg315-server: quota/lquota options: 'hash_lqs_cur_bits=3'
Formatting mgs, mds, osts
Format mds1: /dev/mapper/mds1_flakey
Format mds2: /dev/mapper/mds2_flakey
Format ost1: /dev/mapper/ost1_flakey
Format ost2: /dev/mapper/ost2_flakey
start mds service on oleg315-server
Starting mds1: -o localrecov /dev/mapper/mds1_flakey /mnt/lustre-mds1
oleg315-server: oleg315-server.virtnet: executing set_default_debug -1 all 8
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 1
Commit the device label on /dev/mapper/mds1_flakey
Started lustre-MDT0000
start mds service on oleg315-server
Starting mds2: -o localrecov /dev/mapper/mds2_flakey /mnt/lustre-mds2
oleg315-server: oleg315-server.virtnet: executing set_default_debug -1 all 8
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 1
Commit the device label on /dev/mapper/mds2_flakey
Started lustre-MDT0001
oleg315-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0000-mdc-*.mds_server_uuid
oleg315-client.virtnet: executing wait_import_state_mount FULL mdc.lustre-MDT0001-mdc-*.mds_server_uuid
start ost1 service on oleg315-server
Starting ost1: -o localrecov /dev/mapper/ost1_flakey /mnt/lustre-ost1
oleg315-server: oleg315-server.virtnet: executing set_default_debug -1 all 8
pdsh@oleg315-client: oleg315-server: ssh exited with exit code 1
Commit the device label on /dev/mapper/ost1_flakey
Started lustre-OST0000
oleg315-client.virtnet: executing wait_import_state_mount (FULL|IDLE) osc.lustre-OST0000-osc-[-0-9a-f]*.ost_server_uuid
oleg315-server: oleg315-server.virtnet: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid 40
oleg315-server: os[cp].lustre-OST0000-osc-MDT0000.ost_server_uuid in FULL state after 0 sec
oleg315-server: oleg315-server.virtnet: executing wait_import_state FULL os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid 40
oleg315-server: os[cp].lustre-OST0000-osc-MDT0001.ost_server_uuid in FULL state after 0 sec
stop ost1 service on oleg315-server
Stopping /mnt/lustre-ost1 (opts:-f) on oleg315-server
stop mds service on oleg315-server
Stopping /mnt/lustre-mds1 (opts:-f) on oleg315-server
stop mds service on oleg315-server
Stopping /mnt/lustre-mds2 (opts:-f) on oleg315-server