************************ crashinfo *************************
/exports/testreports/42104/testresults/sanityn-ldiskfs-DNE-centos7_x86_64-centos7_x86_64-retry1/oleg351-server-timeout-core (3.10.0-7.9-debug)

                          +==========================+
                          | *** Crashinfo v1.3.7 *** |
                          +==========================+

+++WARNING+++ PARTIAL DUMP with size(vmcore) < 25% size(RAM)

      KERNEL: /tmp/crash-anaysis.WmaC4/vmlinux  [TAINTED]
    DUMPFILE: /exports/testreports/42104/testresults/sanityn-ldiskfs-DNE-centos7_x86_64-centos7_x86_64-retry1/oleg351-server-timeout-core  [PARTIAL DUMP]
        CPUS: 4
        DATE: Thu Apr 18 10:17:17 EDT 2024
      UPTIME: 01:48:38
LOAD AVERAGE: 0.44, 0.41, 0.48
       TASKS: 235
    NODENAME: oleg351-server.virtnet
     RELEASE: 3.10.0-7.9-debug
     VERSION: #1 SMP Sat Mar 26 23:28:42 EDT 2022
     MACHINE: x86_64  (2399 Mhz)
      MEMORY: 4 GB
       PANIC: ""

                         +--------------------------+
>------------------------| Per-cpu Stacks ('bt -a') |------------------------<
                         +--------------------------+

      -- CPU#0 --
PID=0  CPU=0  CMD=swapper/0
  #-1  native_safe_halt+0xb, 449 bytes of data
   #0  default_idle+0x1e
   #1  default_enter_idle+0x45
   #2  cpuidle_enter_state+0x40
   #3  cpuidle_idle_call+0xd8
   #4  arch_cpu_idle+0xe
   #5  cpu_startup_entry+0x14a
   #6  rest_init+0x8e
   #7  start_kernel+0x456
   #8  x86_64_start_reservations+0x2a
   #9  x86_64_start_kernel+0x152
  #10  start_cpu+0x5

      -- CPU#1 --
PID=0  CPU=1  CMD=swapper/1
  #-1  native_safe_halt+0xb, 449 bytes of data
   #0  default_idle+0x1e
   #1  default_enter_idle+0x45
   #2  cpuidle_enter_state+0x40
   #3  cpuidle_idle_call+0xd8
   #4  arch_cpu_idle+0xe
   #5  cpu_startup_entry+0x14a
   #6  start_secondary+0x1eb
   #7  start_cpu+0x5

      -- CPU#2 --
PID=0  CPU=2  CMD=swapper/2
  #-1  native_safe_halt+0xb, 449 bytes of data
   #0  default_idle+0x1e
   #1  default_enter_idle+0x45
   #2  cpuidle_enter_state+0x40
   #3  cpuidle_idle_call+0xd8
   #4  arch_cpu_idle+0xe
   #5  cpu_startup_entry+0x14a
   #6  start_secondary+0x1eb
   #7  start_cpu+0x5

      -- CPU#3 --
PID=0  CPU=3  CMD=swapper/3
  #-1  native_safe_halt+0xb, 449 bytes of data
   #0  default_idle+0x1e
   #1  default_enter_idle+0x45
   #2  cpuidle_enter_state+0x40
   #3  cpuidle_idle_call+0xd8
   #4  arch_cpu_idle+0xe
   #5  cpu_startup_entry+0x14a
   #6  start_secondary+0x1eb
   #7  start_cpu+0x5

                       +--------------------------------+
>----------------------| How This Dump Has Been Created |---------------------<
                       +--------------------------------+
Cannot identify the specific condition that triggered vmcore

                               +---------------+
>------------------------------| Tasks Summary |------------------------------<
                               +---------------+
Number of Threads That Ran Recently
-----------------------------------
   last second     44
   last     5s     78
   last    60s    119

 ----- Total Numbers of Threads per State ------
  TASK_INTERRUPTIBLE    231
  TASK_RUNNING            1

+++WARNING+++ There are 3 threads running in their own namespaces
     Use 'taskinfo --ns' to get more details

                           +-----------------------+
>--------------------------| 5 Most Recent Threads |--------------------------<
                           +-----------------------+
  PID  CMD             Age     ARGS
-----  --------------  ------  ----------------------------
  774  kworker/1:1H     0 ms   (no user stack)
28304  kworker/1:2      0 ms   (no user stack)
   69  kworker/3:1      0 ms   (no user stack)
26449  ldlm_bl_05       0 ms   (no user stack)
 4257  l2arc_feed      20 ms   (no user stack)
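To follow up on the namespaces warning above, the next step is to reopen the
dump in crash and run the command the report itself suggests. A minimal session
sketch, assuming this report was generated by the pykdump "crashinfo" extension
and that its bundled module is named mpykdump.so (the module name is an
assumption; the kernel and dump paths are taken from the header above):

    $ crash /tmp/crash-anaysis.WmaC4/vmlinux \
        /exports/testreports/42104/testresults/sanityn-ldiskfs-DNE-centos7_x86_64-centos7_x86_64-retry1/oleg351-server-timeout-core
    crash> extend mpykdump.so   # load pykdump; provides crashinfo/taskinfo/xportshow
    crash> taskinfo --ns        # detail the 3 threads running in their own namespaces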
                          +------------------------+
>-------------------------| Memory Usage (kmem -i) |-------------------------<
                          +------------------------+
                 PAGES        TOTAL      PERCENTAGE
    TOTAL MEM   955067       3.6 GB         ----
         FREE   658351       2.5 GB   68% of TOTAL MEM
         USED   296716       1.1 GB   31% of TOTAL MEM
       SHARED    21041      82.2 MB    2% of TOTAL MEM
      BUFFERS    18486      72.2 MB    1% of TOTAL MEM
       CACHED   163510     638.7 MB   17% of TOTAL MEM
         SLAB    20123      78.6 MB    2% of TOTAL MEM

   TOTAL HUGE        0            0         ----
    HUGE FREE        0            0    0% of TOTAL HUGE

   TOTAL SWAP   262143      1024 MB         ----
    SWAP USED        0            0    0% of TOTAL SWAP
    SWAP FREE   262143      1024 MB  100% of TOTAL SWAP

 COMMIT LIMIT   739676       2.8 GB         ----
    COMMITTED    61160     238.9 MB    8% of TOTAL LIMIT

                      +-------------------------------+
>---------------------| Scheduler Runqueues (per CPU) |----------------------<
                      +-------------------------------+
  ---+ CPU=0 ----
     | CURRENT TASK , CMD=swapper/0
  ---+ CPU=1 ----
     | CURRENT TASK , CMD=swapper/1
  ---+ CPU=2 ----
     | CURRENT TASK , CMD=swapper/2
  ---+ CPU=3 ----
     | CURRENT TASK , CMD=swapper/3

                          +------------------------+
>-------------------------| Network Status Summary |-------------------------<
                          +------------------------+
TCP Connection Info
-------------------
    ESTABLISHED      6
         LISTEN      3
        NAGLE disabled (TCP_NODELAY):  5
        user_data set (NFS etc.):      4

  Unusual Situations:
    Doing Retransmission:  1   (run xportshow --retrans for details)

UDP Connection Info
-------------------
  2 UDP sockets, 0 in ESTABLISHED

Unix Connection Info
------------------------
    ESTABLISHED     26
          CLOSE     18
         LISTEN      8

Raw sockets info
--------------------
    ESTABLISHED      1

Interfaces Info
---------------
  How long ago (in seconds) interfaces transmitted/received?
      Name        RX         TX
      ----    ----------  ---------
        lo       n/a        6516.6
      eth0       n/a           0.0

 RSS_TOTAL=59796 pages, %mem= 0.9
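The TCP summary above flags one connection doing retransmission and points at
'xportshow --retrans'. In the same crash session (pykdump extension loaded as
sketched earlier), that would be:

    crash> xportshow --retrans   # identify the socket(s) currently retransmitting

The exact columns vary by pykdump version, but the output identifies which
connection is stuck.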
                                +------------+
>-------------------------------| Mounted FS |-------------------------------<
                                +------------+
     MOUNT            SUPERBLK       TYPE        DEVNAME      DIRNAME
ffff880138cca000  ffff880139940800  rootfs       rootfs       /
ffff880137668380  ffff880137678800  sysfs        sysfs        /sys
ffff880137668540  ffff880139944000  proc         proc         /proc
ffff880137668700  ffff880137678000  devtmpfs     devtmpfs     /dev
ffff8801376688c0  ffff88012a320000  securityfs   securityfs   /sys/kernel/security
ffff880137668a80  ffff880137679000  tmpfs        tmpfs        /dev/shm
ffff880137668c40  ffff88013771f000  devpts       devpts       /dev/pts
ffff880137668e00  ffff880137679800  tmpfs        tmpfs        /run
ffff880137668fc0  ffff88013767a000  tmpfs        tmpfs        /sys/fs/cgroup
ffff880137669180  ffff88013767a800  cgroup       cgroup       /sys/fs/cgroup/systemd
ffff880137669340  ffff88013767b000  pstore       pstore       /sys/fs/pstore
ffff880138ccb500  ffff8800b519c800  cgroup       cgroup       /sys/fs/cgroup/hugetlb
ffff880138ccb6c0  ffff8800b519c000  cgroup       cgroup       /sys/fs/cgroup/net_cls,net_prio
ffff880138ccb880  ffff8800b519b800  cgroup       cgroup       /sys/fs/cgroup/pids
ffff880138ccba40  ffff8800b519b000  cgroup       cgroup       /sys/fs/cgroup/cpuset
ffff880138ccbc00  ffff8800b519d000  cgroup       cgroup       /sys/fs/cgroup/perf_event
ffff880138ccbdc0  ffff8800b519d800  cgroup       cgroup       /sys/fs/cgroup/memory
ffff88012a2ec000  ffff8800b519e000  cgroup       cgroup       /sys/fs/cgroup/freezer
ffff88012a2ec1c0  ffff8800b519e800  cgroup       cgroup       /sys/fs/cgroup/devices
ffff88012a2ec380  ffff8800b519f000  cgroup       cgroup       /sys/fs/cgroup/cpu,cpuacct
ffff88012a2ec540  ffff8800b519f800  cgroup       cgroup       /sys/fs/cgroup/blkio
ffff88012a2ec8c0  ffff88012a377800  configfs     configfs     /sys/kernel/config
ffff880129c608c0  ffff8800b4bfd000  ext4         /dev/nbd0    /
ffff880129c60a80  ffff8800b5222000  rpc_pipefs   rpc_pipefs   /var/lib/nfs/rpc_pipefs
ffff8800b53ce700  ffff88012a371000  autofs       systemd-1    /proc/sys/fs/binfmt_misc
ffff8801377f6700  ffff88012b260800  mqueue       mqueue       /dev/mqueue
ffff8801377f68c0  ffff880129f61800  hugetlbfs    hugetlbfs    /dev/hugepages
ffff880129c60c40  ffff880139947800  debugfs      debugfs      /sys/kernel/debug
ffff880129c60e00  ffff8800b406c800  binfmt_misc  binfmt_misc  /proc/sys/fs/binfmt_misc/
ffff8801377f6c40  ffff880129f60000  ramfs        none         /mnt
ffff880129c60fc0  ffff8800b4068000  tmpfs        none         /var/lib/stateless/writable
ffff8801377f6e00  ffff8800b498d000  squashfs     /dev/vda     /home/green/git/lustre-release
ffff8800b53ce8c0  ffff8800b4068000  tmpfs        none         /var/cache/man
ffff8800b53cea80  ffff8800b4068000  tmpfs        none         /var/log
ffff8800b53cec40  ffff8800b4068000  tmpfs        none         /var/lib/dbus
ffff8800b53cee00  ffff8800b4068000  tmpfs        none         /tmp
ffff8800b53cefc0  ffff8800b4068000  tmpfs        none         /var/lib/dhclient
ffff8800b53cf180  ffff8800b4068000  tmpfs        none         /var/tmp
ffff8800b53cf340  ffff8800b4068000  tmpfs        none         /var/lib/NetworkManager
ffff8801377f6fc0  ffff8800b4068000  tmpfs        none         /var/lib/systemd/random-seed
ffff880129c61340  ffff8800b4068000  tmpfs        none         /var/spool
ffff8800b53cf500  ffff8800b4068000  tmpfs        none         /var/lib/nfs
ffff880129c61500  ffff8800b4068000  tmpfs        none         /var/lib/gssproxy
ffff8800b53cf6c0  ffff8800b4068000  tmpfs        none         /var/lib/logrotate
ffff8800b53cf880  ffff8800b4068000  tmpfs        none         /etc
ffff8800b53cfa40  ffff8800b4068000  tmpfs        none         /var/lib/rsyslog
ffff880129c616c0  ffff8800b4068000  tmpfs        none         /var/lib/dhclient/var/lib/dhclient
ffff880129c61880  ffff8800b5225000  nfs4         192.168.200.253:/exports/state/oleg351-server.virtnet  /var/lib/stateless/state
ffff8800b53cfdc0  ffff8800b5225000  nfs4         192.168.200.253:/exports/state/oleg351-server.virtnet  /boot
ffff8801377f7180  ffff8800b5225000  nfs4         192.168.200.253:/exports/state/oleg351-server.virtnet  /etc/etc/kdump.conf
ffff8800b53cfc00  ffff8800b5222000  rpc_pipefs   sunrpc       /var/lib/nfs/var/lib/nfs/rpc_pipefs
ffff8800b1ed96c0  ffff8800b498d000  squashfs     /dev/vda     /usr/sbin/mount.lustre
ffff88012a2ed340  ffff88012b8eb000  lustre       /dev/mapper/ost1_flakey  /mnt/lustre-ost1
ffff88012a2edc00  ffff88012c2c7800  lustre       /dev/mapper/ost2_flakey  /mnt/lustre-ost2
ffff8800b4bcba40  ffff8800795c7000  tmpfs        tmpfs        /run/user/0
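The table above has the same layout as crash's built-in 'mount' command, so the
addresses in the MOUNT and SUPERBLK columns can be fed straight back into crash
for a closer look. For example, using the SUPERBLK address of the first Lustre
OST mount listed above:

    crash> mount                                # reproduce the table above
    crash> struct super_block ffff88012b8eb000  # superblock of /mnt/lustre-ost1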
                       +-------------------------------+
>----------------------| Last 40 lines of dmesg buffer |----------------------<
                       +-------------------------------+
[ 6286.617332] Lustre: DEBUG MARKER: oleg351-server.virtnet: executing load_modules_local
[ 6293.494690] Lustre: DEBUG MARKER: == sanityn test 110: do not grant another lock on resend ========================================================== 10:13:30 (1713449610)
[ 6295.250217] LustreError: 10053:0:(mdt_handler.c:2463:mdt_getattr_name_lock()) cfs_fail_timeout id 534 sleeping for 14000ms
[ 6295.255324] LustreError: 10053:0:(mdt_handler.c:2463:mdt_getattr_name_lock()) Skipped 3 previous similar messages
[ 6309.259943] LustreError: 10053:0:(mdt_handler.c:2463:mdt_getattr_name_lock()) cfs_fail_timeout id 534 awake
[ 6309.264802] Lustre: 10053:0:(service.c:2359:ptlrpc_server_handle_request()) @@@ Request took longer than estimated (11/3s); client may timeout  req@ffff880121ba0380 x1796682127775296/t0(0) o101->ff4aa31a-b49a-4931-9ff8-5a9956796f6c@192.168.203.51@tcp:529/0 lens 584/640 e 0 to 0 dl 1713449624 ref 1 fl Complete:/200/0 rc 0/0 job:'stat.0' uid:0 gid:0
[ 6309.267242] LustreError: 28609:0:(mdt_handler.c:2463:mdt_getattr_name_lock()) cfs_fail_timeout id 534 sleeping for 25000ms
[ 6309.936728] Lustre: lustre-MDT0000: Client ff4aa31a-b49a-4931-9ff8-5a9956796f6c (at 192.168.203.51@tcp) reconnecting
[ 6309.942029] Lustre: Skipped 1 previous similar message
[ 6316.045932] LustreError: 6944:0:(service.c:2223:ptlrpc_server_handle_req_in()) cfs_fail_timeout id 534 awake
[ 6325.972029] Lustre: lustre-MDT0000: Client ff4aa31a-b49a-4931-9ff8-5a9956796f6c (at 192.168.203.51@tcp) reconnecting
[ 6325.979450] LustreError: 8091:0:(service.c:2223:ptlrpc_server_handle_req_in()) cfs_fail_timeout id 534 sleeping for 6000ms
[ 6325.985318] LustreError: 8091:0:(service.c:2223:ptlrpc_server_handle_req_in()) Skipped 1 previous similar message
[ 6331.989928] LustreError: 8091:0:(service.c:2223:ptlrpc_server_handle_req_in()) cfs_fail_timeout id 534 awake
[ 6334.366926] LustreError: 28609:0:(mdt_handler.c:2463:mdt_getattr_name_lock()) cfs_fail_timeout id 534 awake
[ 6342.004674] Lustre: lustre-MDT0000: Client ff4aa31a-b49a-4931-9ff8-5a9956796f6c (at 192.168.203.51@tcp) reconnecting
[ 6358.047981] Lustre: lustre-MDT0000: Client ff4aa31a-b49a-4931-9ff8-5a9956796f6c (at 192.168.203.51@tcp) reconnecting
[ 6360.836004] Lustre: DEBUG MARKER: == sanityn test 111: A racy rename/link an open file should not cause fs corruption ========================================================== 10:14:38 (1713449678)
[ 6361.363706] LustreError: 28609:0:(mdt_reint.c:1408:mdt_reint_link()) cfs_race id 18a sleeping
[ 6361.370471] LustreError: 28609:0:(mdt_reint.c:1408:mdt_reint_link()) Skipped 31 previous similar messages
[ 6363.389244] LustreError: 19611:0:(mdt_reint.c:2969:mdt_reint_rename()) cfs_fail_race id 18a waking
[ 6363.393390] LustreError: 19611:0:(mdt_reint.c:2969:mdt_reint_rename()) Skipped 1 previous similar message
[ 6363.397805] LustreError: 28609:0:(mdt_reint.c:1408:mdt_reint_link()) cfs_fail_race id 18a awake: rc=2980
[ 6363.402651] LustreError: 28609:0:(mdt_reint.c:1408:mdt_reint_link()) Skipped 31 previous similar messages
[ 6368.973754] Lustre: DEBUG MARKER: == sanityn test 112: update max-inherit in default LMV === 10:14:46 (1713449686)
[ 6375.371340] Lustre: DEBUG MARKER: == sanityn test 113: check servers of specified fs ======= 10:14:52 (1713449692)
[ 6379.967334] Lustre: DEBUG MARKER: == sanityn test 114: implicit default LMV inherit ======== 10:14:57 (1713449697)
[ 6390.091336] Lustre: DEBUG MARKER: == sanityn test 115: ldiskfs doesn't check direntry for uniqueness ========================================================== 10:15:07 (1713449707)
[ 6398.771689] LustreError: 29909:0:(mdt_reint.c:622:mdt_create()) cfs_fail_timeout id 2401 sleeping for 5000ms
[ 6403.775935] LustreError: 29909:0:(mdt_reint.c:622:mdt_create()) cfs_fail_timeout id 2401 awake
[ 6406.885221] Lustre: DEBUG MARKER: cleanup: ======================================================
[ 6407.583677] Lustre: DEBUG MARKER: == sanityn test complete, duration 6292 sec ============== 10:15:24 (1713449724)
[ 6513.416853] LustreError: 11-0: lustre-MDT0000-osp-MDT0001: operation out_update to node 0@lo failed: rc = -107
[ 6513.421592] Lustre: lustre-MDT0000-osp-MDT0001: Connection to lustre-MDT0000 (at 0@lo) was lost; in progress operations using this service will wait for recovery to complete
[ 6513.428968] Lustre: lustre-MDT0000: Not available for connect from 0@lo (stopping)
[ 6515.229636] Lustre: server umount lustre-MDT0000 complete
[ 6518.153780] LustreError: 6929:0:(ldlm_lockd.c:2594:ldlm_cancel_handler()) ldlm_cancel from 0@lo arrived at 1713449836 with bad export cookie 13305530880106328799
[ 6518.155325] LustreError: 166-1: MGC192.168.203.151@tcp: Connection to MGS (at 0@lo) was lost; in progress operations using this service will fail
[ 6518.166350] LustreError: 6929:0:(ldlm_lockd.c:2594:ldlm_cancel_handler()) Skipped 4 previous similar messages
[ 6518.384436] Lustre: server umount lustre-MDT0001 complete
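Only the last 40 lines of the kernel log are shown above. The full buffer can
be recovered from the dump with crash's built-in 'log' command:

    crash> log > full-dmesg.txt   # write the complete dmesg buffer to a file
    crash> log -t                 # same text without the [ seconds ] timestamps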
******************************************************************************
************************ A Summary Of Problems Found *************************
******************************************************************************

-------------------- A list of all +++WARNING+++ messages --------------------
   PARTIAL DUMP with size(vmcore) < 25% size(RAM)
   There are 3 threads running in their own namespaces
        Use 'taskinfo --ns' to get more details
------------------------------------------------------------------------------

 ** Execution took  10.79s (real)  5.67s (CPU), Child processes:  5.10s
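On the PARTIAL DUMP warning: a vmcore much smaller than RAM usually just means
makedumpfile page filtering was in effect when kdump captured the dump, which
is the normal configuration rather than a defect. On a RHEL/CentOS 7 node this
is controlled by the core_collector line in /etc/kdump.conf; the stock default
looks like the following (illustrative, not read from this dump):

    # /etc/kdump.conf
    core_collector makedumpfile -l --message-level 1 -d 31

Dump level 31 excludes zero, page-cache, user, and free pages, so a small
vmcore is expected; lowering the dump level makes future dumps more complete
at the cost of size.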