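I. Problem description
A disk on one of our batch-processing servers could no longer be read or written normally and reported input/output error. The server is used every day. After asking around, I learned that the server had first suffered a disk failure and that this error began appearing after the disk was replaced (the RAID had already resynchronized normally).
II. Troubleshooting
When a problem like this appears, check the logs first: collect them and analyze them. The relevant kernel log lines were: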
[12922471.544897] smartpqi 0000:5e:00.0: reset of scsi 14:1:0:3: SUCCESS
[12922471.545034] sd 14:1:0:3: [sdd] Medium access timeout failure. Offlining disk!
…
[12922471.546144] blk_update_request: I/O error, dev sdd, sector 2351217920
[12922471.546473] sd 14:1:0:3: rejecting I/O to offline device
[12922471.547836] XFS (sdd1): metadata I/O error: block 0x8bbac400 ("xlog_iodone") error 5 numblks 512
[12922471.547840] XFS (sdd1): xfs_do_force_shutdown(0x2) called from line 1200 of file fs/xfs/xfs_log.c. Return address = 0xffffffffc07a1ea0
[12922471.547866] XFS (sdd1): Log I/O Error Detected. Shutting down filesystem
[12922471.547868] XFS (sdd1): Please umount the filesystem and rectify the problem(s)
[12922471.547870] XFS (sdd1): metadata I/O error: block 0x8bbac600 ("xlog_iodone") error 5 numblks 512
[12922471.547872] XFS (sdd1): xfs_do_force_shutdown(0x2) called from line 1200 of file fs/xfs/xfs_log.c. Return address = 0xffffffffc07a1ea0
[12922471.547891] XFS (sdd1): metadata I/O error: block 0x2bc1a6c0 ("xfs_trans_read_buf_map") error 5 numblks 32
[12922471.547898] XFS (sdd1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[12922471.548349] XFS (sdd1): metadata I/O error: block 0xc65b63f8 ("xfs_trans_read_buf_map") error 5 numblks 8
[12922471.548390] XFS (sdd1): metadata I/O error: block 0x8bdb5820 ("xfs_trans_read_buf_map") error 5 numblks 32
[12922471.548408] XFS (sdd1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
[12922471.548412] XFS (sdd1): metadata I/O error: block 0x11771540 ("xfs_trans_read_buf_map") error 5 numblks 32
[12922471.548417] XFS (sdd1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
…
[15351852.339037] sd 14:1:0:3: rejecting I/O to offline device
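- The log shows that the kernel has taken the disk (sdd) offline after a medium access timeout, and that the XFS filesystem on sdd1 has shut itself down after the resulting log I/O errors.
III. Solution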
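When the kernel log is long, it can help to filter it down to the lines that matter. The following is a minimal sketch, not part of the original procedure: the function names `filter_disk_errors` and `offlined_devices` are made up for illustration, and the patterns are taken only from the messages shown in the log above, so other drivers or kernel versions may word things differently.

```shell
#!/bin/sh
# Keep only the kernel log lines that show a disk going offline,
# block-layer I/O errors, or an XFS forced shutdown.
# Reads log text from stdin, so it works with `dmesg` or a saved log file.
filter_disk_errors() {
    grep -E 'Offlining disk|rejecting I/O to offline device|I/O error|Shutting down filesystem'
}

# Print the distinct sdX names that the kernel reported as offlined,
# e.g. "sdd" for a line containing "[sdd] ... Offlining disk!".
offlined_devices() {
    sed -n 's/.*\[\(sd[a-z]*\)\] .*Offlining disk.*/\1/p' | sort -u
}
```

Typical use would be `dmesg | filter_disk_errors` to get a short view of the failure, and `dmesg | offlined_devices` to list which disks need attention.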
- 1. Manually set the disk back to online (as root)
# echo running > /sys/block/sdd/device/state
- 2. Check that the state is now running
cat /sys/block/sdd/device/state
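- 3. Before repairing the filesystem, make sure it is unmounted (this depends on the situation: if it cannot be unmounted, the only option is to reboot, which is what I did)
- 4. Start the repair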
XFS : Corruption detected. Unmount and run xfs_repair
The official documentation is here: https://access.redhat.com/sol…
- 5. Once the repair has completed following the method above, mount the filesystem again
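The five steps above can be sketched as a single script. This is only an illustration under stated assumptions: the device `sdd` and partition `/dev/sdd1` come from the log above, but the mount point `/data` is hypothetical, and the helper names `run` and `recover_xfs` are invented here. It defaults to a dry run that prints each command instead of executing it, because these commands need root and act on a real disk. The `xfs_repair -n` call is the tool's no-modify mode, used here to preview problems before the real repair.

```shell
#!/bin/sh
# Sketch of the recovery sequence. Adjust DEV, PART and MNT for your system.
DEV=${DEV:-sdd}
PART=${PART:-/dev/sdd1}
MNT=${MNT:-/data}            # hypothetical mount point, not from the original post
DRY_RUN=${DRY_RUN:-1}        # 1 = only print commands; set to 0 to execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

recover_xfs() {
    # Steps 1-2: set the SCSI device state back to running, then read it back.
    run sh -c "echo running > /sys/block/$DEV/device/state"
    run cat "/sys/block/$DEV/device/state"
    # Step 3: the filesystem must be unmounted before it can be repaired.
    run umount "$PART" || return 1
    # Step 4: check in no-modify mode first, then run the actual repair.
    run xfs_repair -n "$PART"
    run xfs_repair "$PART"
    # Step 5: mount the repaired filesystem again.
    run mount "$PART" "$MNT"
}
```

Calling `recover_xfs` with the defaults prints the six commands for review. If `umount` fails because the filesystem is busy, the function stops there, which matches the case in step 3 where a reboot was needed.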