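I have a problem where, after the server has been up for a while (roughly days to weeks), it starts reading corrupted data. For example, right after a fresh boot, running sha1sum on a file gives the same result every time. After a while, though, I start getting segfaults, and from then on every time I read that file I get a different sha1sum.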
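The check described above can be automated so that the first mismatch is noticed. This is only a minimal sketch, not something from my actual setup: the function names `sha1_of_file` and `watch_for_corruption`, the chunk size, and the polling interval are all assumptions made for illustration.

```python
import hashlib
import time


def sha1_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-1 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def watch_for_corruption(path, baseline, rounds, interval_seconds=0.0):
    """Re-hash `path` `rounds` times and collect every digest that differs.

    `baseline` is a digest recorded while the machine was known to be healthy
    (hypothetically, right after a fresh boot). Returns a list of
    (round_index, digest) tuples, one per mismatch.
    """
    mismatches = []
    for index in range(rounds):
        current = sha1_of_file(path)
        if current != baseline:
            mismatches.append((index, current))
        time.sleep(interval_seconds)
    return mismatches
```

One way to use it would be to record `sha1_of_file(path)` once after a reboot, then call `watch_for_corruption(path, baseline, rounds, interval_seconds)` with a long interval and log the timestamps of any mismatches it returns.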
I have checked S.M.A.R.T. with the long self-tests, and I have run an extended memtest86 (12 passes).
My lspci output is as follows:
00:00.0 Host bridge: Advanced Micro Devices [AMD] RS780 Host Bridge
00:01.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (int gfx)
00:06.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 2)
00:07.0 PCI bridge: Advanced Micro Devices [AMD] RS780 PCI to PCI bridge (PCIE port 3)
00:11.0 SATA controller: ATI Technologies Inc SB700/SB800 SATA Controller [AHCI mode]
00:12.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:12.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:12.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:13.0 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI0 Controller
00:13.1 USB Controller: ATI Technologies Inc SB700 USB OHCI1 Controller
00:13.2 USB Controller: ATI Technologies Inc SB700/SB800 USB EHCI Controller
00:14.0 SMBus: ATI Technologies Inc SBx00 SMBus Controller (rev 3c)
00:14.1 IDE interface: ATI Technologies Inc SB700/SB800 IDE Controller
00:14.3 ISA bridge: ATI Technologies Inc SB700/SB800 LPC host controller
00:14.4 PCI bridge: ATI Technologies Inc SBx00 PCI to PCI Bridge
00:14.5 USB Controller: ATI Technologies Inc SB700/SB800 USB OHCI2 Controller
00:18.0 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron,Athlon64,Sempron] HyperTransport Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron,Sempron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron,Sempron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron,Sempron] Miscellaneous Control
00:18.4 Host bridge: Advanced Micro Devices [AMD] K10 [Opteron,Sempron] Link Control
01:05.0 VGA compatible controller: ATI Technologies Inc Radeon HD 3300 Graphics
01:05.1 Audio device: ATI Technologies Inc RS780 Azalia controller
02:00.0 Ethernet controller: Atheros Communications Atheros AR8121/AR8113/AR8114 PCI-E Ethernet Controller (rev b0)
03:00.0 FireWire (IEEE 1394): VIA Technologies, Inc. Device 3403
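I could really use some help with this. Do you have any idea what could be causing it? It is really frustrating, because it seems to trigger completely at random and does not go away until I reboot. I am also using KVM for virtualization and MD software RAID on this server, and the processor is a Phenom II X4 965. I don't think the software RAID is to blame, because the corruption also affects files hosted on non-RAID partitions, but I can't be sure.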
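Update, June 21
OK, I just replaced the motherboard. Still the same error. I can't find any CPU errors, and the disks all report healthy in the SMART tests. Does anyone know what this could be? I'm pulling my hair out here.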