NFS performance issue on Oracle Linux 6.7: shared libraries on NFS slow down the system

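I am currently hunting a performance issue on some VMware virtual machines with identical memory and CPU, running Oracle Linux 6.6 (kernel 2.6) and 6.7 (kernel 3.8). The machines mount the same share via NFS4; it contains some shared libraries that are pulled into the build through LD_LIBRARY_PATH. Both systems use the same mount options (the defaults), which apparently means "hard" on 6.7 and "soft" on 6.6. Since moving to 6.7 we have seen our compile process slow down by a factor of 5 while the CPU idles at around 80%, yet no high I/O wait shows up either (top reports ~0.4% wa).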

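While trying to reproduce the issue I soon found that not only the compile but almost any command, e.g. "ls", is much slower on 6.7 as soon as shared libraries from the NFS mount are included via LD_LIBRARY_PATH.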

I started investigating with a simple "time":

On 6.6:
Without LD_LIBRARY_PATH and PATH set:

$ time for i in $(seq 0 1000); do ls; done
... ls output 
real    0m2.917s
user    0m0.288s
sys     0m1.012s

With LD_LIBRARY_PATH and PATH set to include the directories on NFS:

$ time for i in $(seq 0 1000); do ls; done
... ls output
real    0m2.766s
user    0m0.184s
sys     0m1.051s

On 6.7 without LD_LIBRARY_PATH:

$ time for i in $(seq 0 1000); do ls; done
...
real    0m5.144s
user    0m0.280s
sys     0m1.172s

And with LD_LIBRARY_PATH:

$ time for i in $(seq 0 1000); do ls; done
...
real    1m27.329s
user    0m0.537s
sys     0m1.792s
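To compare both settings on one host in a single run, the measurement above could be wrapped roughly like this. It is only a sketch: time_loop is a helper name made up for illustration, and it reports whole seconds, which is coarse but enough to tell a few seconds from a minute and a half.

```shell
# time_loop N CMD...: run CMD N times and print the elapsed wall-clock
# time in whole seconds. (Hypothetical helper, not from the original post.)
time_loop() {
    n=$1; shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" >/dev/null 2>&1      # discard the command's own output
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}
```

Running `( unset LD_LIBRARY_PATH; time_loop 1000 ls )` and then `( export LD_LIBRARY_PATH=/path/on/nfs; time_loop 1000 ls )` gives the two numbers side by side; the subshells keep the variable change from leaking into the interactive shell, and /path/on/nfs stands in for the real directories.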

The huge overhead made me curious, and I found that resolving some of the shared libraries on the NFS share takes a very long time:

Again, without LD_LIBRARY_PATH set, the "open" calls in the strace output look like this:

$strace -T ls 2>&1|vim - # keep only the "open" calls

open("/etc/ld.so.cache",O_RDONLY)      = 3 <0.000014>
open("/lib64/libselinux.so.1",O_RDONLY) = 3 <0.000013>
open("/lib64/librt.so.1",O_RDONLY)     = 3 <0.000016>
open("/lib64/libcap.so.2",O_RDONLY)    = 3 <0.000014>
open("/lib64/libacl.so.1",O_RDONLY)    = 3 <0.000014>
open("/lib64/libc.so.6",O_RDONLY)      = 3 <0.000016> 
open("/lib64/libdl.so.2",O_RDONLY)     = 3 <0.000014>
open("/lib64/libpthread.so.0",O_RDONLY) = 3 <0.000014>
open("/lib64/libattr.so.1",O_RDONLY)   = 3 <0.000014>
open("/proc/filesystems",O_RDONLY)     = 3 <0.000032>
open("/usr/lib/locale/locale-archive",O_RDONLY) = 3 <0.000014>
open(".",O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3 <0.001255>

With LD_LIBRARY_PATH it looks like this:

open("/usr/local/lib/librt.so.1",O_RDONLY) = -1 ENOENT (No such file or directory) <0.000013>
open("/lib64/librt.so.1",O_RDONLY)     = 3 <0.000018>
open("/oracle/current/lib/libcap.so.2",O_RDONLY) = -1 ENOENT (No such file or directory) <0.006196>
open("/opt/development/opt/gcc/gcc-5.3.0/lib64/libcap.so.2",O_RDONLY) = -1 ENOENT (No such file or directory) <0.002042>
open("/usr/local/lib/libcap.so.2",O_RDONLY) = -1 ENOENT (No such file or directory) <0.000035>
open("/lib64/libcap.so.2",O_RDONLY)    = 3 <0.000039>
open("/oracle/current/lib/libacl.so.1",O_RDONLY) = -1 ENOENT (No such file or directory) <0.009304>
open("/opt/development/opt/gcc/gcc-5.3.0/lib64/libacl.so.1",O_RDONLY) = -1 ENOENT (No such file or directory) <0.009107>
open("/usr/local/lib/libacl.so.1",O_RDONLY) = -1 ENOENT (No such file or directory) <0.000023>
open("/lib64/libacl.so.1",O_RDONLY)    = 3 <0.000027>
open("/oracle/current/lib/libc.so.6",O_RDONLY) = -1 ENOENT (No such file or directory) <0.009520>
open("/opt/development/opt/gcc/gcc-5.3.0/lib64/libc.so.6",O_RDONLY) = -1 ENOENT (No such file or directory) <0.007850>
open("/usr/local/lib/libc.so.6",O_RDONLY) = -1 ENOENT (No such file or directory) <0.000024>
open("/lib64/libc.so.6",O_RDONLY)      = 3 <0.000030>
open("/oracle/current/lib/libdl.so.2",O_RDONLY) = -1 ENOENT (No such file or directory) <0.006916>
open("/opt/development/opt/gcc/gcc-5.3.0/lib64/libdl.so.2",O_RDONLY) = -1 ENOENT (No such file or directory) <0.013979>
open("/usr/local/lib/libdl.so.2",O_RDONLY) = -1 ENOENT (No such file or directory) <0.000023>
open("/lib64/libdl.so.2",O_RDONLY)     = 3 <0.000030>
open("/oracle/current/lib/libpthread.so.0",O_RDONLY) = -1 ENOENT (No such file or directory) <0.015317>
open("/opt/development/opt/gcc/gcc-5.3.0/lib64/libpthread.so.0",O_RDONLY) = -1 ENOENT (No such file or directory) <0.014230>
open("/usr/local/lib/libpthread.so.0",O_RDONLY) = -1 ENOENT (No such file or directory) <0.000014>
open("/lib64/libpthread.so.0",O_RDONLY) = 3 <0.000019>
open("/oracle/current/lib/libattr.so.1",O_RDONLY) = -1 ENOENT (No such file or directory) <0.015212>
open("/opt/development/opt/gcc/gcc-5.3.0/lib64/libattr.so.1",O_RDONLY) = -1 ENOENT (No such file or directory) <0.011979>
open("/usr/local/lib/libattr.so.1",O_RDONLY) = -1 ENOENT (No such file or directory) <0.000014>
open("/lib64/libattr.so.1",O_RDONLY)   = 3 <0.000018>
open("/proc/filesystems",O_RDONLY)     = 3 <0.000025>
open(".",O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3 <0.000014>

Apart from there being more calls, which is the same on 6.6, some of the (unsuccessful) opens take an extremely long time on 6.7 (up to 0.01 s compared to ~0.000020 s).
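To see at a glance which search directories cost the time, the failed opens can be totalled per directory. The following is a sketch under the assumption that the input is `strace -T` output in the format shown above; summarize_enoent is a made-up name.

```shell
# summarize_enoent: read `strace -T` output on stdin and print, per
# directory, the number of failed (ENOENT) open calls and the total
# time spent in them, slowest directory first.
summarize_enoent() {
    awk '
        /^open(at)?\(/ && / ENOENT / {
            split($0, q, "\"")             # q[2] is the quoted path
            dir = q[2]
            sub("/[^/]*$", "", dir)        # strip the file name
            t = $NF                        # trailing <seconds> from -T
            gsub(/[<>]/, "", t)
            count[dir]++
            total[dir] += t
        }
        END {
            for (d in count)
                printf "%-50s %5d %10.6f\n", d, count[d], total[d]
        }
    ' | sort -k3,3 -n -r
}
```

A possible invocation is `strace -T -e trace=open,openat ls 2>&1 >/dev/null | summarize_enoent`. On the trace above this would put the two slow directories at the top and /usr/local/lib, whose lookups fail just as often but cost almost nothing, at the bottom.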

So I started investigating NFS, and nfsstat does indeed show some surprisingly different statistics for the two systems:

On 6.7

$nfsstat
Client rpc stats:
calls      retrans    authrefrsh
1314991    0          1315849 

Client nfs v4:
null         read         write        commit       open         open_conf    
0         0% 3782      0% 1589      0% 1         0% 561257   42% 53        0% 
open_noat    open_dgrd    close        setattr      fsinfo       renew        
0         0% 0         0% 4750      0% 383       0% 7         0% 4094      0% 
setclntid    confirm      lock         lockt        locku        access       
2         0% 2         0% 80        0% 0         0% 80        0% 538017   40% 
getattr      lookup       lookup_root  remove       rename       link         
172506   13% 20536     1% 2         0% 112       0% 541       0% 2         0% 
symlink      create       pathconf     statfs       readlink     readdir      
0         0% 9         0% 5         0% 2057      0% 164       0% 942       0% 
server_caps  delegreturn  getacl       setacl       fs_locations rel_lkowner  
12        0% 2968      0% 0         0% 0         0% 0         0% 80        0% 
secinfo      exchange_id  create_ses   destroy_ses  sequence     get_lease_t  
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
reclaim_comp layoutget    getdevinfo   layoutcommit layoutreturn getdevlist   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
(null)       
0         0%

On 6.6

$nfsstat 
Client rpc stats:
calls      retrans    authrefrsh
637725     0          637781  

Client nfs v4:
null         read         write        commit       open         open_conf    
0         0% 23782     3% 13127     2% 48        0% 41244     6% 406       0% 
open_noat    open_dgrd    close        setattr      fsinfo       renew        
0         0% 0         0% 31228     4% 14668     2% 7         0% 27319     4% 
setclntid    confirm      lock         lockt        locku        access       
1         0% 1         0% 8493      1% 2         0% 8459      1% 175320   27% 
getattr      lookup       lookup_root  remove       rename       link         
134732   21% 112688   17% 2         0% 1007      0% 6728      1% 4         0% 
symlink      create       pathconf     statfs       readlink     readdir      
11        0% 129       0% 5         0% 7624      1% 143       0% 11507     1% 
server_caps  delegreturn  getacl       setacl       fs_locations rel_lkowner  
12        0% 12732     1% 0         0% 0         0% 0         0% 6335      0% 
secinfo      exchange_id  create_ses   destroy_ses  sequence     get_lease_t  
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
reclaim_comp layoutget    getdevinfo   layoutcommit layoutreturn getdevlist   
0         0% 0         0% 0         0% 0         0% 0         0% 0         0% 
(null)       
0         0%

This seems to confirm the long open times on 6.7, but I really don't know how to tackle this through mount options or the like.
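Because the nfsstat counters are cumulative since boot, the totals above mix the test with everything else the hosts have done. A before/after difference around a single command is easier to interpret. Here is a sketch of a small parser for the "Client nfs v4" section; nfs_op_count is a hypothetical helper and assumes the name-line/value-line layout shown above.

```shell
# nfs_op_count OP: read nfsstat output on stdin and print the call count
# for one NFSv4 client operation (e.g. open, access, getattr).
nfs_op_count() {
    awk -v op="$1" '
        /^Client nfs v4/ { in_v4 = 1; next }
        !in_v4           { next }
        NF == 0          { in_v4 = 0; next }    # blank line ends the section
        !have_names      { n = split($0, names, " "); have_names = 1; next }
        {
            # Value line: "<count> <percent>%" pairs, one pair per name.
            for (i = 1; i <= n; i++)
                if (names[i] == op) print $(2 * i - 1)
            have_names = 0
        }
    '
}
```

For example, storing `nfsstat -c | nfs_op_count open` in a variable before and after one `ls` and subtracting the two values shows how many OPEN calls that single command sent to the server.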

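Further experiments showed that 6.7 hosts always exhibit the same slow performance with NFS4, even when the NFS share is mounted from inside a Docker container running an up-to-date OS (CentOS 7, Ubuntu 16.04), which I did to rule out problems with nfs-utils and the like. With NFS3, performance on 6.7 is as good as on 6.6.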

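At this point I suspect that either the kernel (module) or the vmware-tools on the underlying host cause the problem, but I am out of ideas on how to dig further.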

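Is this a known issue?
Am I missing some possible culprit?
How would you dig further?
Is there a way to profile the NFS client?
How can I rule out the VMware drivers as the cause?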

And of course: does anyone have a simple solution for me?

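EDIT: This morning I dug in a different direction.
Using tcpdump I checked the NFS traffic again, and it looks like no caching happens on 6.7. Every access to a (non-existing) shared library results in actual NFS traffic; since the LD_LIBRARY_PATH directories do not contain most of the libraries, the reply is usually: reply ok 52 getattr ERROR: No such file or directory. On 6.6 only the first access causes actual traffic.
Knowing this, I was able to work around the basic performance problem for standard commands like "ls" by moving the NFS paths out of LD_LIBRARY_PATH and into an extra ld.so.conf file listing the libraries needed for our compile process. However, this is still only a workaround, and the real problem now appears to be that no caching happens in the NFS client. So I tried again to activate the filesystem cache on NFS as suggested here, but still every "open" causes NFS traffic and compiling remains unacceptably slow.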
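The ld.so.conf workaround could look roughly like the sketch below. The helper name write_ld_conf, the file name nfs-libs.conf and the choice of directories (taken from the strace output above) are assumptions for illustration, not the exact setup used.

```shell
# write_ld_conf FILE DIR...: write one library directory per line to FILE,
# which is the format ldconfig expects in /etc/ld.so.conf.d/*.conf.
write_ld_conf() {
    conf=$1; shift
    printf '%s\n' "$@" > "$conf"
}

# Example, to be run as root (file name and directories are assumed):
#   write_ld_conf /etc/ld.so.conf.d/nfs-libs.conf \
#       /oracle/current/lib /opt/development/opt/gcc/gcc-5.3.0/lib64
#   ldconfig                 # rebuild /etc/ld.so.cache
#   unset LD_LIBRARY_PATH    # no more per-library probing of the NFS dirs
```

This helps because the dynamic loader finds libraries listed in /etc/ld.so.cache by name instead of trying every directory in turn, so the failing opens on NFS disappear. One caveat: LD_LIBRARY_PATH is searched before the cache, so moving directories into the cache can change which copy of a library wins when it exists in more than one place.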

As requested by shodanshok:

On 6.6

server:/share /mnt nfs4 rw,relatime,vers=4,rsize=1048576,wsize=1048576,namlen=255,soft,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=<clientip>,minorversion=0,local_lock=none,addr=<serverip> 0 0

On 6.7 (after activating fsc)

server:/share /mnt nfs4 ro,vers=4.0,fsc,addr=<serverip> 0 0

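Neither machine uses nscd. I did, however, install cachefilesd on both machines some time ago to see whether it would help performance, which it did not. Currently the cache on 6.6 is not even active (/var/cache/fscache/* is empty). On 6.7 it has contained 3 files since I started using fsc in the mount options this morning, but it does not seem to cache non-existing shared library paths, so performance has not changed.
For the non-existing files I would expect acregmin and friends to have an effect, but although their (default) values look reasonable to me, they do not seem to make any difference.

Output of mountstats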

Stats for server:/share mounted on /mnt:
NFS mount options: rw,acregmin=3,acregmax=60,acdirmin=30,acdirmax=60,clientaddr=10.114.160.159,local_lock=none
NFS server capabilities: caps=0xfff7,wtmult=512,dtsize=32768,bsize=0,namlen=255
NFSv4 capability flags: bm0=0xfdffafff,bm1=0xf9be3e,acl=0x0,pnfs=notconfigured
NFS security flavor: 1  pseudoflavor: 0

Does anybody know what these flags (i.e. caps, bm0, bm1, ...) mean?

Sanitized ps ax output:
On 6.6

PID TTY      STAT   TIME COMMAND
  1 ?        Ss     0:01 /sbin/init
  2 ?        S      0:00 [kthreadd]
  3 ?        S      0:05 [ksoftirqd/0]
  6 ?        S      0:00 [migration/0]
  7 ?        S      0:10 [watchdog/0]
 37 ?        S<     0:00 [cpuset]
 38 ?        S<     0:00 [khelper]
 39 ?        S<     0:00 [netns]
 40 ?        S      0:06 [sync_supers]
 41 ?        S      0:00 [bdi-default]
 42 ?        S<     0:00 [kintegrityd]
 43 ?        S<     0:00 [kblockd]
 50 ?        S      0:49 [kworker/1:1]
 51 ?        S<     0:00 [xenbus_frontend]
 52 ?        S<     0:00 [ata_sff]
 53 ?        S      0:00 [khubd]
 54 ?        S<     0:00 [md]
 55 ?        S      0:01 [khungtaskd]
 56 ?        S      0:04 [kswapd0]
 57 ?        SN     0:00 [ksmd]
 58 ?        S      0:00 [fsnotify_mark]
 59 ?        S<     0:00 [crypto]
 64 ?        S<     0:00 [kthrotld]
 66 ?        S<     0:00 [kpsmoused]
240 ?        S      0:00 [scsi_eh_0]
241 ?        S      0:00 [scsi_eh_1]
248 ?        S<     0:00 [mpt_poll_0]
249 ?        S<     0:00 [mpt/0]
250 ?        S      0:00 [scsi_eh_2]
313 ?        S<     0:00 [kdmflush]
325 ?        S      0:00 [kjournald]
445 ?        S<s    0:00 /sbin/udevd -d
706 ?        S<     0:00 [vmmemctl]
815 ?        S<     0:00 [kdmflush]
865 ?        S      0:08 [kjournald]
907 ?        S      0:00 [kauditd]
1091 ?        S      0:11 [flush-252:2]
1243 ?        S     26:05 /usr/sbin/vmtoolsd
1311 ?        Ssl    0:03 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
1334 ?        Ss     0:06 cachefilesd -f /etc/cachefilesd.conf
1361 ?        Ss     6:55 irqbalance --pid=/var/run/irqbalance.pid
1377 ?        Ss     0:02 rpcbind
1397 ?        Ss     0:00 rpc.statd
1428 ?        S<     0:00 [rpciod]
1433 ?        Ss     0:00 rpc.idmapd 
1507 ?        S<     0:00 [nfsiod]
1508 ?        S      0:00 [nfsv4.0-svc]  
1608 ?        Ss     0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid
1783 ?        Ss     0:11 crond
1796 ?        Ss     0:00 /usr/sbin/atd
1807 ?        Ss     0:01 rhnsd
1989 ?        S     99:05 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr
4879 ?        S      0:00 /bin/sh /etc/xrdp/startwm.sh
4904 ?        Ss     0:02 /bin/dbus-daemon --fork --print-pid 5 --print-address 7 --session
4924 ?        S     60:14 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr

And on 6.7:

PID TTY      STAT   TIME COMMAND
  1 ?        Ss     0:01 /sbin/init
  3 ?        S      0:10 [ksoftirqd/0]
  5 ?        S<     0:00 [kworker/0:0H]
  8 ?        S      0:19 [migration/0]
 11 ?        S      0:02 [watchdog/0]
 47 ?        S<     0:00 [cpuset]
 48 ?        S<     0:00 [khelper]
 49 ?        S      0:00 [kdevtmpfs]
 50 ?        S<     0:00 [netns]
 51 ?        S      0:00 [bdi-default]
 52 ?        S<     0:00 [kintegrityd]
 53 ?        S<     0:00 [crypto]
 54 ?        S<     0:00 [kblockd]
 62 ?        S<     0:00 [ata_sff]
 63 ?        S      0:00 [khubd]
 64 ?        S<     0:00 [md]
 66 ?        S      0:00 [khungtaskd]
 67 ?        S      0:36 [kswapd0]
 68 ?        SN     0:00 [ksmd]
 69 ?        S      0:00 [fsnotify_mark]
 80 ?        S<     0:00 [kthrotld]
 84 ?        S<     0:00 [deferwq]
151 ?        S<     0:00 [ttm_swap]
273 ?        S      0:00 [scsi_eh_0]
274 ?        S      0:00 [scsi_eh_1]
281 ?        S<     0:00 [mpt_poll_0]
282 ?        S<     0:00 [mpt/0]
283 ?        S      0:00 [scsi_eh_2]
374 ?        S<     0:00 [kdmflush]
387 ?        S      0:00 [kjournald]
480 ?        S<s    0:00 /sbin/udevd -d
872 ?        S<     0:00 [kworker/2:1H]
1828 ?        S<     0:00 [kdmflush]
1834 ?        S<     0:00 [kdmflush]
1837 ?        S<     0:00 [kdmflush]
1840 ?        S<     0:00 [kdmflush]
1881 ?        S      0:00 [kjournald]
1882 ?        S      0:03 [kjournald]
1883 ?        S      0:03 [kjournald]
1884 ?        S      3:14 [kjournald]
1926 ?        S      0:00 [kauditd]
2136 ?        S      1:37 [flush-252:1]
2137 ?        S      0:02 [flush-252:2]
2187 ?        S      5:04 /usr/sbin/vmtoolsd
2214 ?        S      0:00 /usr/lib/vmware-vgauth/VGAuthService -s
2264 ?        Sl     1:54 ./ManagementAgentHost
2327 ?        Ssl    0:00 /sbin/rsyslogd -i /var/run/syslogd.pid -c 5
2368 ?        Ss     0:00 rpcbind
2390 ?        Ss     0:00 rpc.statd
2425 ?        S<     0:00 [rpciod]
2430 ?        Ss     0:00 rpc.idmapd
2456 ?        Ss     0:00 dbus-daemon --system
2473 ?        S      0:00 [kworker/7:2]
2501 ?        S<     0:00 [nfsiod]
2504 ?        S      0:00 [nfsv4.0-svc]
2519 ?        Ss     0:00 /usr/sbin/acpid
2531 ?        Ssl    0:02 hald
2532 ?        S      0:00 hald-runner
2564 ?        S      0:00 hald-addon-input: Listening on /dev/input/event1 /dev/input/event0
2580 ?        S      0:00 hald-addon-acpi: listening on acpid socket /var/run/acpid.socket
2618 ?        Ss     0:00 /usr/sbin/sshd
2629 ?        Ss     0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid 
2778 ?        S      0:00 qmgr -l -t fifo -u
2811 ?        S      0:56 /usr/bin/python /usr/sbin/osad --pid-file /var/run/osad.pid
2887 ?        S<     0:00 [dm_bufio_cache]
3008 ?        Ss     0:00 rhnsd
3117 ?        S      9:44 /usr/lib/vmware-tools/sbin64/vmtoolsd -n vmusr
3195 ?        S      0:00 /usr/libexec/polkit-1/polkitd
3825 ?        S<     0:17 [loop0]
3828 ?        S<     0:21 [loop1]
3830 ?        S<     0:00 [kdmflush]
3833 ?        S<     0:00 [kcopyd]
3834 ?        S<     0:00 [dm-thin]
6876 ?        S      0:00 (unlinkd)
19358 ?        S      0:00 [flush-0:19]
24484 ?        S<     0:00 /sbin/udevd -d
24921 ?        S<     0:00 /sbin/udevd -d
26201 ?        Ss     0:00 cachefilesd -f /etc/cachefilesd.conf
29311 ?        S<     0:00 [kdmflush]
29316 ?        S      0:00 [jbd2/dm-6-8]
29317 ?        S<     0:00 [ext4-dio-unwrit]

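A few days ago I was sure I had found the problem when comparing the output of sysctl -a on both systems revealed a difference in fs.nfs.idmap_cache_timeout, which is set to 600 on 6.6 and to 0 on 6.7. Changing it, however, did not have the desired effect either.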

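I did find another useful command, though: rpcdebug -m nfs -s all prints a lot of debug information about caching to the system log (/var/log/messages in my case). Most of the entries produced by my ls look like this: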

Feb 27 10:45:16 host kernel: NFS: nfs_lookup_revalidate(//opt) is valid
Feb 27 10:45:16 host kernel: NFS: nfs_lookup_revalidate(opt/gcc) is valid
Feb 27 10:45:16 host kernel: NFS: nfs_lookup_revalidate(gcc/gcc-5.3.0) is valid
Feb 27 10:45:16 host kernel: NFS: nfs_lookup_revalidate(gcc-5.3.0/lib64) is valid

with several such blocks per second (even with lookupcache=all).
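To put a number on "several per second", the debug lines can be counted per timestamp. This is a sketch assuming the syslog format shown above (month, day and time in the first three fields); revalidate_per_second is a made-up name.

```shell
# revalidate_per_second: read syslog lines on stdin and print how many
# nfs_lookup_revalidate messages were logged in each second.
revalidate_per_second() {
    awk '/nfs_lookup_revalidate/ { c[$1 " " $2 " " $3]++ }
         END { for (t in c) print t, c[t] }' | sort
}
```

A typical session would enable debugging with `rpcdebug -m nfs -s all`, run the test, switch it off again with `rpcdebug -m nfs -c all`, and then run `revalidate_per_second < /var/log/messages`.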

Cheers!

By the way, be careful when using soft mounts:

With soft-mounted filesystems, you have to worry about damaging data due to incomplete writes, losing access to the text segment of a swapped process, and making soft-mounted filesystems more tolerant of variances in server response time.

To guarantee data integrity, all filesystems mounted read-write should be hard-mounted.

Soft mount issues

Using hard,intr is recommended on all NFS-mounted filesystems.

Original link: https://www.f2er.com/oracle/205365.html
