I have a setup where Nginx acts as a reverse proxy in front of a CherryPy application server. I'm using ab to compare performance going through Nginx versus hitting CherryPy directly, and I've noticed that the former case has much worse worst-case performance:
$ ab -n 200 -c 10 'http://localhost/noop'
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Finished 200 requests
Server Software: Nginx
Server Hostname: localhost
Server Port: 80
Document Path: /noop
Document Length: 0 bytes
Concurrency Level: 10
Time taken for tests: 3.145 seconds
Complete requests: 200
Failed requests: 0
Write errors: 0
Total transferred: 29600 bytes
HTML transferred: 0 bytes
Requests per second: 63.60 [#/sec] (mean)
Time per request: 157.243 [ms] (mean)
Time per request: 15.724 [ms] (mean, across all concurrent requests)
Transfer rate: 9.19 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.1 0 1
Processing: 5 48 211.7 31 3007
Waiting: 5 48 211.7 31 3007
Total: 5 48 211.7 31 3007
Percentage of the requests served within a certain time (ms)
50% 31
66% 36
75% 39
80% 41
90% 46
95% 51
98% 77
99% 252
100% 3007 (longest request)
$ ab -n 200 -c 10 'http://localhost:8080/noop'
Benchmarking localhost (be patient)
Completed 100 requests
Completed 200 requests
Finished 200 requests
Server Software: CherryPy/3.2.0
Server Hostname: localhost
Server Port: 8080
Document Path: /noop
Document Length: 0 bytes
Concurrency Level: 10
Time taken for tests: 0.564 seconds
Complete requests: 200
Failed requests: 0
Write errors: 0
Total transferred: 27600 bytes
HTML transferred: 0 bytes
Requests per second: 354.58 [#/sec] (mean)
Time per request: 28.202 [ms] (mean)
Time per request: 2.820 [ms] (mean, across all concurrent requests)
Transfer rate: 47.79 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 1.7 0 11
Processing: 6 26 23.5 24 248
Waiting: 3 25 23.6 23 248
Total: 6 26 23.4 24 248
Percentage of the requests served within a certain time (ms)
50% 24
66% 27
75% 29
80% 31
90% 34
95% 40
98% 51
99% 234
100% 248 (longest request)
What could be causing this? The only thing I can think of is that Nginx is forwarding requests to the backend in a different order than they arrived, but that seems implausible.
The machine is an EC2 c1.medium instance with 2 cores; CherryPy uses a thread pool with 10 threads, and Nginx has worker_connections = 1024.
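For reference, a minimal Nginx proxy configuration for a setup like this might look roughly as follows (a sketch only — the addresses and headers are assumptions, not taken from the actual config):

```nginx
http {
    server {
        listen 80;
        server_name localhost;

        location / {
            # Forward everything to the CherryPy app server on port 8080.
            proxy_pass http://127.0.0.1:8080;
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
        }
    }
}
```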
Update: two perplexing findings:
> At a given concurrency level, sending more requests improves performance. With concurrency 40 and 40 total requests, I see a median time of 3 s and a max of 10.5 s; with concurrency 40 and 200 total requests, a median of 38 ms (!) and a max of 7.5 s. In fact, the total time for the 200 requests is less (6.5 s for the 40 vs. 4.5 s for the 200)! All of this is repeatable.
> Monitoring both Nginx worker processes with strace dramatically improves their performance, e.g. the median time drops from 3 s to 77 ms, without visibly changing their behavior. (I tested with a nontrivial API call and confirmed that strace doesn't change the responses, and that all of these performance observations still hold.) This is also repeatable.
Occasional catastrophic delays like these typically have causes along these lines:
> A listen queue on the backend that is too small, causing occasional listen queue overflows (Linux is usually configured to drop SYN packets in this case, making it indistinguishable from packet loss; see `netstat -s | grep listen` to find out whether this is the problem).
> A stateful firewall on localhost approaching its limit on the number of states, and dropping some random SYN packets as a result.
> The system running out of sockets/local ports because of sockets in the TIME_WAIT state; see this question if you're using Linux.
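To check the listen-queue hypothesis programmatically, here is a small sketch (assumes Linux: it parses the same `TcpExt` counters that `netstat -s` reports, straight from `/proc/net/netstat`):

```python
def tcp_ext_counters(path="/proc/net/netstat"):
    """Return TcpExt counters (incl. ListenOverflows/ListenDrops) as a dict."""
    counters = {}
    with open(path) as f:
        lines = f.read().splitlines()
    # The file comes in header/value line pairs:
    #   "TcpExt: Name1 Name2 ..." followed by "TcpExt: v1 v2 ..."
    for header, values in zip(lines[::2], lines[1::2]):
        proto, names = header.split(":", 1)
        _, vals = values.split(":", 1)
        if proto == "TcpExt":
            counters.update(zip(names.split(), map(int, vals.split())))
    return counters

if __name__ == "__main__":
    c = tcp_ext_counters()
    # Nonzero, growing values here mean the backend's accept queue is overflowing.
    print("ListenOverflows:", c.get("ListenOverflows", 0))
    print("ListenDrops:", c.get("ListenDrops", 0))
```

Run it before and after a benchmark; if the counters grow during the run, the backend's listen queue is the culprit.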
You have to examine your OS carefully to figure out the cause and configure it accordingly. You may also want to follow a network-subsystem tuning guide for your OS. Note that EC2 may be somewhat special here, as there have been reports of very limited network performance on EC2 instances.
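As an illustration, on Linux the relevant knobs live in sysctl; the kind of settings such tuning guides discuss looks like this (the values are purely illustrative, not recommendations):

```
# /etc/sysctl.conf (illustrative values only)
net.core.somaxconn = 1024                    # cap on listen() backlog
net.ipv4.tcp_max_syn_backlog = 4096          # SYN queue size
net.ipv4.ip_local_port_range = 10240 65535   # more ephemeral ports
net.netfilter.nf_conntrack_max = 131072      # stateful-firewall state table
```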
From Nginx's point of view, any solution will be more or less a workaround (since the problem isn't in Nginx, but in an OS that can't cope with the load and is dropping packets). Nevertheless, you can try some tricks to reduce the load on the OS network subsystem:
> Configure keepalive connections to the backend.
> Configure the backend to listen on a unix domain socket (if your backend supports it), and configure Nginx to proxy requests to it.
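Both tricks together might look roughly like this in Nginx (a sketch: the socket path and connection count are assumptions, and the `keepalive` directive in an `upstream` block requires Nginx 1.1.4 or later):

```nginx
upstream cherrypy {
    # Backend listening on a unix domain socket instead of TCP.
    server unix:/tmp/cherrypy.sock;
    # Keep up to 16 idle connections to the backend open.
    keepalive 16;
}

server {
    listen 80;
    location / {
        proxy_pass http://cherrypy;
        # Both are required for keepalive to the upstream to work:
        proxy_http_version 1.1;
        proxy_set_header Connection "";
    }
}
```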