Parallel matches: 23077072 in 242ms
Single Threaded matches: 23077072 in 2314ms
Parallel matches: 23077169 in 2322ms
Single Threaded matches: 23077169 in 2316ms
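If I don't close the program and wait a few minutes (not a few seconds, but a few minutes) before running the pass again, I once more get the results I got when I first launched the program (a 10x improvement in response time).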
unit ParallelTests;

interface

uses
  Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants,
  System.Classes, Vcl.Graphics, System.Threading, System.SyncObjs,
  System.Diagnostics, Vcl.Controls, Vcl.Forms, Vcl.Dialogs, Vcl.StdCtrls;

type
  TForm1 = class(TForm)
    Button1: TButton;
    Memo1: TMemo;
    SingleThreadCheckBox: TCheckBox;
    ParallelCheckBox: TCheckBox;
    UnitsEdit: TEdit;
    Label1: TLabel;
    procedure Button1Click(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.Button1Click(Sender: TObject);
var
  matches: integer;
  i, j: integer;
  sw: TStopWatch;
  maxItems: integer;
  referenceStr: string;
begin
  sw := TStopWatch.Create;

  maxItems := 5000;

  // Build a random 120000-character string of lowercase letters
  Randomize;
  SetLength(referenceStr, 120000);
  for i := 1 to 120000 do
    referenceStr[i] := Chr(Ord('a') + Random(26));

  if ParallelCheckBox.Checked then
  begin
    matches := 0;
    sw.Reset;
    sw.Start;
    TParallel.For(1, MaxItems,
      procedure (Value: Integer)
      var
        index: integer;
        found: integer;
      begin
        // Count matches locally, then publish them with one atomic add
        found := 0;
        for index := 1 to length(referenceStr) do
        begin
          if (((Value mod 26) + ord('a')) = ord(referenceStr[index])) then
          begin
            inc(found);
          end;
        end;
        TInterlocked.Add(matches, found);
      end);
    sw.Stop;
    Memo1.Lines.Add('Parallel matches: ' + IntToStr(matches) + ' in ' +
      IntToStr(sw.ElapsedMilliseconds) + 'ms');
  end;

  if SingleThreadCheckBox.Checked then
  begin
    matches := 0;
    sw.Reset;
    sw.Start;
    for i := 1 to MaxItems do
    begin
      for j := 1 to length(referenceStr) do
      begin
        if (((i mod 26) + ord('a')) = ord(referenceStr[j])) then
        begin
          inc(matches);
        end;
      end;
    end;
    sw.Stop;
    Memo1.Lines.Add('Single Threaded matches: ' + IntToStr(Matches) + ' in ' +
      IntToStr(sw.ElapsedMilliseconds) + 'ms');
  end;
end;

end.
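Note that I cannot reproduce this on an AWS m3.large instance (2 vCPUs according to AWS). In that case I always get a slight improvement, and subsequent TParallel calls do not give worse results.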
Parallel matches: 23077054 in 2057ms
Single Threaded matches: 23077054 in 2900ms
UPDATE: After testing it with various instances of different vcpu counts in AWS, this seems to be the behaviour:
- 36 vcpus (c4.8xlarge). You have to wait minutes between subsequent calls to a vanilla TParallel call (which makes it unusable for production)
- 32 vcpus (c3.8xlarge). You have to wait minutes between subsequent calls to a vanilla TParallel call (which makes it unusable for production)
- 16 vcpus (c3.4xlarge). You have to wait less than a second. It could be usable if load is low but response time is still important
- 8 vcpus (c3.2xlarge). It seems to work normally
- 4 vcpus (c3.xlarge). It seems to work normally
- 2 vcpus (m3.large). It seems to work normally
I built with XE7 Update 1 and OTL r1397. The OTL source I used corresponds to release 3.04. I built with the 32-bit Windows compiler, using release build options.
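My test machine is a dual Intel Xeon E5530 running Windows 7 x64. The system has two quad-core processors. That's 8 processors in total, but the system reports 16 because of hyper-threading. Experience tells me that hyper-threading is just marketing guff, and I have never seen scaling beyond a factor of 8 on this machine.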
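Since the behaviour appears to depend on how many processors the system reports, it can help to print that number on each machine being compared. The following is only a minimal sketch; the program name CpuCountCheck is made up for illustration, and it assumes a Delphi version in which TThread.ProcessorCount is available in System.Classes (as it is in XE7).

```pascal
program CpuCountCheck;

{$APPTYPE CONSOLE}

uses
  System.Classes;

begin
  // Number of logical processors the RTL sees; on a hyper-threaded
  // machine this counts hardware threads, not physical cores.
  Writeln('Logical processors: ', TThread.ProcessorCount);
end.
```

On the dual Xeon E5530 described above this would be expected to report the same 16 processors that Windows shows, even though only 8 physical cores are present.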
program SystemThreadingTest;

{$APPTYPE CONSOLE}

uses
  System.Diagnostics, System.Threading;

const
  maxItems = 5000;
  DataSize = 100000;

procedure DoTest;
var
  matches: integer;
  i, j: integer;
  sw: TStopWatch;
  referenceStr: string;
begin
  // Build a random string of lowercase letters
  Randomize;
  SetLength(referenceStr, DataSize);
  for i := low(referenceStr) to high(referenceStr) do
    referenceStr[i] := Chr(Ord('a') + Random(26));

  // parallel
  matches := 0;
  sw := TStopWatch.StartNew;
  TParallel.For(1, maxItems,
    procedure(Value: integer)
    var
      index: integer;
      found: integer;
    begin
      // Count locally, then add to the shared total atomically
      found := 0;
      for index := low(referenceStr) to high(referenceStr) do
        if (((Value mod 26) + Ord('a')) = Ord(referenceStr[index])) then
          inc(found);
      AtomicIncrement(matches, found);
    end);
  Writeln('Parallel matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');

  // serial
  matches := 0;
  sw := TStopWatch.StartNew;
  for i := 1 to maxItems do
    for j := low(referenceStr) to high(referenceStr) do
      if (((i mod 26) + Ord('a')) = Ord(referenceStr[j])) then
        inc(matches);
  Writeln('Serial matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');
end;

begin
  // Repeat the test indefinitely to observe how timings change over time
  while True do
    DoTest;
end.
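program OTLTest;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows, Winapi.Messages, System.Diagnostics, OtlParallel;

const
  maxItems = 5000;
  DataSize = 100000;

// Pump the calling thread's message queue so that pending window
// messages (including OTL's internal notifications) get dispatched.
procedure ProcessThreadMessages;
var
  msg: TMsg;
begin
  while PeekMessage(Msg, 0, 0, 0, PM_REMOVE) and (Msg.Message <> WM_QUIT) do
  begin
    TranslateMessage(Msg);
    DispatchMessage(Msg);
  end;
end;

procedure DoTest;
var
  matches: integer;
  i, j: integer;
  sw: TStopWatch;
  referenceStr: string;
begin
  // Build a random string of lowercase letters
  Randomize;
  SetLength(referenceStr, DataSize);
  for i := low(referenceStr) to high(referenceStr) do
    referenceStr[i] := Chr(Ord('a') + Random(26));

  // parallel
  matches := 0;
  sw := TStopWatch.StartNew;
  Parallel.For(1, maxItems).Execute(
    procedure(Value: integer)
    var
      index: integer;
      found: integer;
    begin
      // Count locally, then add to the shared total atomically
      found := 0;
      for index := low(referenceStr) to high(referenceStr) do
        if (((Value mod 26) + Ord('a')) = Ord(referenceStr[index])) then
          inc(found);
      AtomicIncrement(matches, found);
    end);
  Writeln('Parallel matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');

  ProcessThreadMessages;

  // serial
  matches := 0;
  sw := TStopWatch.StartNew;
  for i := 1 to maxItems do
    for j := low(referenceStr) to high(referenceStr) do
      if (((i mod 26) + Ord('a')) = Ord(referenceStr[j])) then
        inc(matches);
  Writeln('Serial matches: ', matches, ' in ', sw.ElapsedMilliseconds, 'ms');
end;

begin
  // Repeat the test indefinitely to observe how timings change over time
  while True do
    DoTest;
end.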
Parallel matches: 19230817 in 374ms
Serial matches: 19230817 in 2423ms
Parallel matches: 19230698 in 374ms
Serial matches: 19230698 in 2409ms
Parallel matches: 19230556 in 368ms
Serial matches: 19230556 in 2433ms
Parallel matches: 19230635 in 2412ms
Serial matches: 19230635 in 2430ms
Parallel matches: 19230843 in 2441ms
Serial matches: 19230843 in 2413ms
Parallel matches: 19230905 in 2493ms
Serial matches: 19230905 in 2423ms
Parallel matches: 19231032 in 2430ms
Serial matches: 19231032 in 2443ms
Parallel matches: 19230669 in 2440ms
Serial matches: 19230669 in 2473ms
Parallel matches: 19230811 in 2404ms
Serial matches: 19230811 in 2432ms
....
Parallel matches: 19230667 in 422ms
Serial matches: 19230667 in 2475ms
Parallel matches: 19230663 in 335ms
Serial matches: 19230663 in 2438ms
Parallel matches: 19230889 in 395ms
Serial matches: 19230889 in 2461ms
Parallel matches: 19230874 in 391ms
Serial matches: 19230874 in 2441ms
Parallel matches: 19230617 in 385ms
Serial matches: 19230617 in 2524ms
Parallel matches: 19231021 in 368ms
Serial matches: 19231021 in 2455ms
Parallel matches: 19230904 in 357ms
Serial matches: 19230904 in 2537ms
Parallel matches: 19230568 in 373ms
Serial matches: 19230568 in 2456ms
Parallel matches: 19230758 in 333ms
Serial matches: 19230758 in 2710ms
Parallel matches: 19230580 in 371ms
Serial matches: 19230580 in 2532ms
Parallel matches: 19230534 in 336ms
Serial matches: 19230534 in 2436ms
Parallel matches: 19230879 in 368ms
Serial matches: 19230879 in 2419ms
Parallel matches: 19230651 in 409ms
Serial matches: 19230651 in 2598ms
Parallel matches: 19230461 in 357ms
....
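There are many bug reports concerning the new System.Threading library. All the indications are that its quality is poor. Embarcadero have a long track record of releasing sub-standard library code. I'm thinking of TMonitor, the XE3 string helper, earlier versions of System.IOUtils, FireMonkey. The list goes on.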
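It seems clear that quality is a big problem at Embarcadero. Code is released that quite clearly has not been tested adequately, if at all. This is especially troublesome for a threading library, where bugs can lie dormant and only be exposed in specific hardware/software configurations. The experience with TMonitor leads me to believe that Embarcadero do not have sufficient expertise to produce high-quality, correct threading code.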
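Edit: The original OTL version of the program had a live memory leak, caused by an ugly implementation detail. Parallel.For creates tasks with the .Unobserved modifier. As a result, those tasks are destroyed only when an internal message window receives a "task has terminated" message. That window is created in the same thread as the Parallel.For caller, i.e. in the main thread in this case. Since the main thread was not processing messages, the tasks were never destroyed, and memory consumption (plus other resources) just piled up. It is possible that this is why the program hung after some time.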