我正在研究生成一个充满随机字节的文件(750 MB)。我在一个单独的线程中使用的代码如下所示:
我分配了一个这样大小的缓冲区,因为在磁盘上写入消耗更多的时间:
function Generate(buf:Pointer):DWORD;stdcall; var i:DWORD; begin for i := 0 to keysize -1 do PByte(DWORD(buf) + i)^ := Random(256); Result:=0; end;
问题在于,在整个流程完成之前需要花费很长时间。任何想法更快的方法?如果没有任何选择,我将尝试在汇编中实现它。
解决方法
这听起来像一个很好的练习问题,所以我继续实施并行解决方案。它使用稍微超过3秒钟来生成750 MB文件,并在其工作期间使用超过90%的cpu。 (SSD磁盘也有帮助,需要3,5秒来在RAID0磁盘对上生成文件,4秒钟可以在较慢的512 GB磁盘上生成文件。)
所有重用代码都可以与OpenBSD许可证一起使用(这几乎是“您希望使用”):DSiWin32,GpStuff,GpRandomGen,Otl*。
uses DSiWin32,GpStuff,GpRandomGen,OtlCommon,OtlCollections,OtlParallel; {$R *.dfm} procedure FillBuffer(buf: pointer; bufSize: integer; randomGen: TGpRandom); var buf64: PInt64; buf8 : PByte; i : integer; rnd : int64; begin buf64 := buf; for i := 1 to bufSize div SizeOf(int64) do begin buf64^ := randomGen.Rnd64; Inc(buf64); end; rnd := randomGen.Rnd64; buf8 := PByte(buf64); for i := 1 to bufSize mod SizeOf(int64) do begin buf8^ := rnd AND $FF; rnd := rnd SHR 8; Inc(buf8); end; end; { FillBuffer } procedure CreateRandomFile(fileSize: integer; output: TStream); const CBlockSize = 1 * 1024 * 1024 {1 MB}; var buffer : TOmniValue; lastBufferSize: integer; memStr : TMemoryStream; numBuffers : integer; outQueue : IOmniBlockingCollection; begin outQueue := TOmniBlockingCollection.Create; numBuffers := (fileSize - 1) div CBlockSize + 1; lastBufferSize := (fileSize - 1) mod CBlockSize + 1; Parallel.ForEach(1,numBuffers).NoWait .NumTasks(Environment.Process.Affinity.Count) .OnStop( procedure begin outQueue.CompleteAdding; end) .Initialize( procedure(var taskState: TOmniValue) begin taskState := TGpRandom.Create; end) .Finalize( procedure(const taskState: TOmniValue) begin taskState.AsObject.Free; end) .Execute( procedure(const value: integer; var taskState: TOmniValue) var buffer : TMemoryStream; bytesToWrite: integer; begin if value = numBuffers then bytesToWrite := lastBufferSize else bytesToWrite := CBlockSize; buffer := TMemoryStream.Create; buffer.Size := bytesToWrite; FillBuffer(buffer.Memory,bytesToWrite,taskState.AsObject as TGpRandom); outQueue.Add(buffer); end); for buffer in outQueue do begin memStr := buffer.AsObject as TMemoryStream; output.CopyFrom(memStr,0); FreeAndNil(memStr); end; end; procedure TForm43.btnRandomClick(Sender: TObject); var fileStr: TFileStream; time : int64; begin time := DSiTimeGetTime64; try fileStr := TFileStream.Create('e:\0\random.dat',fmCreate); try CreateRandomFile(750*1024*1024,fileStr); finally FreeAndNil(fileStr); end; finally Caption := Format('Completed in %d ms',[DSiElapsedTime64(time)]); end; end;
编辑:在这种情况下使用ForEach并不是真正优雅的解决方案,所以我用Parallel.ParallelTask和更好的IOmniCounter来增强OmniThreadLibrary。使用SVN版本的993(或更新版本),您可以如下解决这个多生产者单消费者问题。
procedure CreateRandomFile(fileSize: integer; output: TStream); const CBlockSize = 1 * 1024 * 1024 {1 MB}; var buffer : TOmniValue; memStr : TMemoryStream; outQueue : IOmniBlockingCollection; unwritten: IOmniCounter; begin outQueue := TOmniBlockingCollection.Create; unwritten := CreateCounter(fileSize); Parallel.ParallelTask.NoWait .NumTasks(Environment.Process.Affinity.Count) .OnStop(Parallel.CompleteQueue(outQueue)) .Execute( procedure var buffer : TMemoryStream; bytesToWrite: integer; randomGen : TGpRandom; begin randomGen := TGpRandom.Create; try while unwritten.Take(CBlockSize,bytesToWrite) do begin buffer := TMemoryStream.Create; buffer.Size := bytesToWrite; FillBuffer(buffer.Memory,randomGen); outQueue.Add(buffer); end; finally FreeAndNil(randomGen); end; end ); for buffer in outQueue do begin memStr := buffer.AsObject as TMemoryStream; output.CopyFrom(memStr,0); FreeAndNil(memStr); end; end;
EDIT2:关于这个问题的更长的博客文章:Life after 2.1: Parallel data production (Introducing Parallel.Task)