I have a hash of hashes, %signal_db. A typical element is $signal_db{$cycle}{$key}. There are 10,000 signals and 10,000 keys.
Is there any way to optimize this piece of code for speed?

    foreach my $cycle (sort numerically keys %signal_db) {
        foreach my $key (sort keys %{$signal_db{$cycle}}) {
            print $signal_db{$cycle}{$key}.$key."\n";
        }
    }

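The elements must be printed in the same order as my code prints them.

Solution

Two micro-optimizations: map the outer keys to the inner hashes once instead of dereferencing on every access, and build a buffer instead of calling print for every element. The sorting can be eliminated entirely by using an alternative storage format; I tested two variants of that. Results: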

                   Rate original try3 alternative alternative2
    original     46.1/s       --   -12%        -21%         -32%
    try3         52.6/s      14%     --        -10%         -22%
    alternative  58.6/s      27%    11%          --         -13%
    alternative2 67.5/s      46%    28%         15%           --

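The two micro-optimizations can be shown in isolation. This is only a minimal sketch: the sub name print_buffered and the three-element sample data are made up for illustration, and the filehandle is passed in so the output can be redirected.

```perl
use strict;
use warnings;

# Hypothetical stand-in data; the real %signal_db is far larger.
my %signal_db = (
    10 => { b => 'x', a => 'y' },
    2  => { c => 'z' },
);

# Print value.key lines, cycles in numeric order and keys in string order,
# which is the same order the question's nested loop produces.
sub print_buffered {
    my ($fh, $db) = @_;

    # Micro-optimization 1: fetch each inner hash reference once,
    # instead of evaluating $db->{$cycle}{$key} for every key.
    for my $inner (map { $db->{$_} } sort { $a <=> $b } keys %$db) {

        # Micro-optimization 2: collect one cycle's lines in a buffer
        # and call print once per cycle rather than once per element.
        my $buf = '';
        $buf .= $inner->{$_} . $_ . "\n" for sort keys %$inner;
        print {$fh} $buf;
    }
    return;
}

print_buffered(\*STDOUT, \%signal_db);
```

The sort calls are still there, so this variant only trims the per-element overhead; it does not change the output order.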
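Conclusion:

It is best to use a presorted storage format, but without dropping down to C the gain will probably stay within 100% (on my test data set). The information provided about the data suggests that the keys of the outer hash are almost sequential numbers, which calls for an array.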
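If the data already lives in a hash of hashes, one way to get a presorted array layout is to convert it once up front and then print without any sorting. The following is a rough sketch under that assumption; presort and print_presorted are hypothetical helper names, and this version drops the cycle numbers because only the order matters for printing.

```perl
use strict;
use warnings;

# Pay the sorting cost once: build an array (in cycle order) of flat
# [ key, value, key, value, ... ] lists (in key order).
sub presort {
    my ($db) = @_;
    my @cycles;
    for my $cycle (sort { $a <=> $b } keys %$db) {
        my $inner = $db->{$cycle};
        push @cycles, [ map { ($_, $inner->{$_}) } sort keys %$inner ];
    }
    return \@cycles;
}

# Print value.key lines by walking the presorted structure; no sort needed.
sub print_presorted {
    my ($fh, $cycles) = @_;
    for my $pairs (@$cycles) {
        my $buf = '';
        for (my $i = 0; $i < @$pairs; $i += 2) {
            $buf .= $pairs->[$i + 1] . $pairs->[$i] . "\n";    # value, then key
        }
        print {$fh} $buf;
    }
    return;
}

# Hypothetical stand-in data.
my %signal_db = (10 => { b => 'x', a => 'y' }, 2 => { c => 'z' });
my $cycles = presort(\%signal_db);
print_presorted(\*STDOUT, $cycles);
```

This only pays off if the structure is printed more than once, or if the data can be stored in this form from the start so that the conversion step disappears.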
Script:
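
    #!/usr/bin/env perl

    use strict;
    use warnings;
    use Benchmark qw/timethese cmpthese/;

    # Original layout: hash of hashes, 1000 cycles with keys 'a'..'z'.
    my %signal_db = map { $_ => {} } 1..1000;
    %$_ = map { $_ => $_ } 'a'..'z' foreach values %signal_db;

    # 'alternative' layout: array of { cycle, samples } records, already in cycle order.
    my @signal_db = map { +{ cycle => $_ } } 1..1000;
    $_->{'samples'} = { map { $_ => $_ } 'a'..'z' } foreach @signal_db;

    # 'alternative2' layout: flat list of cycle, [ key, value, ... ] with keys presorted.
    my @signal_db1 = map { $_ => [] } 1..1000;
    @$_ = map { $_ => $_ } 'a'..'z' foreach grep ref $_, @signal_db1;

    # Imported here, but none of the variants below call nsort.
    use Sort::Key qw(nsort);

    sub numerically { $a <=> $b }

    my $result = cmpthese(
        -2,
        {
            # The code from the question, writing to a file.
            'original' => sub {
                open my $out, '>', 'tmp.out' or die "Cannot open tmp.out: $!";
                foreach my $cycle (sort numerically keys %signal_db) {
                    foreach my $key (sort keys %{$signal_db{$cycle}}) {
                        print $out $signal_db{$cycle}{$key}.$key."\n";
                    }
                }
            },
            # Inner hash fetched once per cycle, one print per cycle.
            'try3' => sub {
                open my $out, '>', 'tmp.out' or die "Cannot open tmp.out: $!";
                foreach my $cycle (map $signal_db{$_}, sort numerically keys %signal_db) {
                    my $tmp = '';
                    foreach my $key (sort keys %$cycle) {
                        $tmp .= $cycle->{$key}.$key."\n";
                    }
                    print $out $tmp;
                }
            },
            # Outer sort removed: cycles are stored in an array.
            'alternative' => sub {
                open my $out, '>', 'tmp.out' or die "Cannot open tmp.out: $!";
                foreach my $cycle (map $_->{'samples'}, @signal_db) {
                    my $tmp = '';
                    foreach my $key (sort keys %$cycle) {
                        $tmp .= $cycle->{$key}.$key."\n";
                    }
                    print $out $tmp;
                }
            },
            # Both sorts removed: keys and values are stored as presorted pairs.
            'alternative2' => sub {
                open my $out, '>', 'tmp.out' or die "Cannot open tmp.out: $!";
                foreach my $cycle (grep ref $_, @signal_db1) {
                    my $tmp = '';
                    foreach (my $i = 0; $i < @$cycle; $i += 2) {
                        $tmp .= $cycle->[$i+1].$cycle->[$i]."\n";
                    }
                    print $out $tmp;
                }
            },
        }
    );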