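I have an application that reads a large chunk of text data into a scalar, sometimes even gigabytes in size. I use substr on that scalar to read most of the data into another scalar, replacing the extracted data with an empty string because it is no longer needed in the first scalar. I recently found that Perl does not free the first scalar's memory, even though it recognizes that the logical length has changed. So what I have to do is extract the remaining data from the first scalar into a third scalar, undef the first scalar, and put the extracted data back in place. Only then is the memory occupied by the first scalar really freed. Assigning undef to the scalar, or any other value smaller than the allocated block of memory, changes nothing about the allocated memory.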
Here is what I do now:
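    # Move the wanted part into the other buffer and cut it out of the source.
    $$extFileBufferRef = substr($$contentRef, $offset, $length, '');
    # Logical length of what is left in the source.
    $length            = length($$contentRef);
    # Copy the remaining data (offset 0, so the copy is not empty).
    my $content        = substr($$contentRef, 0, $length);
    # undef(...) frees the buffer and returns undef, so $content is assigned back.
    $$contentRef       = undef($$contentRef) || $content;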
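$$contentRef might be, for example, 5 GB in size at the first line; I extract 4.9 GB of data and replace the extracted data with the empty string. The second line now reports, for example, 100 MB as the length of the string, but Devel::Size::total_size, for example, still reports that 5 GB are allocated for that scalar. Assigning undef or the like to $$contentRef doesn't seem to change that, so I need to call undef as a function on that scalar.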
I would have expected the memory behind $$contentRef to be at least partially freed after the substr call. That doesn't seem to be the case...
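So, is the memory of a variable only freed when the variable goes out of scope? And if so, why is assigning undef different from calling undef as a function on the same scalar?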
Solution
Your analysis is correct.
    $ perl -MDevel::Peek -e'
       my $x; $x .= "x" for 1..100;
       Dump($x);
       substr($x, 50, length($x), "");
       Dump($x);
    '
    SV = PV(0x24208e0) at 0x243d550
      ...
      CUR = 100       # length($x) == 100
      LEN = 120       # 120 bytes are allocated for the string buffer.

    SV = PV(0x24208e0) at 0x243d550
      ...
      CUR = 50        # length($x) == 50
      LEN = 120       # 120 bytes are allocated for the string buffer.
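The same check can be scripted instead of read off the Dump output. This is a minimal sketch, assuming the core B module's `CUR` and `LEN` accessors on the object returned by `B::svref_2object`; the helper name `buf_info` and the sizes are made up for illustration and are not part of the original question.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: return (CUR, LEN) of a scalar's string buffer,
# i.e. its logical length and the number of bytes allocated for it.
sub buf_info {
    my ($ref) = @_;
    my $sv = B::svref_2object($ref);
    return ($sv->CUR, $sv->LEN);
}

my $n = 1_000_000;
my $x = "x" x $n;                          # roughly 1 MB of data
my ($cur_before, $len_before) = buf_info(\$x);

substr($x, 100, length($x), "");           # keep only the first 100 bytes
my ($cur_after, $len_after) = buf_info(\$x);

printf "before: CUR=%d LEN=%d\n", $cur_before, $len_before;
printf "after:  CUR=%d LEN=%d\n", $cur_after,  $len_after;
```

The logical length drops to 100 while the allocated size should stay at roughly the original megabyte, which is the gap that Devel::Size reports in the question.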
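Not only does Perl over-allocate strings, it doesn't even free variables that go out of scope. Instead, it reuses them the next time the scope is entered.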
    $ perl -MDevel::Peek -e'
       sub f {
          my ($set) = @_;
          my $x;
          if ($set) { $x = "abc"; $x .= "def"; }
          Dump($x);
       }

       f(1);
       f(0);
    '
    SV = PV(0x3be74b0) at 0x3c04228   # PV: Scalar may contain a string
      REFCNT = 1
      FLAGS = (POK,pPOK)              # POK: Scalar contains a string
      PV = 0x3c0c6a0 "abcdef"\0       # The string buffer
      CUR = 6
      LEN = 10                        # Allocated size of the string buffer

    SV = PV(0x3be74b0) at 0x3c04228   # Could be a different scalar at the same address,
      REFCNT = 1                      #   but it's truly the same scalar
      FLAGS = ()                      # No "OK" flags: undef
      PV = 0x3c0c6a0 "abcdef"\0       # The same string buffer
      CUR = 6
      LEN = 10                        # Allocated size of the string buffer
The logic is that if you needed the memory once, there's a good chance you'll need it again.
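The reuse of a lexical's buffer across calls can also be observed programmatically. The following is only a sketch under assumptions: it relies on the core B module's `LEN` accessor, the variable names are illustrative, and whether the second call still reports a non-zero `LEN` depends on the perl build, so no particular numbers are promised.

```perl
use strict;
use warnings;
use B ();

# Same shape as f() above, but it returns the allocated buffer size
# (LEN) of the lexical $x instead of dumping it.
sub f {
    my ($set) = @_;
    my $x;
    if ($set) { $x = "abc"; $x .= "def"; }   # append so $x owns its buffer
    my $sv = B::svref_2object(\$x);
    # A scalar that never held a string has no LEN accessor at all.
    return $sv->can('LEN') ? $sv->LEN : 0;
}

my $len_set   = f(1);   # $x holds "abcdef"
my $len_unset = f(0);   # $x is undef here, but may still carry the old buffer
print "LEN with value: $len_set, LEN when undef: $len_unset\n";
```

If the second number is non-zero, the undef lexical is still holding the buffer left over from the first call, as in the Dump output above.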
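For the same reason, assigning undef to a scalar doesn't free its string buffer. But Perl gives you a way to free the buffer if you want to: passing a scalar to undef forces the scalar's internal buffers to be freed.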
    $ perl -MDevel::Peek -e'
       my $x = "abc"; $x .= "def";
       Dump($x);
       $x = undef;
       Dump($x);
       undef $x;
       Dump($x);
    '
    SV = PV(0x37d1fb0) at 0x37eec98   # PV: Scalar may contain a string
      REFCNT = 1
      FLAGS = (POK,pPOK)              # POK: Scalar contains a string
      PV = 0x37e8290 "abcdef"\0       # The string buffer
      CUR = 6
      LEN = 10                        # Allocated size of the string buffer

    SV = PV(0x37d1fb0) at 0x37eec98   # PV: Scalar may contain a string
      REFCNT = 1
      FLAGS = ()                      # No "OK" flags: undef
      PV = 0x37e8290 "abcdef"\0       # The string buffer is still allocated
      CUR = 6
      LEN = 10                        # Allocated size of the string buffer

    SV = PV(0x37d1fb0) at 0x37eec98   # PV: Scalar may contain a string
      REFCNT = 1
      FLAGS = ()                      # No "OK" flags: undef
      PV = 0                          # The string buffer has been freed.
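Applied to the situation in the question, the workaround could be wrapped in a small helper. This is a sketch under assumptions rather than a definitive implementation: `release_slack` and `buf_len` are hypothetical names, the sizes are scaled down from gigabytes, and the measurement relies on the core B module's `LEN` accessor.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: allocated size of a scalar's string buffer.
sub buf_len { B::svref_2object($_[0])->LEN }

# Hypothetical helper: give the referenced scalar a right-sized buffer by
# copying out what is left, freeing the old buffer, and copying back.
sub release_slack {
    my ($ref) = @_;
    my $keep = $$ref;   # copy of the remaining data
    undef $$ref;        # function form of undef: frees the oversized buffer
    $$ref = $keep;      # store the data again in a new, smaller buffer
    return;
}

my $n = 1_000_000;
my $content = "x" x $n;
substr($content, 100, length($content), "");   # keep only the first 100 bytes

my $len_slack = buf_len(\$content);   # still sized for the original string
release_slack(\$content);
my $len_tight = buf_len(\$content);   # sized for the 100 bytes that remain

# Function form of undef on a scratch scalar: its buffer is released.
my $scratch = "y" x $n;
undef $scratch;
my $len_undef = buf_len(\$scratch);

print "with slack: $len_slack, after release: $len_tight, after undef: $len_undef\n";
```

The copy in `release_slack` briefly holds the remaining data twice, so with a 100 MB remainder the peak usage rises by about that much before the large buffer is freed.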