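Let's say I roll a six-sided die 60 times and I get 16, 5, 9, 7, 6, and 17 appearances of the numbers 1 through 6, respectively. The numbers 1 and 6 are showing up too often, and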
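there's only about a 1.8% chance of that being random. If I use
Statistics::ChiSquare, it prints out:

    There's a >1% chance, and a <5% chance, that this data is random.

So not only is it a bad interface (I can't get those numbers back directly), but the rounding error is significant.
Worse, what if I'm rolling two six-sided dice? The odds of getting any particular sum are: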

    Sum   Frequency   Relative Frequency
      2       1             1/36
      3       2             2/36
      4       3             3/36
      5       4             4/36
      6       5             5/36
      7       6             6/36
      8       5             5/36
      9       4             4/36
     10       3             3/36
     11       2             2/36
     12       1             1/36

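Statistics::ChiSquare used to have a chisquare_nonuniform() function, but it was removed.
So the numbers are rounded too coarsely, and I can't use the module for a non-uniform distribution. Given a list of actual frequencies and a list of expected frequencies, what's the best way to calculate a chi-square test in Perl? The various modules I've found on CPAN haven't helped me, so I'm guessing I've missed something obvious.

Solution

Implementing this yourself is so simple that I don't want to upload yet another statistics module just for this.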
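For reference, the quantity such an implementation has to compute is Pearson's chi-squared statistic. The symbols here are chosen for illustration only: O_i is the observed count and E_i the expected count in category i, over k categories.

```latex
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad \text{degrees of freedom} = k - 1
```

The reported probability is the upper-tail probability of a chi-squared distribution with k - 1 degrees of freedom, evaluated at that statistic.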

    use v5.10;    # enables say and the // operator
    use strict;
    use warnings;
    use Carp qw< croak >;
    use List::Util qw< sum >;
    use Statistics::Distributions qw< chisqrprob >;

    sub chi_squared_test {
        my %args = @_;
        my $observed = delete $args{observed} // croak q(Argument "observed" required);
        my $expected = delete $args{expected} // croak q(Argument "expected" required);
        @$observed == @$expected or croak q(Input arrays must have same length);

        # Pearson's statistic: sum of (observed - expected)^2 / expected
        my $chi_squared = sum map {
            ($observed->[$_] - $expected->[$_])**2 / $expected->[$_];
        } 0 .. $#$observed;

        my $degrees_of_freedom = @$observed - 1;
        my $probability = chisqrprob($degrees_of_freedom, $chi_squared);
        return $probability;
    }

    say chi_squared_test
        observed => [16, 5, 9, 7, 6, 17],
        expected => [(10) x 6];

Output: 0.018360
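For the two-dice case from the question, the expected counts are no longer uniform, but they follow directly from the frequency table: each sum's expected count is the number of rolls times its frequency over 36. The following is a minimal sketch of that step using only core modules. The observed counts are made-up data for illustration, and the variable names are not from any module.

```perl
use strict;
use warnings;
use List::Util qw< sum >;

# Number of ways to roll each sum (2..12) with two six-sided dice,
# taken from the frequency table in the question.
my @ways = (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1);

# Hypothetical observed counts for sums 2..12 (made-up data, 360 rolls).
my @observed = (8, 22, 27, 43, 51, 64, 45, 38, 33, 19, 10);

# Expected count per sum: total rolls times its probability (ways / 36).
my $rolls    = sum @observed;
my @expected = map { $rolls * $_ / 36 } @ways;

# Pearson's chi-squared statistic over all eleven possible sums.
my $chi_squared = sum map {
    ($observed[$_] - $expected[$_])**2 / $expected[$_]
} 0 .. $#observed;

my $degrees_of_freedom = @observed - 1;

printf "chi-squared = %.4f, df = %d\n", $chi_squared, $degrees_of_freedom;
```

To get a p-value rather than the raw statistic, the same two arrays can be passed by reference to chi_squared_test as observed => \@observed, expected => \@expected.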