numpy – 在pandas 0.10.1上使用pandas.read_csv指定dtype float32

我试图读一个简单的空间分隔的文件用pandas read_csv方法。然而，熊猫似乎没有服从我的dtype参数。也许我错误地指定它？

我已经把我对read_csv的一些复杂的调用归结为这个简单的测试用例。我实际上在我的“真实”场景中使用转换器的参数，但我删除了为简单。

下面是我的ipython会话：

>>> cat test.out
a b
0.76398 0.81394
0.32136 0.91063
>>> import pandas
>>> import numpy
>>> x = pandas.read_csv('test.out',dtype={'a': numpy.float32},delim_whitespace=True)
>>> x
         a        b
0  0.76398  0.81394
1  0.32136  0.91063
>>> x.a.dtype
dtype('float64')

我也试过这个用numpy.int32或numpy.int64的dtype。这些选择导致异常：

AttributeError: 'NoneType' object has no attribute 'dtype'

我假设AttributeError是因为pandas不会自动尝试转换/截断浮点值为整数？

我在一个32位的机器上运行32位版本的Python。

>>> !uname -a
Linux ubuntu 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:25:36 UTC 2011 i686 i686 i386 GNU/Linux
>>> import platform
>>> platform.architecture()
('32bit','ELF')
>>> pandas.__version__
'0.10.1'

解决方法

0.10.1并不真正支持float32

见http://pandas.pydata.org/pandas-docs/dev/whatsnew.html#dtype-specification

你可以在0.11这样做：

# dont' use dtype converters explicity for the columns you care about
# they will be converted to float64 if possible,or object if they cannot
df = pd.read_csv('test.csv'.....)

#### this is optional and related to the issue you posted ####
# force anything that is not a numeric to nan
# columns are the list of columns that you are interesetd in
df[columns] = df[columns].convert_objects(convert_numeric=True)


    # astype
    df[columns] = df[columns].astype('float32')

see http://pandas.pydata.org/pandas-docs/dev/basics.html#object-conversion

Its not as efficient as doing it directly in read_csv (but that requires

我已经确认用0.11-dev，这个DOES工作(对32位和64位，结果是一样的)

In [5]: x = pd.read_csv(StringIO.StringIO(data),dtype={'a': np.float32},delim_whitespace=True)

In [6]: x
Out[6]: 
         a        b
0  0.76398  0.81394
1  0.32136  0.91063

In [7]: x.dtypes
Out[7]: 
a    float32
b    float64
dtype: object

In [8]: pd.__version__
Out[8]: '0.11.0.dev-385ff82'

In [9]: quit()
vagrant@precise32:~/pandas$ uname -a
Linux precise32 3.2.0-23-generic-pae #36-Ubuntu SMP Tue Apr 10 22:19:09 UTC 2012 i686 i686 i386 GNU/Linux

 some low-level changes)

原文链接：/css/220792.html

numpy – 在pandas 0.10.1上使用pandas.read_csv指定dtype float32

解决方法

猜你在找的CSS相关文章