python - Adding column headers to a pandas DataFrame, but all the data becomes NaN even though the headers are the same dimension

- August 15, 2013

I am trying to add column headers to a CSV file that I have parsed into a DataFrame with pandas.

    import pandas as pd

    dftrades = pd.read_csv('pnl1.txt', delim_whitespace=True, header=None)
    dftrades = dftrades.drop(dftrades.columns[[3,4,6,8,10,11,13,15,17,18,25,27,29,32]], axis=1)     # note: 0 indexed
    dftrades = dftrades.set_index([dftrades.index])
    df = pd.DataFrame(dftrades, columns=['tradedate',
                                         'tradetime',
                                         'cumpnl',
                                         'dailycumpnl',
                                         'realisedpnl',
                                         'unrealisedpnl',
                                         'ccyccy',
                                         'ccyccypnldaily',
                                         'position',
                                         'candleopen',
                                         'candlehigh',
                                         'candlelow',
                                         'candleclose',
                                         'candledir',
                                         'candledirswings',
                                         'tradeamount',
                                         'rate',
                                         'pnl/trade',
                                         'venue',
                                         'ordertype',
                                         'orderid'
                                         'code'])
    print(df)

The structure of the data is:

01/10/2015 05:47.3  190 190 -648 838 eurnok -648 0  0 611   -1137   -648 h 2     -1000000   9.465   -648    internal    ioc 287

What pandas returns is:

      tradedate  tradetime  cumpnl  dailycumpnl  realisedpnl  unrealisedpnl  \
    0       NaN        NaN     NaN          NaN          NaN            NaN
    ...

I would appreciate any advice on this issue.

Thanks

PS. Thanks to Ed for the answer. I have tried the suggestion with

    df = dftrades.columns = ['tradedate',
                             'tradetime',
                             'cumpnl',
                             'dailycumpnl',
                             'realisedpnl',
                             'unrealisedpnl',
                             'ccyccy',
                             'ccyccypnldaily',
                             'position',
                             'candleopen',
                             'candlehigh',
                             'candlelow',
                             'candleclose',
                             'candledir',
                             'candledirswings',
                             'tradeamount',
                             'rate',
                             'pnl/trade',
                             'venue',
                             'ordertype',
                             'orderid'
                             'code']

but the problem has morphed to:

    ValueError: Length mismatch: Expected axis has 22 elements, new values have 21 elements

I have checked the shape of the matrix with dftrades.shape and got:

(12056, 22)

So sadly I still need help :(

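Assign directly to the columns:

    df.columns = ['tradedate',
                  'tradetime',
                  'cumpnl',
                  'dailycumpnl',
                  'realisedpnl',
                  'unrealisedpnl',
                  'ccyccy',
                  'ccyccypnldaily',
                  'position',
                  'candleopen',
                  'candlehigh',
                  'candlelow',
                  'candleclose',
                  'candledir',
                  'candledirswings',
                  'tradeamount',
                  'rate',
                  'pnl/trade',
                  'venue',
                  'ordertype',
                  'orderid',
                  'code']

Note the comma after 'orderid'. The list in the question omits it, so Python joins 'orderid' 'code' into the single string 'orderidcode' and the list holds 21 names rather than 22, which matches the length mismatch error.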
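The length mismatch reported in the question's follow-up is consistent with a missing comma between 'orderid' and 'code' in the header list, because Python concatenates adjacent string literals. Here is a minimal sketch of that effect, using a hypothetical shortened header list and a one-row frame rather than the real pnl1.txt data:

```python
import pandas as pd

# Hypothetical shortened header list with the same slip as in the question:
# there is no comma between the last two names.
names = ['venue',
         'ordertype',
         'orderid'
         'code']

print(len(names))   # three names, not four
print(names[-1])    # the last two literals were joined into one string

# Illustrative one-row frame with four columns
df = pd.DataFrame([[1, 2, 3, 4]])

error = None
try:
    df.columns = names   # four columns, three names
except ValueError as exc:
    error = str(exc)
    print(error)
```

Comparing len(names) against df.shape[1] before assigning is a quick way to catch this kind of slip.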

What you're doing is reindexing, and because the columns don't agree you get all NaNs: since you're passing the df as the data, it will align on the existing column names and index values.

You can see the same semantic behaviour here:

    In [240]: df = pd.DataFrame(data=np.random.randn(5,3), columns=np.arange(3))
              df

    Out[240]:
              0         1         2
    0  1.037216  0.761995  0.153047
    1 -0.602141 -0.114032 -0.323872
    2 -1.188986  0.594895 -0.733236
    3  0.556196  0.363965 -0.893846
    4  0.547791 -0.378287 -1.171706

    In [242]: df1 = pd.DataFrame(df, columns=list('abc'))
              df1

    Out[242]:
        a   b   c
    0 NaN NaN NaN
    1 NaN NaN NaN
    2 NaN NaN NaN
    3 NaN NaN NaN
    4 NaN NaN NaN
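To make the alignment point more concrete, here is a small sketch in which only some of the requested labels exist. The frame and the label 'x' are made up for illustration; the existing labels keep their data and only the unknown label comes back as NaN:

```python
import numpy as np
import pandas as pd

# Small frame with integer column labels 0, 1, 2
df = pd.DataFrame(np.arange(6).reshape(2, 3), columns=[0, 1, 2])

# Passing a DataFrame as the data together with `columns` selects by label:
# 0 and 1 exist and keep their values, 'x' does not exist and is filled with NaN.
partial = pd.DataFrame(df, columns=[0, 1, 'x'])
print(partial)
```

In the question none of the new header names matched the existing integer labels, so every column was treated like 'x' here.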

Alternatively, you can pass the np array as the data:

    df = pd.DataFrame(dftrades.values, columns=['tradedate',

    In [244]: df1 = pd.DataFrame(df.values, columns=list('abc'))
              df1

    Out[244]:
              a         b         c
    0  1.037216  0.761995  0.153047
    1 -0.602141 -0.114032 -0.323872
    2 -1.188986  0.594895 -0.733236
    3  0.556196  0.363965 -0.893846
    4  0.547791 -0.378287 -1.171706
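The two approaches can be compared side by side. This is a rough sketch on a hypothetical two-column frame (the values and the names 'ccyccy' and 'rate' are only borrowed from the question for illustration). One thing worth checking, as far as I can tell, is that .values on a frame that mixes text and numbers yields an object array, so the rebuilt frame may lose its numeric dtypes:

```python
import pandas as pd

# Hypothetical frame with default integer labels, mixing text and numbers
raw = pd.DataFrame([['eurnok', 9.465], ['eurnok', 9.470]])

# Option 1: rename in place on a copy; the column dtypes are untouched
renamed = raw.copy()
renamed.columns = ['ccyccy', 'rate']

# Option 2: rebuild from the underlying array; the old labels are discarded,
# so there is nothing to align on and the data survives
rebuilt = pd.DataFrame(raw.values, columns=['ccyccy', 'rate'])

print(renamed.dtypes)
print(rebuilt.dtypes)
```

Both frames hold the same values, but assigning to .columns is the simpler choice when the data is already in a DataFrame.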
