Is there a quick way to produce the following output?
Input:
- Code Items
- 123 eq-hk
- 456 ca-eu; tp-lbe
- 789 ca-us
- 321 go-ch
- 654 ca-au; go-au
- 987 go-jp
- 147 co-ml; go-ml
- 258 ca-us
- 369 ca-us; ca-my
- 741 ca-us
- 852 ca-eu
- 963 ca-ml; co-ml; go-ml
Output:
- Code  eq  ca     go  co  tp
- 123   hk
- 456       eu             lbe
- 789       us
- 321              ch
- 654       au     au
- 987              jp
- 147              ml  ml
- 258       us
- 369       us,my
- 741       us
- 852       eu
- 963       ml     ml  ml
Once again I have ended up with loops and some very ugly code to make this work. Is there an elegant way to do this?
Thanks!
Solution
- import pandas as pd
- df = pd.DataFrame([
-     ('123', 'eq-hk'), ('456', 'ca-eu; tp-lbe'), ('789', 'ca-us'), ('321', 'go-ch'),
-     ('654', 'ca-au; go-au'), ('987', 'go-jp'), ('147', 'co-ml; go-ml'), ('258', 'ca-us'),
-     ('369', 'ca-us; ca-my'), ('741', 'ca-us'), ('852', 'ca-eu'), ('963', 'ca-ml; co-ml; go-ml'),
- ], columns=['Code', 'Items'])
- # Get the list of item types from each row, sum (concatenate) the lists,
- # then convert to a set to remove duplicates; sorting keeps the column order stable
- item_types = sorted(set(df['Items'].str.findall(r'(\w+)-').sum()))
- print(item_types)
- # ['ca', 'co', 'eq', 'go', 'tp']
- # Generate a column for each item type; repeated types in a row are joined with commas
- df1 = pd.DataFrame(df['Code'])
- for t in item_types:
-     df1[t] = df['Items'].str.findall(r'%s-(\w+)' % t).apply(','.join)
- print(df1)
- # (column spacing adjusted for readability)
- #    Code  ca     co  eq  go  tp
- #0   123              hk
- #1   456   eu                 lbe
- #2   789   us
- #3   321                  ch
- #4   654   au             au
- #5   987                  jp
- #6   147          ml      ml
- #7   258   us
- #8   369   us,my
- #9   741   us
- #10  852   eu
- #11  963   ml     ml      ml
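If you would rather avoid the per-type loop, one possible alternative is to reshape the data: put each item on its own row, split it into a type and a value, then pivot the types back into columns. The sketch below is only an illustration under a few assumptions: `split_items` is a hypothetical helper name, `Code` values are unique, every item has the form `type-value` made of word characters, and pandas is recent enough to provide `Series.explode` (0.25 or later). The sample rows are taken from the question.

```python
import pandas as pd


def split_items(df):
    """Hypothetical helper: spread 'type-value' items into one column per type."""
    # One row per item: "ca-us; ca-my" becomes two rows sharing the same Code
    items = (
        df.assign(Items=df["Items"].str.split(";"))
        .explode("Items")
        .reset_index(drop=True)
    )
    # Pull the type and the value out of each item (leading spaces are skipped)
    parts = items["Items"].str.extract(r"(?P<type>\w+)-(?P<value>\w+)")
    long = pd.concat([items["Code"], parts], axis=1)
    # Comma-join repeated values of the same type, then pivot types into columns
    wide = (
        long.groupby(["Code", "type"])["value"]
        .agg(",".join)
        .unstack("type", fill_value="")
        .sort_index(axis=1)
        .rename_axis(columns=None)
    )
    # Restore the original row order and turn Code back into a regular column
    return wide.reindex(df["Code"], fill_value="").reset_index()


# A few rows from the question, used as sample input
sample = pd.DataFrame(
    [
        ("123", "eq-hk"),
        ("456", "ca-eu; tp-lbe"),
        ("369", "ca-us; ca-my"),
        ("963", "ca-ml; co-ml; go-ml"),
    ],
    columns=["Code", "Items"],
)
result = split_items(sample)
print(result)
```

For the handful of types in the question the loop version is arguably easier to read; the reshape mainly pays off when there are many item types, since it parses each row once instead of once per type.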