Introduction
This article explains how to create a DataFrame, the two-dimensional tabular data structure in pandas.
Walkthrough
Importing the modules
import pandas as pd
import numpy as np
pd.options.display.notebook_repr_html = False
Creating a DataFrame from Series
Generating the data
# Generate the data
data_1 = [30, 40, 50]
data_2 = [3, 2, 1]
index = ['BT', 'GT', 'PT']
Creating the Series
# Create Series 1
Ser1 = pd.Series(data_1, index=index)
Ser1
'''
BT    30
GT    40
PT    50
dtype: int64
'''
# Create Series 2
Ser2 = pd.Series(data_2, index=index)
Ser2
'''
BT    3
GT    2
PT    1
dtype: int64
'''
This creates two Series that share the same index.
Creating the DataFrame
# Create a DataFrame from Series 1 and 2
data = pd.DataFrame({'value': Ser1,
                     'rank': Ser2})
data
'''
    value  rank
BT     30     3
GT     40     2
PT     50     1
'''
Passing pd.DataFrame a dictionary that maps column names to Series creates a DataFrame.
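As a supplementary sketch, the example below shows what happens when the Series do not share every index label. The data and the variable names `value`, `rank`, and `partial` are made up for illustration. pandas aligns rows by index label, and a label missing from one Series leaves NaN in that column.

```python
import pandas as pd

# Hypothetical data: the two Series only partly share index labels.
value = pd.Series([30, 40, 50], index=['BT', 'GT', 'PT'])
rank = pd.Series([2, 1], index=['GT', 'PT'])

# Rows are aligned by index label; 'BT' is absent from `rank`,
# so that cell becomes NaN.
partial = pd.DataFrame({'value': value, 'rank': rank})
print(partial)
```

Because NaN is a floating-point value, the `rank` column is stored as float rather than int in this case.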
Getting the index and column names
data.index
#Index(['BT', 'GT', 'PT'], dtype='object')
data.columns
#Index(['value', 'rank'], dtype='object')
.index returns the index, and .columns returns the column names.
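Both attributes return pandas Index objects. As a small sketch (the variable names `row_labels` and `column_labels` are illustrative only), they can be converted to plain Python lists, and `.shape` gives the number of rows and columns.

```python
import pandas as pd

# Rebuild the same table as above so this snippet runs on its own.
data = pd.DataFrame({'value': [30, 40, 50], 'rank': [3, 2, 1]},
                    index=['BT', 'GT', 'PT'])

# Index objects convert to plain Python lists with list() or .tolist().
row_labels = list(data.index)
column_labels = data.columns.tolist()

# shape is (number of rows, number of columns).
n_rows, n_cols = data.shape
print(row_labels, column_labels, n_rows, n_cols)
```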
Creating a DataFrame from a NumPy array
# DataFrame from a NumPy array
data_array = np.random.rand(3, 3)
data_array
'''
array([[0.51504088, 0.47938992, 0.29317853],
       [0.74293612, 0.68519342, 0.28975474],
       [0.71271923, 0.49397093, 0.01258567]])
'''
pd.DataFrame(data_array)
'''
          0         1         2
0  0.515041  0.479390  0.293179
1  0.742936  0.685193  0.289755
2  0.712719  0.493971  0.012586
'''
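Passing a NumPy array to pd.DataFrame produces a DataFrame. When no labels are given, both the index and the column names default to integers starting at 0.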
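To make the default labels easy to check, here is a minimal sketch that uses a fixed array instead of random values. The names `fixed_array` and `frame` are hypothetical.

```python
import numpy as np
import pandas as pd

# Fixed values instead of np.random.rand so the result is reproducible.
fixed_array = np.arange(9).reshape(3, 3)
frame = pd.DataFrame(fixed_array)

# With no labels given, rows and columns are both numbered from 0.
print(list(frame.index))
print(list(frame.columns))
```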
pd.DataFrame(np.random.rand(3, 3), columns=index)
'''
         BT        GT        PT
0  0.934911  0.627968  0.056176
1  0.038749  0.403837  0.347080
2  0.938361  0.253888  0.073236
'''
pd.DataFrame(np.random.rand(3, 3), index=index)
'''
           0         1         2
BT  0.622366  0.235205  0.721277
GT  0.313697  0.027025  0.365088
PT  0.539653  0.253789  0.388588
'''
Specifying columns and index replaces the default integer labels with column names and index labels of your choice.
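The two arguments can also be passed together. The sketch below is only an illustration: the column names 'x', 'y', 'z' and the seeded generator are assumptions added here, not part of the original example.

```python
import numpy as np
import pandas as pd

labels = ['BT', 'GT', 'PT']      # same row labels as `index` above
rng = np.random.default_rng(0)   # seeded generator, assumed here for repeatable values

labeled = pd.DataFrame(rng.random((3, 3)),
                       index=labels,
                       columns=['x', 'y', 'z'])  # hypothetical column names

# A single cell can then be looked up by its row and column labels.
print(labeled.loc['GT', 'y'])
```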
Creating a DataFrame from dictionaries
# Create a DataFrame from dictionaries
a = {'a': 1., 'b': 2., 'c': 3.}
b = {'b': 4., 'c': 5., 'a': 6.}
pd.DataFrame([a, b], index=['A', 'B'])
'''
     a    b    c
A  1.0  2.0  3.0
B  6.0  4.0  5.0
'''
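Even if the keys appear in a different order in each dictionary, the values are aligned by key rather than by position, so each value ends up in the matching column.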
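Key-based alignment also covers dictionaries that do not all have the same keys. In the sketch below (the records `first` and `second` are hypothetical), the second dictionary has no 'c' key, so that cell is filled with NaN.

```python
import pandas as pd

# Hypothetical records: the second one has no 'c' key.
first = {'a': 1., 'b': 2., 'c': 3.}
second = {'b': 4., 'a': 6.}

records = pd.DataFrame([first, second], index=['A', 'B'])

# Values are matched to columns by key; a missing key leaves NaN in that cell.
print(records)
```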