01. Loading and Inspecting Data

1. Loading data

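 - df = pd.read_<format>(source, sep, encoding), for example pd.read_csv for CSV files

 - sep argument : the delimiter between fields; sep='\t' splits on tabs

 - encoding : the text encoding of the file, e.g. "euc-kr" for Korean-encoded files or "utf-8"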

import pandas as pd

DataUrl = 'https://raw.githubusercontent.com/Datamanim/pandas/main/lol.csv'
df = pd.read_csv(DataUrl, sep='\t')
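
To see how sep and encoding work together without depending on the network, here is a minimal sketch that reads a small tab-separated sample from memory. The sample text and its column names are made up for illustration and are not part of lol.csv.

```python
import io

import pandas as pd

# Hypothetical tab-separated sample, stored as EUC-KR encoded bytes
raw = "이름\t점수\n철수\t90\n영희\t85\n".encode("euc-kr")

# sep="\t" splits each line on tabs; encoding tells pandas how to decode the bytes
sample = pd.read_csv(io.BytesIO(raw), sep="\t", encoding="euc-kr")

print(sample.shape)
print(list(sample.columns))
```

If the encoding does not match the file, pandas will typically raise a decode error or produce garbled text, so it is worth checking the column names right after loading.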

 

2. Printing the first and last rows

- df.head() : returns the first rows; the default is 5

- df.tail() : returns the last rows; the default is 5
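
Both methods accept a row count if 5 is not what you want. A small sketch, using a hypothetical 10-row frame in place of the real dataset:

```python
import pandas as pd

# Hypothetical 10-row frame standing in for the real data
demo = pd.DataFrame({"gameId": range(10)})

first_three = demo.head(3)  # first 3 rows instead of the default 5
last_two = demo.tail(2)     # last 2 rows instead of the default 5

print(first_three)
print(last_two)
```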

 

3. Understanding the data structure

- df.index : shows the index (row labels) of the DataFrame

df.index
RangeIndex(start=0, stop=51490, step=1)

- df.shape : the number of rows and columns, as a (rows, columns) tuple

df.shape
(51490, 61)
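
Because shape is a plain tuple, it can be unpacked directly. A short sketch on a made-up 3-row frame (the column names a and b are placeholders):

```python
import pandas as pd

# Hypothetical small frame; the default index is a RangeIndex starting at 0
demo = pd.DataFrame({"a": [1, 2, 3], "b": [4, 5, 6]})

n_rows, n_cols = demo.shape  # unpack (rows, columns)

print(demo.index)
print(n_rows, n_cols)
```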

- df.info() : shows each column's non-null count and dtype, which makes it useful for spotting missing values

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 51490 entries, 0 to 51489
Data columns (total 61 columns):
 #   Column              Non-Null Count  Dtype
---  ------              --------------  -----
 0   gameId              51490 non-null  int64
 1   creationTime        51490 non-null  int64
 2   gameDuration        51490 non-null  int64
 3   seasonId            51490 non-null  int64
 4   winner              51490 non-null  int64
 5   firstBlood          51490 non-null  int64
 6   firstTower          51490 non-null  int64
 7   firstInhibitor      51490 non-null  int64
 8   firstBaron          51490 non-null  int64
 9   firstDragon         51490 non-null  int64
 10  firstRiftHerald     51490 non-null  int64
 11  t1_champ1id         51490 non-null  int64
 12  t1_champ1_sum1      51490 non-null  int64
 13  t1_champ1_sum2      51490 non-null  int64
 14  t1_champ2id         51490 non-null  int64
 15  t1_champ2_sum1      51490 non-null  int64
 16  t1_champ2_sum2      51490 non-null  int64
 17  t1_champ3id         51490 non-null  int64
 18  t1_champ3_sum1      51490 non-null  int64
 19  t1_champ3_sum2      51490 non-null  int64
 20  t1_champ4id         51490 non-null  int64
 21  t1_champ4_sum1      51490 non-null  int64
 22  t1_champ4_sum2      51490 non-null  int64
 23  t1_champ5id         51490 non-null  int64
 24  t1_champ5_sum1      51490 non-null  int64
 25  t1_champ5_sum2      51490 non-null  int64
 26  t1_towerKills       51490 non-null  int64
 27  t1_inhibitorKills   51490 non-null  int64
 28  t1_baronKills       51490 non-null  int64
 29  t1_dragonKills      51490 non-null  int64
 30  t1_riftHeraldKills  51490 non-null  int64
 31  t1_ban1             51490 non-null  int64
 32  t1_ban2             51490 non-null  int64
 33  t1_ban3             51490 non-null  int64
 34  t1_ban4             51490 non-null  int64
 35  t1_ban5             51490 non-null  int64
 36  t2_champ1id         51490 non-null  int64
 37  t2_champ1_sum1      51490 non-null  int64
 38  t2_champ1_sum2      51490 non-null  int64
 39  t2_champ2id         51490 non-null  int64
 40  t2_champ2_sum1      51490 non-null  int64
 41  t2_champ2_sum2      51490 non-null  int64
 42  t2_champ3id         51490 non-null  int64
 43  t2_champ3_sum1      51490 non-null  int64
 44  t2_champ3_sum2      51490 non-null  int64
 45  t2_champ4id         51490 non-null  int64
 46  t2_champ4_sum1      51490 non-null  int64
 47  t2_champ4_sum2      51490 non-null  int64
 48  t2_champ5id         51490 non-null  int64
 49  t2_champ5_sum1      51490 non-null  int64
 50  t2_champ5_sum2      51490 non-null  int64
 51  t2_towerKills       51490 non-null  int64
 52  t2_inhibitorKills   51490 non-null  int64
 53  t2_baronKills       51490 non-null  int64
 54  t2_dragonKills      51490 non-null  int64
 55  t2_riftHeraldKills  51490 non-null  int64
 56  t2_ban1             51490 non-null  int64
 57  t2_ban2             51490 non-null  int64
 58  t2_ban3             51490 non-null  int64
 59  t2_ban4             51490 non-null  int64
 60  t2_ban5             51490 non-null  int64
dtypes: int64(61)
memory usage: 24.0 MB
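
In the output above every column reports 51490 non-null, matching the number of entries, so nothing is missing. To show what a gap looks like, here is a sketch on a hypothetical frame with one missing value; it writes the summary to a buffer through the buf argument so the text can be inspected.

```python
import io

import pandas as pd

# Hypothetical frame: column x has one missing value, column y has none
demo = pd.DataFrame({"x": [1.0, None, 3.0], "y": [1, 2, 3]})

buf = io.StringIO()
demo.info(buf=buf)  # write the summary to the buffer instead of stdout
summary = buf.getvalue()

print(summary)
```

A column whose non-null count is lower than the number of entries contains missing values.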

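* Checking for missing values

- df.isnull().sum() : counts the missing values in each column; isnull() marks each cell True or False, and summing treats True as 1 and False as 0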

df.isnull().sum()
gameId          0
creationTime    0
gameDuration    0
seasonId        0
winner          0
               ..
t2_ban1         0
t2_ban2         0
t2_ban3         0
t2_ban4         0
t2_ban5         0
Length: 61, dtype: int64
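
Since this dataset has no missing values, every count above is 0. The sketch below uses a made-up frame with gaps to show non-zero counts; the column names x and y are placeholders.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with missing values in both columns
demo = pd.DataFrame({"x": [1.0, np.nan, 3.0], "y": ["a", None, None]})

missing_per_column = demo.isnull().sum()         # one count per column
total_missing = int(missing_per_column.sum())    # grand total across columns

print(missing_per_column)
print(total_missing)
```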
