Dive into Deep Learning (PyTorch version) exercise answers - 2.2 Preliminaries / Data Preprocessing
2022-07-03 10:20:00 【Innocent^_^】
I completed these exercises in Jupyter Notebook. They took me a little longer because I am new to all this: checking whether a value is a number, checking whether a value is missing, and deleting specified columns. I am posting my answers as a reference for others who are just starting the book.
Here is the code:
import os
os.makedirs(os.path.join('..','practice'),exist_ok=True)
practice_file=os.path.join('..','practice','student_scores.csv')
with open(practice_file,'w') as f:
    f.write('stu_num,stu_name,stu_course,stu_score\n')
    f.write('1,Lily,English,100\n')
    f.write('1,Lily,Physics,80\n')
    f.write('2,NA,Computer Science,90\n')
    f.write('2,NA,Database,88\n')
    f.write('NA,John,Math,99\n')
    f.write('4,Lisa,NA,100\n')
    f.write('5,NA,French,50\n')
    f.write('6,GOGO,NA,10\n')
# Delete the column with the most missing values
import pandas as pd
practice_data=pd.read_csv(practice_file)
print(practice_data)
# Get the number of rows and the number of columns
row_len,column_len=practice_data.shape
# One NaN counter per column, all starting at zero
nan_sum=[0 for i in range(column_len)]
print("Number of columns: {}, number of rows: {}, NaN counters: {}".format(column_len,row_len,nan_sum))
import math
import numbers
for i in range(column_len):
    for j in range(row_len):
        # read_csv turns every 'NA' field into a float NaN, even in text columns,
        # so the numeric branch catches the missing values;
        # the string comparison is only a fallback for a literal 'NaN' string
        # Note that this cell is row j+1, column i+1
        value = practice_data.iloc[j,i]
        if isinstance(value,numbers.Number):
            if math.isnan(value):
                nan_sum[i] += 1
        elif value == 'NaN':
            nan_sum[i] += 1
max_index=[]  # A list, because several columns may tie for the most NaN values
most_nan=max(nan_sum)
for i,j in enumerate(nan_sum):
    if j == most_nan:
        max_index.append(practice_data.columns.values[i])
print("Column(s) with the most missing values: {}".format(max_index))
print("Before deleting the column(s) with the most missing values:")
print(practice_data)
# Delete the column(s) with the most missing values
# drop takes the column names and returns a new DataFrame, so assign the result back
practice_data = practice_data.drop(max_index,axis=1)
print("After deleting the column(s) with the most missing values:")
print(practice_data)
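The nested loop above can also be written with pandas' built-in missing-value helpers. The following is a minimal sketch, not part of my original notebook: it reads the same rows from an in-memory string (an assumption made so the example runs on its own) and introduces the names `data`, `nan_counts`, `most_missing` and `reduced` purely for illustration.

```python
import io

import pandas as pd

# Same rows as student_scores.csv above, read from memory instead of disk
csv_text = """stu_num,stu_name,stu_course,stu_score
1,Lily,English,100
1,Lily,Physics,80
2,NA,Computer Science,90
2,NA,Database,88
NA,John,Math,99
4,Lisa,NA,100
5,NA,French,50
6,GOGO,NA,10
"""
data = pd.read_csv(io.StringIO(csv_text))

# isna() marks every missing cell; sum() then counts them column by column
nan_counts = data.isna().sum()

# Keep every column tied for the largest count, as the loop version does
most_missing = nan_counts[nan_counts == nan_counts.max()].index.tolist()

# drop returns a new DataFrame without those columns
reduced = data.drop(columns=most_missing)

print(nan_counts)
print(most_missing)
print(reduced)
```

Because `isna()` treats numeric and text columns alike, no separate type check is needed here.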
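# Convert the preprocessed dataset to a tensor
practice_inputs = practice_data.iloc[:,:]
# dtype=float keeps the one-hot columns numeric (newer pandas returns booleans by default)
practice_inputs = pd.get_dummies(practice_inputs,dummy_na=True,dtype=float)
print(practice_inputs)
import torch
# to_numpy(dtype=float) guarantees a float array rather than an object array
Z = torch.tensor(practice_inputs.to_numpy(dtype=float))
print(Z)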
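One thing to check before building the tensor is the dtype of the encoded table. As far as I know, recent pandas versions return boolean one-hot columns from `get_dummies` by default, and mixing those with numeric columns can produce an object array that `torch.tensor` refuses. The sketch below uses only pandas and NumPy so it runs without PyTorch; the in-memory CSV and the names `encoded` and `array` are assumptions for illustration.

```python
import io

import numpy as np
import pandas as pd

# The dataset after stu_name has been dropped (same values as above)
csv_text = """stu_num,stu_course,stu_score
1,English,100
1,Physics,80
2,Computer Science,90
2,Database,88
NA,Math,99
4,NA,100
5,French,50
6,NA,10
"""
data = pd.read_csv(io.StringIO(csv_text))

# One-hot encode the text column; dummy_na=True adds a stu_course_nan column
encoded = pd.get_dummies(data, dummy_na=True, dtype=float)

# Force a plain float array so it can be handed to torch.tensor
array = encoded.to_numpy(dtype=float)

print(encoded.columns.tolist())
print(array.dtype, array.shape)
# stu_num still holds one missing value, so one NaN remains in the array
print(int(np.isnan(array).sum()))
```

Passing `array` to `torch.tensor` then gives a float64 tensor. The remaining NaN comes from the missing `stu_num`; the exercise only asks to drop the column with the most missing values, so it is left as is.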