Dive into Deep Learning (PyTorch version) exercise answers - 2.2 Preliminaries / Data Preprocessing
2022-07-03 10:20:00 【Innocent^_^】
I completed these exercises in Jupyter Notebook. They took a little longer than expected because much of it was new to me: checking whether a value is a number, checking whether a value is missing, and deleting specified columns. I hope this is a useful reference for others who are new to the book. The overall output comes first, followed by the code.

Here is the code:
import os

# Create the practice directory and write the sample CSV file
os.makedirs(os.path.join('..', 'practice'), exist_ok=True)
practice_file = os.path.join('..', 'practice', 'student_scores.csv')
with open(practice_file, 'w') as f:
    f.write('stu_num,stu_name,stu_course,stu_score\n')
    f.write('1,Lily,English,100\n')
    f.write('1,Lily,Physics,80\n')
    f.write('2,NA,Computer Science,90\n')
    f.write('2,NA,Database,88\n')
    f.write('NA,John,Math,99\n')
    f.write('4,Lisa,NA,100\n')
    f.write('5,NA,French,50\n')
    f.write('6,GOGO,NA,10\n')
# Delete the column with the most missing values
import pandas as pd

practice_data = pd.read_csv(practice_file)
print(practice_data)
# Get the number of rows and the number of columns
row_len, column_len = practice_data.shape
# Counter for the number of NaN values in each column
nan_sum = [0 for i in range(column_len)]
print("Number of columns: {}, number of rows: {}, NaN counts per column: {}".format(column_len, row_len, nan_sum))
import math
import numbers

for i in range(column_len):
    for j in range(row_len):
        # A missing value can show up as a numeric NaN or as a string,
        # so the two cases are checked separately.
        # Note: iloc[j, i] is row j+1, column i+1.
        value = practice_data.iloc[j, i]
        if isinstance(value, numbers.Number):
            if math.isnan(value):
                nan_sum[i] += 1
        else:
            # Defensive check only: with the default read_csv settings,
            # 'NA' has already been loaded as a float NaN.
            if value == 'NaN':
                nan_sum[i] += 1
# Use a list because several columns may tie for the most NaN values
max_index = []
most_nan = max(nan_sum)
for i, j in enumerate(nan_sum):
    if j == most_nan:
        max_index.append(practice_data.columns.values[i])
print("Column(s) with the most missing values: {}".format(max_index))
print("Before deleting the column(s) with the most missing values:")
print(practice_data)
# Delete the column(s) with the most missing values.
# drop() takes column names and returns a new DataFrame, so reassign the result.
practice_data = practice_data.drop(max_index, axis=1)
print("After deleting the column(s) with the most missing values:")
print(practice_data)
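# Convert the preprocessed data set to a tensor
practice_inputs = practice_data.iloc[:, :]
# One-hot encode the text column; dummy_na=True adds a separate column for missing values
practice_inputs = pd.get_dummies(practice_inputs, dummy_na=True)
print(practice_inputs)
import torch

# Newer pandas versions return boolean dummy columns, so cast to float
# before building the tensor.
Z = torch.tensor(practice_inputs.to_numpy(dtype=float))
print(Z)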
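
For comparison, the counting loop above can probably be replaced by pandas' built-in `isna()`. The sketch below is not the approach used in the notebook. It assumes the same eight rows, loaded from an in-memory string instead of the CSV file so that it runs on its own, and the names `nan_counts`, `most_missing`, `reduced`, `encoded` and `values` are illustrative only.

```python
import io

import pandas as pd

# Same rows as the CSV written above, kept in memory so the sketch is self-contained.
csv_text = """stu_num,stu_name,stu_course,stu_score
1,Lily,English,100
1,Lily,Physics,80
2,NA,Computer Science,90
2,NA,Database,88
NA,John,Math,99
4,Lisa,NA,100
5,NA,French,50
6,GOGO,NA,10
"""
data = pd.read_csv(io.StringIO(csv_text))

# isna() marks every missing cell; sum() then counts them per column.
nan_counts = data.isna().sum()
print(nan_counts)

# Keep every column tied for the maximum, as the loop version does.
most_missing = nan_counts[nan_counts == nan_counts.max()].index.tolist()
print(most_missing)

# Drop those columns by name.
reduced = data.drop(columns=most_missing)

# One-hot encode the remaining text column, then force a float array
# so the result can be handed to torch.tensor(...) regardless of pandas version.
encoded = pd.get_dummies(reduced, dummy_na=True)
values = encoded.to_numpy(dtype=float)
print(values.shape)
```

The final array still contains a NaN in the `stu_num` column, because only the column with the most missing values is removed; filling or dropping the remaining gaps is a separate step.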