Processing the UCI-HAR Dataset
2022-06-11 10:55:00 【bu volcano】
import numpy as np
import pandas as pd
from collections import Counter

DATASET_PATH = "./UCI_data/raw/UCI HAR Dataset/"

# Common filename prefixes of the nine inertial signal files
INPUT_SIGNAL_TYPES = [
    "body_acc_x_",
    "body_acc_y_",
    "body_acc_z_",
    "body_gyro_x_",
    "body_gyro_y_",
    "body_gyro_z_",
    "total_acc_x_",
    "total_acc_y_",
    "total_acc_z_"
]

def load_x(X_signals_paths):
    X_signals = []
    # Outer loop: one file per signal type
    for signal_type_path in X_signals_paths:
        with open(signal_type_path, "r") as f:
            # Inner loop: each row of the file is one window of readings.
            # split() with no argument strips the row and splits on any run
            # of whitespace, so the double spaces in the raw files do not
            # produce empty fields.
            X_signals.append(
                [np.array(row.split(), dtype=np.float32) for row in f]
            )
    # print(np.array(X_signals).shape)
    # With all nine files loaded the shape is (features, samples, time steps),
    # e.g. (9, 7352, 128) for the training set.
    # Axis 0 is the signal index, axis 1 the sample, axis 2 the time step;
    # transposing with (1, 2, 0) gives (samples, time steps, features).
    return np.transpose(X_signals, (1, 2, 0))

def load_y(y_path):
    # Read the labels from disk, dealing with the text file's syntax
    with open(y_path, "r") as f:
        y = np.array(
            [row.split() for row in f],
            dtype=np.int32
        )
    y = y.reshape(-1)
    # Subtract 1 from each class label for friendly 0-based indexing
    # (labels in the raw files start at 1)
    return y - 1

# Paths to the signal data files (nine per split, one for each prefix)
train_x_signals_paths = [
    DATASET_PATH + "train/Inertial Signals/" + signal + "train.txt" for signal in INPUT_SIGNAL_TYPES
]
test_x_signals_paths = [
    DATASET_PATH + "test/Inertial Signals/" + signal + "test.txt" for signal in INPUT_SIGNAL_TYPES
]

# Paths to the label files
train_y_path = DATASET_PATH + "train/y_train.txt"
test_y_path = DATASET_PATH + "test/y_test.txt"

# Load the signal data: each call reads the nine files and returns a 3-D array
train_x = load_x(train_x_signals_paths)
test_x = load_x(test_x_signals_paths)
# print(train_x_signals_paths)
# print("train_x.shape", train_x.shape)
# print("test_x.shape", test_x.shape)

# Load the labels
train_y = load_y(train_y_path)
test_y = load_y(test_y_path)
# One-hot encoding is not used here. To enable it, encode with get_dummies
# and convert the result to an array:
# train_y_matrix = np.asarray(pd.get_dummies(train_y), dtype=np.int8)
# test_y_matrix = np.asarray(pd.get_dummies(test_y), dtype=np.int8)
# print(train_y, Counter(train_y))
# print(test_y, Counter(test_y))
# print(train_y_matrix)

np.save("./UCI_data/np/x_train.npy", train_x)
np.save("./UCI_data/np/y_train.npy", train_y)
np.save("./UCI_data/np/x_test.npy", test_x)
np.save("./UCI_data/np/y_test.npy", test_y)
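
The reshaping and label handling in the script can be checked on a toy example without the dataset on disk. The following is a minimal sketch under stated assumptions: the sizes (3 signals, 4 samples, 5 time steps), the variable names, and the label values are all hypothetical and chosen only for illustration, and `np.eye` is used for one-hot encoding as an alternative to the commented-out `pd.get_dummies` call.

```python
import numpy as np
from collections import Counter

# Hypothetical toy sizes, for illustration only.
n_signals, n_samples, n_steps = 3, 4, 5

# One 2-D block per signal file, shaped (n_samples, n_steps), mirroring what
# load_x collects. Block i is filled with the value i so it can be traced.
per_signal = [
    np.full((n_samples, n_steps), i, dtype=np.float32) for i in range(n_signals)
]

stacked = np.array(per_signal)        # (n_signals, n_samples, n_steps)
x = np.transpose(stacked, (1, 2, 0))  # (n_samples, n_steps, n_signals)

# Labels start at 1 in the raw files; shift to 0-based, then one-hot encode
# by indexing rows of an identity matrix.
raw_labels = np.array([1, 3, 2, 3], dtype=np.int32)  # made-up labels
y = raw_labels - 1
n_classes = 3  # toy value
y_one_hot = np.eye(n_classes, dtype=np.int8)[y]

print(x.shape)              # (4, 5, 3)
print(x[0, 0])              # one value per signal at sample 0, step 0
print(Counter(y.tolist()))  # class counts
print(y_one_hot)
```

After the transpose, the last axis holds one value per signal for a given sample and time step, which is the (samples, time steps, features) layout the script saves to `x_train.npy` and `x_test.npy`.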