[New book recommendation] Cleaning Data for Effective Data Science
2022-06-30 09:42:00 【Librarian】
Hello everyone. This account shares the latest technical books and resources from around the world for programmers who want to improve themselves. Today's pick is a big data book published by Packt in March 2021. The languages it covers are Python and R.
Cleaning Data for Effective Data Science

Author: David Mertz
Publisher: Packt
Publication date: 2021-03-31
ISBN:9781801071291
Book Introduction
It is something of a truism in data science, data analysis, and machine learning that most of the effort needed to reach a practical goal goes into cleaning the data. Written in David's signature friendly and humorous style, this book discusses in detail the essential steps performed in every production data science or data analysis pipeline, and prepares you for data visualization and modeling results.
The book digs into the practical application of the tools and techniques needed for data ingestion, anomaly detection, value imputation, and feature engineering. Long-form exercises at the end of each chapter let you practice the skills you have acquired.
You will begin with data ingestion from formats such as JSON, CSV, SQL RDBMSes, HDF5, NoSQL databases, image files, and binary serialized data structures. The book also provides many sample datasets and data files, which are available for download and independent exploration.
Moving on from formats, you will impute missing values, detect unreliable data and statistical anomalies, and generate the synthetic features needed for successful data analysis and visualization.
By the end of the book, you will have a solid understanding of the data cleaning process required for real-world data science and machine learning tasks.
What will you learn
How to carefully consider your data and ask the right questions
Identify problem data related to a single data point
Detect problem data in the systematic "shape" of the data
Remediate data integrity and hygiene issues
Prepare data for analysis and machine learning tasks
Impute values into missing or unreliable data
Generate synthetic features that are better suited to data science, data analysis, or visualization goals
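To make two of these topics concrete, here is a minimal sketch of value imputation and anomaly detection. It is not taken from the book: the function names `impute_with_median` and `flag_outliers`, the sample readings, and the z-score threshold are all assumptions chosen for illustration, and it uses only Python's standard library.

```python
import statistics


def impute_with_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def flag_outliers(values, threshold=3.0):
    """Return the indices whose z-score magnitude exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mean) / stdev > threshold]


# Hypothetical sensor readings: two gaps and one implausible spike.
readings = [20.1, 19.8, None, 20.4, 250.0, 20.0, None, 19.9]

cleaned = impute_with_median(readings)
# A lower threshold is used here because, with only eight points,
# no z-score can reach 3.0.
suspect = flag_outliers(cleaned, threshold=2.0)

print(cleaned)
print(suspect)
```

The median is used for imputation because, unlike the mean, it is not dragged upward by the spike. In practice the choice of imputation strategy and threshold depends on the dataset.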
Who is this book for
This book is intended for software developers interested in data analysis or scientific computing, data scientists, aspiring data scientists, and students.
Basic familiarity with statistics, general machine learning concepts, a programming language (Python or R), and some exposure to data science will be very helpful. The glossary, references, and friendly guidance should help all readers get up to speed quickly.
The text will also be useful to intermediate and advanced data scientists who want to improve the rigor of their data hygiene and revisit data preparation issues.
That's all for today's share. I hope it helps. If you found it useful, please give it a like, and even better, follow me. If you want the PDF of this book, you can click the book's hyperlink. You are also welcome to leave a comment or send a private message. I'll keep updating. I hope everyone can grow quickly and escape the 996 schedule. Keep going, fellow workers!