Chapter 2 Introduction to key technologies
2022-06-27 04:26:00 【H`924】
This project is designed and implemented in Python. The main technologies used are Python web crawlers and Python data analysis. The data comes from www.fangjia.com.
2.1 Python web crawlers
Python web crawlers are a common tool for collecting data from the Internet, and they have developed rapidly in recent years along with the Internet itself. Before using a web crawler to collect data, we need to understand the concept and main categories of web crawlers, the system architecture of each kind, how they work, their common crawling strategies, and their main application scenarios. For the sake of copyright and data security, we also need to understand when crawling is legal and which agreements must be observed when crawling a website.

At present, most websites allow crawled data to be used for personal purposes or scientific research. If the crawled data is used for other purposes, especially republication or commercial use, serious cases may break the law or lead to civil disputes. The following two kinds of data must not be crawled, let alone used commercially.
- Personal privacy data, such as name, phone number, age, blood type, and marital status. Crawling such data violates the Personal Information Protection Law.
- Data that others are explicitly prohibited from accessing, such as content the user has protected with permission controls like an account and password, and encrypted content.

Copyright also needs attention: copyrighted content signed by its author may not be republished or used commercially after it has been crawled.
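One agreement a crawler is commonly expected to observe is a site's robots.txt file. The following is a minimal sketch, using only the Python standard library, of checking whether a URL may be fetched before crawling it. The robots.txt content, the example.com URLs, and the "example-crawler" user agent are made up for illustration and do not describe any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real crawler would download the file
# from the target site instead (for example with set_url() and read()).
ROBOTS_TXT = """\
User-agent: *
Disallow: /account/
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Build a parser from robots.txt text that is already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str,
            user_agent: str = "example-crawler") -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)
for url in ("https://example.com/listings/page1",
            "https://example.com/account/settings"):
    print(url, "->", "allowed" if allowed(parser, url) else "disallowed")
```

Passing this check only shows that the site does not object to automated access to that path. It does not settle the privacy and copyright questions described above.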
2.2 Python data analysis
Data analysis is the process of applying appropriate analytical methods to a large amount of collected data, extracting useful information and forming conclusions, and studying and summarizing the data in detail. Data analysis in the broad sense includes data analysis in the narrow sense and data mining. Data analysis in the narrow sense is the process of processing and analyzing the collected data according to the purpose of the analysis, using methods such as comparative analysis, group analysis, cross analysis, and regression analysis, in order to extract valuable information, put the data to use, and obtain the result of a characteristic statistic. Data mining is the process of uncovering potential value in large volumes of incomplete, noisy, fuzzy, and random real-world data by applying clustering models, classification models, regression, and association rules.

At present, the three mainstream languages for data analysis are Python, R, and MATLAB. Python has a rich and powerful set of libraries. It is often called a glue language because it can easily connect modules written in other languages, especially C and C++, and it is an easier-to-learn and more rigorous programming language. R is a language and operating environment for statistical analysis and plotting. It is free, open-source software that belongs to the GNU system. MATLAB is used to perform matrix operations, plot functions and data, implement algorithms, create user interfaces, and connect to programs written in other programming languages. It is mainly used in engineering computation, control design, signal processing and communications, image processing, signal detection, and financial modeling, design, and analysis.
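Two of the methods named above, group analysis and regression analysis, can be sketched with the Python standard library alone. The listings below are invented for illustration and are not the project's data. The field layout (district, area, price) and the function names are assumptions made for this example.

```python
from collections import defaultdict
from statistics import mean

# Made-up listings for illustration only:
# (district, area in square metres, total price in units of 10,000 yuan)
LISTINGS = [
    ("A", 50.0, 100.0),
    ("A", 100.0, 200.0),
    ("B", 60.0, 180.0),
    ("B", 80.0, 240.0),
]


def mean_unit_price_by_district(listings):
    """Group analysis: average price per square metre within each district."""
    groups = defaultdict(list)
    for district, area, price in listings:
        groups[district].append(price / area)
    return {district: mean(prices) for district, prices in groups.items()}


def fit_line(xs, ys):
    """Regression analysis: least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


print(mean_unit_price_by_district(LISTINGS))

areas = [area for _, area, _ in LISTINGS]
prices = [price for _, _, price in LISTINGS]
slope, intercept = fit_line(areas, prices)
print(f"price = {slope:.3f} * area + {intercept:.3f}")
```

In practice, libraries such as pandas and NumPy are the usual choice for this kind of work. The standard library is used here only to keep the sketch self-contained.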
Python data analysis has the following five main advantages.
- Simple, concise syntax
- Many powerful libraries
- Powerful functionality
- Suitable not only for research and prototyping but also for building production systems
- Python is a glue language that can easily be combined with components written in other languages in a variety of ways.
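As one illustration of the glue-language point, the standard `ctypes` module can call a function from a compiled C library directly. This is a minimal sketch that assumes a POSIX system (Linux or macOS) where the C standard library can be loaded. The wrapper name `c_strlen` is hypothetical.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. If find_library returns None,
# CDLL(None) falls back to the symbols already loaded in the current
# process, which works on POSIX systems only.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Return the length in bytes of the UTF-8 encoding of text, via C strlen."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("hello"))
```

Other common ways to combine Python with C or C++ code include writing extension modules and using tools such as Cython. `ctypes` is shown here because it needs no compilation step.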