Chapter 2 Introduction to key technologies
2022-06-27 04:26:00 【H`924】
This project is designed and implemented in Python. The main technologies used are Python web crawling and Python data analysis. The data comes from www.fangjia.com.
2.1 Python Web Crawlers
A Python web crawler is a common tool for collecting data from the Internet, and crawling technology has developed rapidly in recent years along with the Internet itself. Before using a crawler to collect network data, we must first understand the concept and main categories of web crawlers, the system structure of each kind of crawler, how they work, their common strategies, and their main application scenarios. At the same time, for reasons of copyright and data security, we also need to understand when crawling is legitimate and which agreements must be observed when crawling a website.

At present, most websites allow crawled data to be used for personal purposes or scientific research. If the crawled data is used for other purposes, especially republication or commercial use, serious cases can break the law or lead to civil disputes. The following two kinds of data must not be crawled, let alone used commercially:
- Personal privacy data, such as name, phone number, age, blood type, and marital status. Crawling such data violates the Personal Information Protection Law.
- Data that others are explicitly prohibited from accessing, for example content that the user has protected with access controls such as an account and password, or encrypted content.

Attention should also be paid to copyright. Copyrighted content attributed to an author must not be republished or used for commercial purposes after crawling.
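One widely used agreement of this kind is a site's robots.txt file, which states the paths crawlers are asked not to visit. The sketch below shows how a crawler might consult such rules before fetching a page, using `urllib.robotparser` from the Python standard library. The robots.txt content, the `demo-crawler` user agent, and the example.com URLs are made up for illustration and are not taken from this project or from www.fangjia.com.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch runs without network access.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a rule set that can be queried."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    rules = build_parser(SAMPLE_ROBOTS_TXT)
    for url in ("https://example.com/listings/1",
                "https://example.com/private/data"):
        print(url, is_allowed(rules, "demo-crawler", url))
```

In a real crawler the rules would be loaded from the target site (for example with `RobotFileParser.set_url` followed by `read`) rather than from an inline string. Passing a robots.txt check does not by itself settle the legal and copyright questions discussed above.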
2.2 Python Data Analysis
Data analysis is the process of analyzing a large amount of collected data with appropriate analytical methods, extracting useful information, forming conclusions, and studying and summarizing the data in detail. Data analysis in the broad sense includes data analysis in the narrow sense and data mining. Data analysis in the narrow sense means processing and analyzing the collected data according to the purpose of the analysis, using methods such as comparative analysis, group analysis, cross analysis, and regression analysis, in order to extract valuable information, put the data to use, and obtain statistical results for the features of interest. Data mining is the process of uncovering potential value in large volumes of incomplete, noisy, fuzzy, and random real-world data by applying techniques such as clustering models, classification models, regression, and association rules.

At present, the three mainstream data analysis languages are Python, R, and Matlab. Python has a rich and powerful set of libraries and is often called a glue language because it can easily connect modules written in other languages (especially C and C++). It is also a relatively easy to learn and rigorous programming language. R is a language and environment for statistical analysis and graphics. It is free, open-source software that belongs to the GNU system. Matlab is used to perform matrix operations, plot functions and data, implement algorithms, create user interfaces, and connect to programs written in other languages. It is mainly used in engineering calculation, control design, signal processing and communication, image processing, signal detection, and financial modeling, design, and analysis.
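Two of the narrow-sense methods named above, group analysis and regression analysis, can be sketched in a few lines of standard-library Python. The listings below (district, area, price) are invented values used only to show the shape of the computation. They are not data from www.fangjia.com, and the function names are hypothetical rather than part of this project.

```python
from collections import defaultdict
from statistics import mean

# Made-up sample rows: (district, area in square metres, total price).
LISTINGS = [
    ("A", 50.0, 100.0),
    ("A", 100.0, 200.0),
    ("B", 60.0, 180.0),
    ("B", 80.0, 240.0),
]


def mean_unit_price_by_group(rows):
    """Group analysis: average price per square metre for each district."""
    groups = defaultdict(list)
    for district, area, price in rows:
        groups[district].append(price / area)
    return {district: mean(values) for district, values in groups.items()}


def fit_line(xs, ys):
    """Regression analysis: ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


if __name__ == "__main__":
    print(mean_unit_price_by_group(LISTINGS))
    areas = [area for _, area, _ in LISTINGS]
    prices = [price for _, _, price in LISTINGS]
    print(fit_line(areas, prices))
```

In practice these steps are usually done with libraries such as pandas and NumPy. The plain-Python version is used here only to keep the sketch self-contained.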
Python data analysis has five main advantages:
- Simple, concise syntax
- Many powerful libraries
- Powerful functionality
- Suitable not only for research and prototyping but also for building production systems
- Python is a glue language that can easily be combined with components written in other languages in a variety of ways