Introduction to IoT Technologies: Chapter 6
2022-07-30 11:11:00 【Alice's Blog】
Chapter 6: IoT Data Processing
6.1 IoT and Big Data
- The Internet of Things (IoT) generates massive amounts of data, so IoT is inherently a big data domain, and IoT data processing is necessarily big data processing.
- Big data: a collection of data that cannot be captured, managed, and processed with conventional software tools within an acceptable time frame.
- IoT data processing requires new processing models to deliver stronger decision-making power, insight discovery, and process optimization capability.
- IoT data is typically a massive, fast-growing, and diverse information asset.
1. Characteristics of big data (the 5 Vs)
- (1) Volume
- IoT's massive heterogeneous data originates from sensors, which describe the various states of the physical world and how they change;
- Massive numbers of sensing devices;
- Massive numbers of nodes;
- Most sensor nodes work around the clock;
- IoT data volumes have jumped from the TB scale to the PB scale;
- Open question: is all of this data useful? Is it correct?
- (2) Variety
- IoT has a wide range of applications. Different fields and industries face data of different types and formats, such as web logs, video, images, and geographic location information;
- IoT data has distinct granularity and is usually multi-dimensional or even high-dimensional. Multiple sensors are integrated to sense several attributes of an object at the same time;
- IoT data is multi-source and heterogeneous. It comes from many different sensors, and because the sensed objects and purposes differ, the data these devices produce differs in structure and semantics.
- (3) Velocity
- Data grows quickly and must be processed quickly, with strict timeliness requirements;
- IoT is directly tied to the real world, so in many cases data must be accessed and control exerted in real time, which in turn requires high data transmission rates;
- Decision-making, retrieval, and communication all need high speed (for example, news retrieval or intelligent transportation).
- (4) Veracity
- Refers to the quality and fidelity of the data. Data in a big data environment should ideally have a high signal-to-noise ratio.
- Open question: is the data real?
- False data (erroneous data).
- (5) Value
- That is, low value density. As data volume grows, the meaningful information in it does not grow proportionally. Value also depends on the data's veracity and on processing time;
- Examples: panning for gold in sand; video surveillance.
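To make the jump from TB to PB under Volume concrete, here is a rough back-of-the-envelope sketch. The node count, reading size, and reporting rate are hypothetical figures chosen only for illustration; they do not come from the chapter.

```python
# Hypothetical deployment figures, chosen only for illustration.
NODES = 1_000_000          # number of sensor nodes
BYTES_PER_READING = 100    # size of one reading, in bytes
READINGS_PER_SECOND = 1    # each node reports once per second, around the clock

SECONDS_PER_DAY = 24 * 60 * 60
TB = 10 ** 12              # decimal terabyte
PB = 10 ** 15              # decimal petabyte


def bytes_per_day(nodes: int, reading_bytes: int, rate_hz: int) -> int:
    """Total bytes produced per day by all nodes together."""
    return nodes * reading_bytes * rate_hz * SECONDS_PER_DAY


daily = bytes_per_day(NODES, BYTES_PER_READING, READINGS_PER_SECOND)
days_to_one_pb = PB / daily

print(f"{daily / TB:.2f} TB per day")
print(f"{days_to_one_pb:.0f} days to reach 1 PB")
```

Under these assumed numbers the deployment produces several terabytes per day and crosses the petabyte mark within a few months, before any video or image data is counted.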

2. Classification by data structure
- (1) Structured data
- Data that conforms to a data model or schema and is stored in a relational database as rows of a two-dimensional table. With structured data, the structure comes first and the data second.
- In short, structured data is the data in a relational database.
- Because relational databases are mature, storage and analysis methods for structured data are well developed, and many tools support its analysis. Most analysis methods are based on statistical analysis and data mining.
- A relational database is a database built on the relational model, which is a two-dimensional table model. A relational database therefore consists of two-dimensional tables and the relationships between them. Information can be extracted from a relational database by key using SQL.
- (2) Unstructured data
- Data without a unified structure or model (such as text, images, video, and audio), which is awkward to represent in two-dimensional logical tables. This kind of data makes up a large share of enterprise data and is growing faster.
- Unstructured data is harder for computers to understand and cannot be processed directly or queried with SQL statements. It is often stored whole in a relational database as a binary large object (BLOB, a collection of binary data stored as a single entity), or stored in a non-relational (NoSQL) database. Processing and analyzing it is more complex.
- (3) Semi-structured data
- Data that has some structure but is not relational in nature. It sits between fully structured and fully unstructured data.
- It can be regarded as a form of structured data whose structure varies widely. To understand the details of the data, it cannot simply be handled as either unstructured or structured data. It needs special storage (converting it into structured data, or organizing it in XML) and special processing techniques.
- Semi-structured data contains tags that separate semantic elements and impose a hierarchy on records and fields. It is therefore also called self-describing (the data is stored in a tree or graph data structure). The data comes first and the structure second. Two common semi-structured formats are XML files and JSON files. Common sources include Electronic Data Interchange (EDI) files, spreadsheets, RSS feeds, and sensor data.
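The three categories above can be contrasted in a short sketch. It uses Python's built-in `sqlite3` and `json` modules; the table names, field names, and sample values are made up for illustration.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")

# Structured: the schema is declared first, then rows are added to it,
# and the data can be queried with SQL.
conn.execute("CREATE TABLE reading (sensor_id TEXT, temperature REAL)")
conn.execute("INSERT INTO reading VALUES (?, ?)", ("s1", 21.5))
structured = conn.execute(
    "SELECT sensor_id, temperature FROM reading WHERE temperature > 20"
).fetchall()

# Semi-structured: the document describes itself through its keys (tags),
# and nested fields form a tree. No schema was declared in advance.
doc = json.loads(
    '{"sensor_id": "s2", "temperature": 19.0,'
    ' "location": {"lat": 31.2, "lon": 121.5}}'
)
fields = sorted(doc.keys())

# Unstructured: raw bytes with no internal structure visible to SQL.
# Here they are stored whole as a BLOB (the payload is a made-up stub).
blob = bytes([0xFF, 0xD8, 0xFF, 0xE0])
conn.execute("CREATE TABLE image (name TEXT, data BLOB)")
conn.execute("INSERT INTO image VALUES (?, ?)", ("frame-1", blob))
stored = conn.execute("SELECT data FROM image").fetchone()[0]

print(structured, fields, len(stored))
```

SQL can filter on `temperature` in the structured table, but it can only return the BLOB as a single opaque value, which is the practical difference the text describes.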
3. Classification from a programming perspective
- (1) Programming languages
- Primitive types, tuples, records, algebraic data types, abstract data types, and so on.
- These describe objects in practical applications, the relationships between objects, and the operations on them.
- (2) Data mining
- Record data, graph-based data, ordered (sequence) data, and so on.
6.2 IoT Data Storage
- Challenges:
- Massive storage space is required, and because the data is multi-source and heterogeneous, its representation needs careful consideration;
- Storage and retrieval must be supported at multiple levels of granularity, to improve resource utilization and speed up resource access;
- Real-time, multi-dimensional sensing;
- Redundant data needs to be compressed.
- Requirements:
- Open and compatible: interfaces and communication protocols should make it easy to discover, locate, and obtain IoT information. The complexity of the interfaces should be hidden, and multi-source heterogeneous IoT systems should be supported;
- Dynamically extensible: both storage capacity and data structures should be extensible at run time;
- Reliable and efficient: supports high concurrency and is highly fault tolerant;
- Secure and trustworthy.
- Evaluation criteria:
- Openness, scalability, flexibility, reliability, efficiency, security, availability, and real-time performance.

6.2.1 Relational Databases





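- A relational database is a database built on the relational model
- Relational data structure
- Relational data operations
- Relational integrity constraints
Relational data structure
- <relation name>(attribute 1, attribute 2, ..., attribute N)
- shop(shop name, address, legal representative, manager, phone)
- fruit(fruit name, price, stock, quality grade)
- book(title, author, publisher, price, pages, format, ISBN, edition)
- student(name, student ID, gender, dormitory, phone)
- phone book(phone number, name)
Relational data operations
- Query operations: selection, projection, join, union, intersection, difference
- Update operations: inserting, deleting, and modifying data
Three commonly used relational operations
- Selection

- Projection

- Join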
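
The three operations can be tried out with Python's built-in `sqlite3` module. The sketch below reuses the student and phone book relations from the list above with a reduced set of columns; the rows are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (name TEXT, student_id TEXT, gender TEXT, dormitory TEXT);
CREATE TABLE phone_book (phone TEXT, name TEXT);
INSERT INTO student VALUES
    ('Li', '001', 'F', 'A101'),
    ('Wang', '002', 'M', 'B202'),
    ('Zhao', '003', 'F', 'A102');
INSERT INTO phone_book VALUES ('555-0101', 'Li'), ('555-0102', 'Wang');
""")

# Selection: keep only the rows (tuples) that satisfy a condition.
selection = conn.execute(
    "SELECT * FROM student WHERE gender = 'F' ORDER BY student_id"
).fetchall()

# Projection: keep only some of the columns (attributes).
projection = conn.execute(
    "SELECT name, dormitory FROM student ORDER BY student_id"
).fetchall()

# Join: combine rows of two relations that agree on a common attribute.
join = conn.execute(
    "SELECT s.name, s.dormitory, p.phone "
    "FROM student AS s JOIN phone_book AS p ON s.name = p.name "
    "ORDER BY s.student_id"
).fetchall()

for label, rows in (("selection", selection),
                    ("projection", projection),
                    ("join", join)):
    print(label, rows)
```

Selection cuts the table horizontally, projection cuts it vertically, and the join drops Zhao because that name has no matching phone book entry.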


SQL (Structured Query Language)
- SQL (Structured Query Language) is the standard query language for relational databases, first developed in 1974. A widely taught version is the SQL-92 standard, issued by ANSI (American National Standards Institute).
SQL consists of three parts:
- Data Definition Language (DDL)
- Data Manipulation Language (DML)
- Data Control Language (DCL)
DDL is used to define database tables
- Define the tables (relation schemas): the name of each table and the name and type of each column
- Enter and modify data (strictly speaking, these are DML operations)
- Modify the table structure, such as adding a column
- Define candidate keys and create indexes
DML is used to maintain the data in the database
- Performs the various operations on table data
- Query data in a table: select
- Insert a row (a record, a tuple) into a table: insert
- Delete a row (a record, a tuple) from a table: delete
- Modify a row (a record, a tuple) in a table: update
DCL is used to protect the database and keep operations on it secure
- Grant privileges to a user: grant
- Revoke privileges: revoke
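A minimal sketch of the DDL and DML statements listed above, run through Python's built-in `sqlite3` module against a reduced version of the fruit relation. SQLite has no user accounts and does not implement GRANT or REVOKE, so the DCL statements appear only as text, and `some_user` is a hypothetical account name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define a table, change its structure, and create an index.
conn.execute("CREATE TABLE fruit (name TEXT PRIMARY KEY, price REAL, stock INTEGER)")
conn.execute("ALTER TABLE fruit ADD COLUMN grade TEXT")
conn.execute("CREATE INDEX idx_fruit_price ON fruit (price)")

# DML: insert, update, delete, and select rows.
insert_sql = "INSERT INTO fruit (name, price, stock, grade) VALUES (?, ?, ?, ?)"
conn.execute(insert_sql, ("apple", 3.5, 100, "A"))
conn.execute(insert_sql, ("pear", 2.0, 50, "B"))
conn.execute("UPDATE fruit SET price = ? WHERE name = ?", (4.0, "apple"))
conn.execute("DELETE FROM fruit WHERE name = ?", ("pear",))
rows = conn.execute("SELECT name, price, stock, grade FROM fruit").fetchall()

# DCL: shown as text only, because SQLite does not support these statements.
# A server database would run them against a real user account.
DCL_EXAMPLES = [
    "GRANT SELECT ON fruit TO some_user",
    "REVOKE SELECT ON fruit FROM some_user",
]

print(rows)
```

The parameter placeholders (`?`) keep values out of the SQL text, which is the usual way to pass data to DML statements from application code.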





6.2.2 Non-relational Databases
- Non-relational databases include:
- (1) Key-value stores
- Use a hash table in which a specific key maps to a pointer to specific data
- Simple and easy to deploy
- Inefficient when only part of a value is queried or updated
- (2) Column-oriented stores
- Designed for massive data in distributed storage
- A key points to multiple columns, and columns are grouped into column families
- (3) Document databases
- The data model is versioned, semi-structured documents, such as JSON;
- Similar to key-value stores, and can be seen as an upgraded version of them
- (4) Graph databases
- Store data using a graph model
- (5) Sensing databases
- Aimed at industrial automation, IoT, and similar fields
- Can manage relational data and also store online time-series data with real-time characteristics
- Provide a standard SQL interface as well as real-time data publish-subscribe, historical queries, historical data analysis, and so on
- Positioned as databases for enterprise applications
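
The first four data models can be mimicked with plain Python dictionaries and lists. This is only a sketch of how the data is shaped, not of any real database engine, and all keys and values are invented.

```python
import json

# (1) Key-value store: a hash table maps each key to an opaque value.
kv_store = {"sensor:s1": json.dumps({"temperature": 21.5, "humidity": 40})}

# Changing one field still means reading and rewriting the whole value,
# which is why partial queries and updates are inefficient in this model.
value = json.loads(kv_store["sensor:s1"])
value["temperature"] = 22.0
kv_store["sensor:s1"] = json.dumps(value)

# (2) Column-oriented store: row key -> column family -> column -> value.
column_store = {
    "s1": {
        "env": {"temperature": 22.0, "humidity": 40},
        "meta": {"location": "room-1"},
    }
}
env_columns = column_store["s1"]["env"]  # read a single column family

# (3) Document database: each record is a semi-structured document whose
# fields can be queried directly, and documents need not share a schema.
documents = [
    {"_id": "s1", "temperature": 22.0, "location": {"room": "room-1"}},
    {"_id": "s2", "temperature": 19.0},
]
warm_ids = [d["_id"] for d in documents if d["temperature"] > 20]

# (4) Graph database: nodes plus edges, here as an adjacency list.
graph = {"gateway": ["s1", "s2"], "s1": [], "s2": []}
gateway_neighbours = graph["gateway"]

print(kv_store, env_columns, warm_ids, gateway_neighbours)
```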










HDFS
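- Internally, a file is split into one or more blocks, and these blocks are stored on a set of DataNodes.
- The NameNode handles file system namespace operations on files and directories, such as open, close, and rename. It also determines the mapping of blocks to DataNodes. DataNodes serve read and write requests from file system clients.
- DataNodes also perform block creation, deletion, and replication on instruction from the NameNode.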
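
The block mechanism can be illustrated with a toy model. The sketch below is not the HDFS API: the function names are hypothetical, the block size is tiny so the result is easy to read, and the round-robin placement stands in for HDFS's real, rack-aware placement policy.

```python
from typing import Dict, List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split file content into fixed-size blocks; the last one may be shorter."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, data_nodes: List[str],
                 replication: int) -> Dict[int, List[str]]:
    """Map each block index to the DataNodes holding its replicas.

    Simple round-robin placement, used here only for illustration.
    """
    return {
        block: [data_nodes[(block + r) % len(data_nodes)]
                for r in range(replication)]
        for block in range(num_blocks)
    }


# The client's file content (10 bytes) and a toy block size of 4 bytes.
file_content = b"x" * 10
blocks = split_into_blocks(file_content, block_size=4)

# The NameNode's role: remember which DataNodes hold which block.
block_map = place_blocks(len(blocks), ["dn1", "dn2", "dn3"], replication=2)

print([len(b) for b in blocks])
print(block_map)
```

In this model the dictionary plays the part of the NameNode's block-to-DataNode mapping, while the byte slices are what the DataNodes would actually store.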







6.3 Cloud Computing and Virtualization for IoT













6.4 IoT Data Analysis and Mining
6.4.1 Data Preprocessing and Knowledge Discovery












6.4.2 Data Mining














6.4.3 Parallel Processing: MapReduce














6.4.4 Parallel Processing: Spark











6.5 IoT Data Retrieval
6.5.1 Text Retrieval
Text retrieval is built around relevance:
- Text-based retrieval
- Structure-based retrieval
- Retrieval based on user information
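
As a rough illustration of text-based retrieval, the sketch below ranks documents by how often the query terms occur in them. Raw term frequency is a deliberately simple stand-in for relevance, chosen here as an assumption. The chapter does not say which scoring method it has in mind, and the sample documents are invented.

```python
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def score(query: str, document: str) -> int:
    """Relevance as the total number of query-term occurrences in the document."""
    counts = Counter(tokenize(document))
    return sum(counts[term] for term in tokenize(query))


def search(query: str, documents: List[str]) -> List[str]:
    """Return the documents that match the query, most relevant first."""
    scored = [(score(query, doc), doc) for doc in documents]
    matches = [pair for pair in scored if pair[0] > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matches]


docs = [
    "temperature sensor in room one",
    "traffic camera video stream",
    "temperature and humidity sensor temperature log",
]
results = search("temperature sensor", docs)
print(results)
```

Structure-based and user-based retrieval would add further signals to the score, such as where in a document a term appears or what the user has searched for before.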






6.5.2 Streaming Media Retrieval








6.6 IoT Data Visualization Technology
















