A Complete Guide to Processing Spatial Data with PySpark Operators (4): First, a Word on Spatial Operations
2022-07-06 17:33:00 【51CTO】
Before processing spatial data in PySpark, we first need to talk about spatial operations.
A spatial operation is a rule for computing between spatial data, for example, two polygons intersecting:

Such a calculation usually comes in two forms: one tests whether the two geometries intersect; the other extracts the intersecting part.
Borrowing the types of questions from English grammar, spatial operations can likewise be divided into two categories:

The first category is analogous to a yes/no (general) question in English. It is a spatial relationship test: it generates no new data and only returns a judgment about the relationship between the two geometries involved. For the two polygons in Figure 1, if the condition tested is "intersects", the only result returned is a boolean value: True.

The second category is the so-called spatial geometric operation. Compared with the first category, which only tests a relationship, this kind of analysis generates new data. Taking Figure 1 again, if the analysis condition is "intersection" and the result is returned as a polygon feature, the intersecting part is extracted:
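The difference between the two categories can be sketched in plain Python, with axis-aligned rectangles standing in for the two polygons. The `Rect` type and both function names are illustrative assumptions, not part of any GIS library.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """Axis-aligned rectangle; a simplified stand-in for a polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Rect, b: Rect) -> bool:
    """Category 1: relationship test. Returns only True or False."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax and
            a.ymin <= b.ymax and b.ymin <= a.ymax)


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Category 2: geometric operation. Returns a new geometry, or None."""
    if not intersects(a, b):
        return None
    return Rect(max(a.xmin, b.xmin), max(a.ymin, b.ymin),
                min(a.xmax, b.xmax), min(a.ymax, b.ymax))


a = Rect(0, 0, 4, 4)
b = Rect(2, 2, 6, 6)
print(intersects(a, b))    # relationship test: a boolean
print(intersection(a, b))  # geometric operation: a new rectangle
```

The first call only answers the question, while the second builds a new object from the two inputs. Real libraries handle arbitrary polygons, but the split between the two kinds of call is the same.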

Of course, the algorithms for spatial relationships are very mature. If you are interested you can look them up yourself; we will not cover the basic algorithms here, and there is no need to reinvent the wheel. In the spirit of applying what we learn, we only need to know how to use them.
First, let's talk about the most widely used spatial algorithm libraries in the industry.
Before the OGC (Open Geospatial Consortium) standards existed, every organization doing GIS developed its own rules for spatial objects and spatial computation, and a hundred flowers bloomed. Because that was too chaotic, the OGC came along and defined a top-level architecture of spatial object standards. Beneath that top level everyone still goes their own way, but at least there is a common agreement. The OGC defines the following operations on vector data:
- First, some geometric information is defined :

- Next, the spatial relationship tests:

These spatial relationship tests all return True/False.
- Finally, the spatial geometric operations:

These geometric operations generate a new geometric object.
The above are some of the relationships and operations that the OGC specifies for vector data. Any spatial algorithm library that conforms to the OGC standard, whatever organization produced it, will include these basic algorithms. They have been refined many times over, so we do not need to write another set ourselves.
How are these algorithms implemented? Under the OGC standard, implementations generally fall into the following two camps:

In the open-source camp, there are two implementations:
The first is GDAL, the most widely used in the industry. Its full name is Geospatial Data Abstraction Library, and it is implemented mainly in C++. Countless branches have been derived from and built on it, such as PostGIS, Python GDAL/OGR, and the rgdal package for R.
The other is JTS (Java Topology Suite) in the Java ecosystem. Although this package is not especially prominent within the open-source GIS world, and is little known outside the GIS profession, the spatial applications built on it include GeoServer, a top-tier open-source WebGIS system; Oracle Spatial, an enterprise-grade spatial processing extension; and GeoSpark, which handles spatial processing on Spark. All of them use JTS.
Then there is the closed-source camp. Just as Microsoft represents closed source in the operating system world, Esri is the main representative in the GIS world. Esri developed its own set of geometry algorithms under the OGC standard, called Esri Geometry. Besides implementing the functions in the relevant OGC standards, it adds many algorithms of its own, and its geometry model also differs somewhat. In Esri's arcpy, the overall geometry structure looks like this:

Every geometry is built with points as the basic structure; point structures plus spatial relationships make up the various geometric elements. These geometric elements carry the following attributes:

Of course, they also include various spatial operations; here are some of them:

Taken individually, these algorithms are easy to understand. For example, let's do some simple calculations, here mainly using GDAL:

For the other methods, check the API yourself if you are interested; I won't explain them one by one here.
How are these spatial algorithms used inside PySpark, and what are the prerequisites? We'll talk about that next time.
To be continued.