PySpark Operators for Spatial Data, Fully Explained (4): First, Let's Talk About Spatial Operations

2022-07-06 17:33:00 51CTO


To process spatial data in PySpark, we first need to talk about spatial operations.

A spatial operation is a rule for computing with pieces of spatial data, for example the intersection of two polygons:

[Figure 1: two intersecting polygons]

Such a computation usually takes one of two forms: either we test whether the two polygons intersect, or we extract the part where they intersect.

Borrowing the question forms of English grammar, spatial operations can likewise be divided into two categories:

[Figure: the two categories of spatial operation]

The first category, comparable to a yes/no question in English, is really a test of a spatial relationship. It produces no new data; it only returns a judgment about the relationship between the two geometries involved. For example, if we evaluate the two polygons in Figure 1 with the condition "intersects", the only thing returned is a boolean value: True.

[Figure: a spatial relationship test returning True]

The second category is the so-called spatial geometric operation. In contrast to the relationship tests of the first category, this kind of analysis generates new data. Taking Figure 1 again, if the analysis condition is "intersection" and the result is returned as a polygon feature, the intersecting part is extracted:

[Figure: the intersecting part extracted as a new polygon]
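To make the difference between the two categories concrete, here is a minimal pure-Python sketch. It assumes axis-aligned rectangles as a simplified stand-in for polygons; the `Rect` class and both function names are hypothetical and only for illustration. Real libraries handle arbitrary geometries.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle, a simplified stand-in for a polygon."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def intersects(a: Rect, b: Rect) -> bool:
    """Relationship test: returns only True/False, no new geometry."""
    return (a.xmin <= b.xmax and b.xmin <= a.xmax
            and a.ymin <= b.ymax and b.ymin <= a.ymax)


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Geometric operation: returns a new rectangle, or None if the inputs do not intersect."""
    if not intersects(a, b):
        return None
    return Rect(max(a.xmin, b.xmin), max(a.ymin, b.ymin),
                min(a.xmax, b.xmax), min(a.ymax, b.ymax))


a = Rect(0, 0, 2, 2)
b = Rect(1, 1, 3, 3)
print(intersects(a, b))    # True
print(intersection(a, b))  # Rect(xmin=1, ymin=1, xmax=2, ymax=2)
```

The first function answers a yes/no question about the two inputs, while the second builds a new object from them, which is exactly the split described above.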

Of course, the algorithms behind spatial relationships are very mature. If you are interested you can look them up yourself; we will not cover the basic algorithms here, and there is no need to reinvent the wheel. In the spirit of applying what we learn, we only need to know how to use them.

First, let's talk about the most widely used spatial algorithm libraries in the industry.

Before the OGC (Open Geospatial Consortium) standards existed, every organization doing GIS developed its own rules for spatial objects and spatial computation, and a hundred flowers bloomed. Because that was too chaotic, the OGC came into being and defined a top-level standard for spatial objects. Beneath that top level everyone still does things in their own style, but at least there is general agreement. The OGC defines the following on vector data:

  • First, some geometric information is defined :

[Figure: table of basic geometric properties defined by the OGC]

  • Next, the spatial relationship tests:

[Figure: table of spatial relationship tests defined by the OGC]

All of these spatial relationship tests return True/False.

Finally, the spatial geometric operations:

[Figure: table of spatial geometric operations defined by the OGC]

Each of these geometric operations generates a new geometric object.
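As one concrete instance of an operation that produces a new geometry, ConvexHull is among the operations in the OGC Simple Features specification. The following is a minimal pure-Python sketch using Andrew's monotone chain algorithm on a list of points. The function names are hypothetical, and this is not how any particular library implements it.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); positive means a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point]) -> List[Point]:
    """Return the hull vertices in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    # Build the lower hull from left to right.
    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    # Build the upper hull from right to left.
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each half is the first point of the other half.
    return lower[:-1] + upper[:-1]


points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull = convex_hull(points)
print(hull)  # [(0, 0), (2, 0), (2, 2), (0, 2)]
```

The interior point (1, 1) is dropped, and the result is a new geometry rather than a True/False answer.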

The above are some of the relationship and operation rules that the OGC specifies for vector data. Any spatial algorithm library that conforms to the OGC standard, whichever organization built it, includes these basic algorithms. They have been refined over many iterations, so we do not need to write another set ourselves.

How are these algorithms implemented? Under the OGC umbrella, implementations generally fall into the following two camps:

[Figure: the two camps of OGC implementations, open source and closed source]

In the open-source camp, there are two implementations:



The first, and the most widely used in the industry, is GDAL. Its full name is the Geospatial Data Abstraction Library, and it is implemented mainly in C++. Countless branches have been derived from it or built on it, such as PostGIS, the Python GDAL/OGR bindings, and the rgdal package for R.






The other is JTS (Java Topology Suite) in the Java ecosystem. This package is not especially famous in the open-source GIS world, and is little known outside the GIS profession, but it is no less brilliant. The spatial applications built on it include GeoServer, a top open-source WebGIS system; Oracle Spatial, an enterprise-grade spatial processing plug-in; and GeoSpark, which handles spatial processing on Spark. All of them use JTS.




Then there is the closed-source camp. Just as closed source in the operating system industry is represented by Microsoft, in the GIS world it is led by Esri. Esri developed its own set of geometric relationship algorithms under the OGC standard, called Esri Geometry. Besides implementing the OGC-related functions above, it adds many extensions of its own, and its way of organizing geometries also differs somewhat. Taken as a whole, the geometry structure in Esri's arcpy looks like this:

[Figure: geometry class structure in arcpy]

All geometries are built with points as the basic structure: point structures plus spatial relationships make up the various geometric elements. These geometric elements have the following properties:

[Figure: properties of arcpy geometry objects]
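The idea that a geometry is just an ordered set of points from which its properties are derived can be sketched in pure Python. This is not arcpy code: `SimplePolygon` and its property names are hypothetical and chosen for illustration, and only a single ring without holes is assumed.

```python
import math
from typing import Iterator, List, Tuple

Point = Tuple[float, float]


class SimplePolygon:
    """A single-ring polygon stored as an ordered list of vertices (hypothetical class)."""

    def __init__(self, vertices: List[Point]):
        if len(vertices) < 3:
            raise ValueError("a polygon needs at least three vertices")
        self.vertices = list(vertices)

    def _edges(self) -> Iterator[Tuple[Point, Point]]:
        # Pair each vertex with the next one, wrapping around to close the ring.
        return zip(self.vertices, self.vertices[1:] + self.vertices[:1])

    @property
    def point_count(self) -> int:
        return len(self.vertices)

    @property
    def area(self) -> float:
        # Shoelace formula; abs() makes the result independent of vertex order.
        return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in self._edges())) / 2.0

    @property
    def length(self) -> float:
        # Perimeter: sum of the edge lengths.
        return sum(math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in self._edges())

    @property
    def extent(self) -> Tuple[float, float, float, float]:
        # Bounding box as (xmin, ymin, xmax, ymax).
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        return (min(xs), min(ys), max(xs), max(ys))


poly = SimplePolygon([(0, 0), (4, 0), (4, 3), (0, 3)])
print(poly.point_count)  # 4
print(poly.area)         # 12.0
print(poly.length)       # 14.0
print(poly.extent)       # (0, 0, 4, 3)
```

Every property here is computed from the vertex list alone, which is the sense in which points are the basic structure.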

Of course, they also include various spatial operations. Here are some of them:

[Figure: some of the spatial operation methods of arcpy geometry objects]

Taken individually, these algorithms are very easy to understand. For example, let's do a few simple calculations, here mainly using GDAL:

[Figure: code example of simple spatial calculations with GDAL]
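The following is not the author's original example but a minimal sketch of the same kind of calculation. It assumes the GDAL Python bindings (`osgeo`) are installed with GEOS support, and the two WKT squares and the function name `ogr_demo` are made up for illustration.

```python
def ogr_demo() -> dict:
    """Run a few OGR geometry calls on two overlapping squares."""
    # Imported inside the function so the module loads even without GDAL installed.
    from osgeo import ogr

    ogr.UseExceptions()
    a = ogr.CreateGeometryFromWkt("POLYGON ((0 0, 2 0, 2 2, 0 2, 0 0))")
    b = ogr.CreateGeometryFromWkt("POLYGON ((1 1, 3 1, 3 3, 1 3, 1 1))")

    # Geometric operation: produces a new geometry.
    overlap = a.Intersection(b)

    return {
        # Relationship tests: produce only True/False.
        "intersects": a.Intersects(b),
        "within": a.Within(b),
        # Measurements taken on the newly generated geometries.
        "overlap_type": overlap.GetGeometryName(),
        "overlap_area": overlap.Area(),
        "union_area": a.Union(b).Area(),
    }
```

Calling `ogr_demo()` should report that the squares intersect, that the first is not within the second, and that the overlap is a polygon of area 1 while the union has area 7.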

For the other methods, check the API yourself if you are interested; I won't explain them one by one here.

How do we use these spatial algorithms inside PySpark? What are the prerequisites? We'll talk about that next time.

To be continued .

Original site

Copyright notice
This article was created by [51CTO]. Please include a link to the original when reposting. Thank you.
https://yzsam.com/2022/187/202207060933292825.html