PySpark Operators for Spatial Data, Fully Explained (5): How to Use Spatial Operation Interfaces in PySpark
2022-07-06 17:37:00 【51CTO】
Spark is a distributed computing framework, and PySpark is really Python calling Spark's underlying framework. So how is that framework invoked? The previous article covered spatial operators implemented in Python with the GDAL package; what does the whole call path for them look like? Let's explore that today.
The first article in this series noted that running PySpark requires the Py4J package, whose job is to let Python call objects inside the Java virtual machine. The principle is as follows:

The Python code sends the task over a socket to the Java virtual machine. The JVM parses it through Py4J and then calls Spark's Worker compute nodes, where the task is restored to its Python implementation. Once execution finishes, the results travel back along the same path in reverse.
For the details, see the following article; I won't repeat them here:
http://sharkdtu.com/posts/pyspark-internal.html
As you can see, whatever you write in Python is ultimately executed on the Worker side by Python code or packages as well. Let's run an experiment:

Use Python's sys package to check the running Python version, and the socket package to check the machine name of the node. Both packages are specific to Python, so if PySpark ran only Java, this could not work on the different nodes.
I have two machines here, named sparkvm.com and sparkvmslave.com. sparkvm.com is master + worker, while sparkvmslave.com is a worker only.
The run shows that different nodes return different results.
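A minimal sketch of this kind of experiment might look as follows. The helper names node_info and collect_node_info are hypothetical, and sc is assumed to be an already created SparkContext connected to the cluster; the exact code used for the run above may differ.

```python
import socket
import sys


def node_info(_):
    """Return (hostname, Python version) of the process executing this call."""
    return (socket.gethostname(), sys.version.split()[0])


def collect_node_info(sc, num_tasks=8):
    """Run node_info in num_tasks Spark tasks and return the distinct results.

    sc is assumed to be an existing pyspark.SparkContext (an assumption of
    this sketch, not something defined here).
    """
    rdd = sc.parallelize(range(num_tasks), num_tasks)
    return rdd.map(node_info).distinct().collect()


if __name__ == "__main__":
    # Without a cluster, this only reports the local interpreter.
    print(node_info(None))
```

On a two-node cluster, collect_node_info(sc) would be expected to return one entry per worker host that actually received a task, which is what makes the Python-side execution visible.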
The experiment shows that each compute node ultimately runs its own Python packages. So how do we use spatial analysis algorithms on different nodes?
On Spark, this is done in a plug-in fashion:

As long as the same Python packages are installed on every node, this works. The key point is to configure the system Python, because PySpark calls the system Python by default.
Let's do another experiment:
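One way to check this requirement before running a real job is to ask every node whether a package is importable. The sketch below is only an illustration: package_status and check_cluster are hypothetical helper names, sc is assumed to be an existing SparkContext, and only top-level package names are supported.

```python
import importlib.util
import socket


def package_status(names):
    """Return (hostname, {package name: importable?}) for top-level names."""
    found = {name: importlib.util.find_spec(name) is not None for name in names}
    return (socket.gethostname(), found)


def check_cluster(sc, names, num_tasks=8):
    """Run package_status inside Spark tasks and keep one report per host.

    sc is assumed to be an existing pyspark.SparkContext.
    """
    tasks = sc.parallelize(range(num_tasks), num_tasks)
    reports = tasks.map(lambda _: package_status(names)).collect()
    return dict(reports)  # later reports from the same host overwrite earlier ones
```

If the workers should use an interpreter other than the system one, Spark also reads the PYSPARK_PYTHON environment variable to decide which Python executable to launch.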

Then run an example in PySpark:
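A job of this kind might be sketched as below. This is an assumption-laden illustration rather than the exact code that was run: geohash_on_node and run_job are hypothetical names, sc is assumed to be an existing SparkContext, and the function deliberately returns None instead of raising when pygeohash is missing, so that the affected host shows up in the results.

```python
import socket


def geohash_on_node(point, precision=6):
    """Encode a (lat, lon) pair with pygeohash on whichever node runs the task.

    Returns (hostname, geohash), or (hostname, None) if pygeohash is not
    installed for the Python interpreter on that node.
    """
    host = socket.gethostname()
    try:
        import pygeohash  # imported here so the lookup happens on the worker
    except ImportError:
        return (host, None)
    lat, lon = point
    return (host, pygeohash.encode(lat, lon, precision=precision))


def run_job(sc, points):
    """sc is an existing SparkContext; points is a list of (lat, lon) pairs."""
    rdd = sc.parallelize(points, max(1, len(points)))
    return rdd.map(geohash_on_node).collect()
```

Without the try/except, a worker that lacks the package raises an import error inside the task, and Spark reports it in the task log.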

There are two nodes, so why did everything execute on one of them? Take a look at the debug log:

It turns out that node 153 threw an exception saying it could not find the pygeohash package.
Next, I installed the pygeohash package on node 153:

Then run the example above again:

Finally, let's run an example that uses the gdal spatial algorithm interface:
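As a rough sketch only, a GDAL-based job could look like the following. The choice of operation (buffering points and measuring the buffer area) and the names point_wkt, buffer_area and run_job are hypothetical; sc is assumed to be an existing SparkContext, and every worker is assumed to have the GDAL Python bindings (the osgeo package) installed.

```python
def point_wkt(lon, lat):
    """Build a WKT point string from a longitude/latitude pair."""
    return f"POINT ({lon} {lat})"


def buffer_area(wkt, distance):
    """Buffer a WKT geometry with GDAL/OGR and return the buffer's area.

    The import sits inside the function so that it is resolved on the
    worker, which therefore needs the GDAL Python bindings installed.
    """
    from osgeo import ogr

    geom = ogr.CreateGeometryFromWkt(wkt)
    return geom.Buffer(distance).GetArea()


def run_job(sc, lonlat_pairs, distance=0.01):
    """sc is an existing SparkContext; lonlat_pairs is a list of (lon, lat)."""
    wkts = sc.parallelize([point_wkt(lon, lat) for lon, lat in lonlat_pairs])
    return wkts.map(lambda w: (w, buffer_area(w, distance))).collect()
```

Passing geometries between nodes as WKT strings keeps the RDD elements plain Python objects, so only the workers need to construct OGR geometry objects.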

To be continued.
The source code can be downloaded from my Gitee or GitHub:
github: https://github.com/allenlu2008/PySparkDemo
gitee: https://gitee.com/godxia/PySparkDemo