Integration of the ONgDB graph database and Spark

2022-07-04 16:40:00 Ma Chao's blog

Quickly explore graph data and graph computation

Graph computation studies things in the objective world and the relationships between them, providing techniques for describing, calculating, and analyzing them. It relies on an underlying graph data model, on top of which calculation and analysis are carried out, and Spark is a very popular, mature, and stable computing engine for this purpose. This article introduces a concrete integration scheme, starting from the integration of ONgDB with Spark. 【Schemes that analyze graph data with deep learning frameworks such as TensorFlow are beyond the scope of this article; integrating Spark on the graph database side is itself a popular solution, and the basic graph computation and pre-training it enables can then be submitted to TensorFlow.】 Downloading the source code of the example project can help newcomers start exploring quickly without stepping on pitfalls. The general process is: first install the graph database connector plugin on the Spark cluster, then use its API to build graph data analysis code.

Install the neo4j-spark plugin on the Spark cluster

  • Download the component
https://github.com/ongdb-contrib/neo4j-spark-connector/releases/tag/2.4.1-M1
  • Copy the downloaded component into the jars folder of the Spark installation directory (connection configuration is sketched after this list), e.g.
E:\software\ongdb-spark\spark-2.4.0-bin-hadoop2.7\jars
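Once the connector jar is on the Spark classpath, the 2.x connector reads the ONgDB bolt connection settings from the Spark configuration (the spark.neo4j.bolt.* keys). A minimal sketch in Scala, assuming a local ONgDB instance and placeholder credentials:

import org.apache.spark.{SparkConf, SparkContext}

// Point the neo4j-spark-connector at a local ONgDB instance.
// The bolt URL, user, and password below are placeholders; adjust to your setup.
val conf = new SparkConf()
  .setAppName("ongdb-spark-example")
  .setMaster("local[*]")
  .set("spark.neo4j.bolt.url", "bolt://localhost:7687")
  .set("spark.neo4j.bolt.user", "ongdb")
  .set("spark.neo4j.bolt.password", "password")

val sc = new SparkContext(conf)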

Base component dependency information

  • Version information
Spark 2.4.0  http://archive.apache.org/dist/spark/spark-2.4.0/
ONgDB 3.5.x
Neo4j-Java-Driver 1.7.5
Scala 2.11
JDK 1.8
hadoop-2.7.7
https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/
neo4j-spark-connector-full-2.4.1-M1 https://github.com/neo4j-contrib/neo4j-spark-connector
  • Installation packages to download
hadoop-2.7.7
spark-2.4.0-bin-hadoop2.7
winutils
neo4j-spark-connector-full-2.4.1-M1 【put the jar into the spark/jars folder】
scala-2.11.12

Create test data

// Create 100 Person nodes and give each one KNOWS relationships to 10 pseudo-random neighbours
UNWIND range(1,100) as id
CREATE (p:Person {id:id}) WITH collect(p) as people
UNWIND people as p1
UNWIND range(1,10) as friend
WITH p1, people[(p1.id + friend) % size(people)] as p2
CREATE (p1)-[:KNOWS {years: abs(p2.id - p1.id)}]->(p2);

// Create a larger data set: one million Person nodes, each with a random outgoing KNOWS relationship
FOREACH (x in range(1,1000000) | CREATE (:Person {name:"name"+x, age: x%100}));
UNWIND range(1,1000000) as x
MATCH (n),(m) WHERE id(n) = x AND id(m)=toInt(rand()*1000000)
CREATE (n)-[:KNOWS]->(m);
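With the test data in place, the connector's Scala API can pull the graph into Spark for analysis. The following is a minimal sketch based on the connector's loadRowRdd / loadGraph API; the query and the PageRank parameters are illustrative, and sc is the SparkContext configured earlier (or the one provided by spark-shell):

import org.apache.spark.graphx.Graph
import org.apache.spark.graphx.lib.PageRank
import org.neo4j.spark._

val neo = Neo4j(sc)

// Sanity check: count the Person nodes created by the test script
val rowRdd = neo.cypher("MATCH (n:Person) RETURN id(n) as id").loadRowRdd
println(rowRdd.count())

// Load the KNOWS topology as a GraphX graph; {_skip}/{_limit} let the
// connector paginate the query across partitions
val graphQuery = "MATCH (n:Person)-[r:KNOWS]->(m:Person) " +
  "RETURN id(n) as source, id(m) as target, type(r) as value " +
  "SKIP {_skip} LIMIT {_limit}"
val graph: Graph[Long, String] = neo.rels(graphQuery).partitions(7).batch(200).loadGraph

// Run five PageRank iterations and print the ten highest-ranked node ids
val ranked = PageRank.run(graph, 5)
ranked.vertices.sortBy(_._2, ascending = false).take(10).foreach(println)

Once the data is in a GraphX Graph, Spark's other built-in graph algorithms can be applied in the same way.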

Remarks

  • Example project 【to avoid pitfalls, you can refer to this Java-Scala mixed example project】
https://github.com/ongdb-contrib/ongdb-spark-java-scala-example

If there is a problem downloading dependency packages, check whether the Spark-dependent JAR packages can be downloaded normally from the following site (an sbt sketch follows below):

http://dl.bintray.com/spark-packages/maven
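If the example project resolves the connector from that repository, the sbt configuration might look like the sketch below; the artifact coordinates are assumed from the connector's release naming, so verify them against the repository before use:

// build.sbt (assumed coordinates; verify against the repository)
resolvers += "Spark Packages Repo" at "http://dl.bintray.com/spark-packages/maven"
libraryDependencies += "neo4j-contrib" % "neo4j-spark-connector" % "2.4.1-M1"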
  • Screenshot of the example project 【start Spark locally before use】
  • For the installation of related components and other references, please read the original post

Copyright notice
This article was created by [Ma Chao's blog]; please include a link to the original when reposting. Thanks.
https://yzsam.com/2022/185/202207041452570781.html