
Parallel one-degree relationship query

2022-06-13 03:19:00 Tnoy. Ma



     Parallel queries can significantly improve query performance on large data volumes. Combining Cypher with stored procedures makes many practical queries possible.

1. Query requirement

    Let A=[A1,A2,A3,…,An] and B=[B1,B2,B3,…,Bm] be two collections of nodes. For every element of A and every element of B, we need to check whether the two nodes are connected by a one-degree (direct) relationship, and return the related entity pairs.
 [Figure: the parallel one-degree relationship query problem]
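The requirement amounts to testing every combination of an A element with a B element, n × m candidate pairs in total. A minimal sketch of that enumeration, using placeholder names that are assumptions for illustration only:

```cypher
// Placeholder lists (hypothetical values): 3 elements in a, 2 in b
WITH ['A1','A2','A3'] AS a, ['B1','B2'] AS b
// Two consecutive UNWINDs produce the Cartesian product of the lists
UNWIND a AS ale
UNWIND b AS ble
// One row per candidate pair: 3 x 2 = 6 rows
RETURN ale, ble
```

Each returned row is one pair that still has to be checked for a direct relationship.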

2. Write a basic query

     This query checks whether each element of A has a one-degree relationship with each element of B, which covers the basic requirement. It executes sequentially and cannot run in parallel.
     The upper half of the query defines the sets a and b. The two UNWIND clauses form the Cartesian product of the two lists and pass each pair to the lower half of the query, the apoc.cypher.run part. Inside apoc.cypher.run, a query tests whether the two nodes have a one-degree relationship. When no relationship exists, the pair produces no row, so nothing is passed further down the query. The RETURN clause returns the start and end nodes of the relationship.

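// Upper half: define the two name lists and pair every element of a with every element of b
WITH
	['Lilly Wachowski','Carrie-Anne Moss','Laurence Fishburne'] AS a,
	['Taylor Hackford','Al Pacino','Charlize Theron'] AS b
UNWIND a AS ale
UNWIND b AS ble
WITH ale, ble
// Lower half: for each pair, look for one direct relationship in either direction
CALL apoc.cypher.run(
	'MATCH (a:Person)-[r]-(b:Person) WHERE a.name=$ale AND b.name=$ble RETURN r LIMIT 1',
	{ale:ale, ble:ble}
	)
	YIELD value
WITH value.r AS r
// Pairs without a relationship yield no row, so only related pairs reach RETURN
RETURN STARTNODE(r) AS sNode, ENDNODE(r) AS eNode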
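
To try the query on a small graph, a direct relationship between two of the listed people is needed. The sketch below creates one. The KNOWS relationship type is a hypothetical choice for illustration and does not come from the article.

```cypher
// Hypothetical test data: KNOWS is an assumed relationship type
MERGE (a:Person {name:'Carrie-Anne Moss'})
MERGE (b:Person {name:'Al Pacino'})
// Direct (one-degree) relationship between an element of a and an element of b
MERGE (a)-[:KNOWS]->(b)
```

With this relationship in place, and assuming no other direct relationships exist between the listed people, the basic query should return a single row whose sNode is Carrie-Anne Moss and whose eNode is Al Pacino.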

3. Optimize the query using parallelism

     Building on the query from section 2, use apoc.cypher.parallel2 to run it in parallel. Multi-degree relationship queries can be parallelized in the same way. By default, the maximum parallelism is the number of CPU cores x 100. For example, if the database is assigned 4 cores, the maximum number of parallel tasks is 400. Batch queries run through this statement should see a performance improvement of at least 50%.

// Arguments: the query fragment, its parameters, and the key of the list parameter to parallelize on ('a')
// Note: -[r]-()-[*..3]- is a multi-degree pattern (2 to 4 hops); r is the first relationship on the path
CALL apoc.cypher.parallel2(
  'WITH $a AS a,$b AS b UNWIND a AS ale UNWIND b AS ble WITH ale,ble CALL apoc.cypher.run( \'MATCH (a:Person)-[r]-()-[*..3]-(b:Person) WHERE a.name=$ale AND b.name=$ble RETURN r LIMIT 1\', {ale:ale,ble:ble} ) YIELD value WITH value.r AS r RETURN STARTNODE(r) AS sNode,ENDNODE(r) AS eNode',
  {a:['Lilly Wachowski','Carrie-Anne Moss','Laurence Fishburne'],b:['Taylor Hackford','Al Pacino','Charlize Theron']},
  'a'
)
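
The call above matches a variable-length pattern. For the strictly one-degree case, a sketch that reuses the pattern from the basic query inside the same apoc.cypher.parallel2 call might look like this. It is an adaptation for illustration, written with $-style parameters, and has not been checked against a specific APOC version:

```cypher
// Same call shape as above; only the inner MATCH pattern is reduced to a single hop
CALL apoc.cypher.parallel2(
  'WITH $a AS a,$b AS b UNWIND a AS ale UNWIND b AS ble WITH ale,ble CALL apoc.cypher.run( \'MATCH (a:Person)-[r]-(b:Person) WHERE a.name=$ale AND b.name=$ble RETURN r LIMIT 1\', {ale:ale,ble:ble} ) YIELD value WITH value.r AS r RETURN STARTNODE(r) AS sNode,ENDNODE(r) AS eNode',
  {a:['Lilly Wachowski','Carrie-Anne Moss','Laurence Fishburne'],b:['Taylor Hackford','Al Pacino','Charlize Theron']},
  'a'
)
```

Because r is now the only relationship on the path, STARTNODE(r) and ENDNODE(r) are exactly the two nodes of the related pair.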

Copyright notice
This article was created by [Tnoy. Ma]. Please include the original link when reposting. Thank you.
https://yzsam.com/2022/02/202202280531435690.html