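I just finished the training at http://www.neo4j.org/learn/online_course and have a few questions about the lab answers.
The first one comes from the Advanced Graph Lab in Lesson 2. (No answer is given, and it is not verified in the graph widget.)
The question is: recommend 3 actors that Keanu Reeves should work with (but hasn't). The hint is that you should basically pick three people who have an ACTED_IN relationship to movies that Keanu has no ACTED_IN relationship to.
The graph has Person nodes and Movie nodes, connected by ACTED_IN and DIRECTED relationships.
I came up with this: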
MATCH (a:Person)-[:ACTED_IN]->(movie:Movie)
WHERE NOT (:Person {name:"Keanu Reeves"})-[:ACTED_IN]->(movie)
RETURN a, count(movie)
ORDER BY count(movie) DESC
LIMIT 3
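But I can't tell whether this really excludes people who shared a movie with him, or only Keanu Reeves himself (the returned actors don't appear in any of Keanu's movies, but they might have been returned anyway).
So far I have found two solutions.
1: Recommend the busiest actors that Keanu Reeves has not acted with.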
MATCH (p:Person)-[:ACTED_IN]->(m)
WHERE p.name <> 'Keanu Reeves'
AND NOT (p)-[:ACTED_IN]->()<-[:ACTED_IN]-(:Person{name:'Keanu Reeves'})
RETURN p.name, count(m) AS rating
ORDER BY count(m) DESC
LIMIT 3;
It produces
p.name | rating
--------------------------
Tom Hanks | 12
Meg Ryan | 5
Cuba Gooding Jr.| 4
2: Recommend the actors who have worked most often with Keanu Reeves's co-stars, but not with Keanu himself.
MATCH (f:Person)-[:ACTED_IN]->(m)<-[:ACTED_IN]-(c:Person),
(k:Person{name:'Keanu Reeves'})
WHERE c.name <> 'Keanu Reeves'
AND (f)-[:ACTED_IN]->()<-[:ACTED_IN]-(k)
AND NOT (c)-[:ACTED_IN]->()<-[:ACTED_IN]-(k)
RETURN c.name, count(c) AS Rating
ORDER BY Rating desc
LIMIT 3;
It produces
c.name | Rating
--------------------------
Danny DeVito | 2
J.T. Walsh | 2
Tom Hanks | 2
I ran into this problem today, and here is what I did:
MATCH (keanu:Person)-[:ACTED_IN]->(movie),
(playedwith:Person)-[:ACTED_IN]->(movie),
(playedwith)-[t:ACTED_IN]->(othermovie),
(other:Person)-[:ACTED_IN]->(othermovie)
WHERE keanu.name = "Keanu Reeves"
AND NOT (other)-[:ACTED_IN]->(movie)
AND NOT (keanu)-[:ACTED_IN]->(othermovie)
RETURN other.name
,collect(DISTINCT othermovie)
,collect(DISTINCT playedwith)
,count(DISTINCT playedwith)
ORDER BY count(DISTINCT playedwith)desc
LIMIT 3
I don't like it because of all the DISTINCTs, but here is the result:
other.name | collect(DISTINCT othermovie) | collect(DISTINCT playedwith) | count(DISTINCT playedwith)
-----------------------------------------------------------------------------------------------------------------------------
Tom Hanks | ["Cloud Atlas", | ["Hugo Weaving","Charlize Theron"] | 2
| "That Thing You Do"] |
Tom Cruise | ["A Few Good Men"] | ["Jack Nicholson"] | 1
Robin Williams| ["The Birdcage"] | ["Gene Hackman"] | 1
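So I found two different approaches that look good. The first finds the people with the most "ACTED_IN the same movie" paths, where the person at the start of the path is not someone Keanu Reeves has an "ACTED_IN the same movie" relationship with.
The second finds people who have never ACTED_IN a movie with Keanu Reeves, ordered by who has appeared in the most movies.
Of course, the easiest thing would be to create a "WORKED_WITH" relationship between all actors who share a movie and then look for everyone Keanu has not WORKED_WITH, but I guess that defeats the fun of the problem.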
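For what it's worth, the WORKED_WITH idea could look roughly like the sketch below. I have not run it against the course dataset, so treat it as an outline rather than a checked answer. The WORKED_WITH relationship type and the `movies` alias are names I made up for illustration; only Person, Movie, ACTED_IN and the name property come from the course graph.

```cypher
// Step 1 (assumption: you are allowed to write to the graph).
// Materialise one WORKED_WITH relationship per pair of co-stars.
// The undirected MERGE matches an existing relationship in either
// direction, so a pair is not linked twice when it is seen as (b, a).
MATCH (a:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(b:Person)
MERGE (a)-[:WORKED_WITH]-(b);

// Step 2: everyone Keanu has not WORKED_WITH, busiest actors first.
MATCH (k:Person {name: 'Keanu Reeves'})
MATCH (p:Person)-[:ACTED_IN]->(m:Movie)
WHERE p <> k
  AND NOT (k)-[:WORKED_WITH]-(p)
RETURN p.name, count(m) AS movies
ORDER BY movies DESC
LIMIT 3;
```

The trade-off is that WORKED_WITH duplicates information already implied by ACTED_IN, so it has to be rebuilt or kept in sync whenever movies or cast change.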
The first solution is simple and looks accurate:
MATCH (a:Person {name:"Keanu Reeves"})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(b:Person)
WITH collect(b.name) AS FoF
MATCH (c:Person)-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(d:Person)
WHERE not c.name IN FoF AND c.name <> "Keanu Reeves"
RETURN distinct c.name, count(distinct d)
ORDER BY count(distinct d) desc
limit 3
It returns:
c.name | count(distinct d)
-------------------------------
Tom Hanks | 34
Cuba Gooding Jr.| 24
Tom Cruise | 23
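where count(distinct d) is the number of people who have acted in a movie with c.
Edited to add:
After the answers came in, I used their leaner query style to come up with this: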
MATCH (a:Person)-[:ACTED_IN]->()<-[:ACTED_IN]-(b:Person)
WHERE a.name <>'Keanu Reeves'
AND NOT (a)-[:ACTED_IN]->()<-[:ACTED_IN]-(b:Person {name:'Keanu Reeves'})
RETURN a.name, count(Distinct b) AS Rating
ORDER BY Rating DESC
LIMIT 3
It returns the same as above.
Alternatively, I used this to find the people who have worked in the most movies:
MATCH (a:Person {name:"Keanu Reeves"})-[:ACTED_IN]->(:Movie)<-[:ACTED_IN]-(b:Person)
WITH collect(b.name) AS FoF
MATCH (c:Person)-[:ACTED_IN]->(m:Movie)<-[:ACTED_IN]-(d:Person)
WHERE not c.name IN FoF AND c.name <> "Keanu Reeves"
RETURN distinct c.name, count(distinct m)
ORDER BY count(distinct m) desc
limit 3
It returns:
c.name | count(distinct m)
-------------------------------------------
Tom Hanks | 11
Meg Ryan | 5
Cuba Gooding Jr. | 4
where count(distinct m) is the number of movies they have acted in.