Refining results
This guide describes how to refine the results from Cypher® queries by filtering, ordering, comparing, aliasing, and more.
Example dataset
After you create a free Aura instance, use the "Connect" button and select "Query". In the Cypher editor, copy and paste the following Cypher and execute the query:
CREATE (matrix:Movie {title: 'The Matrix', released: 1997})
CREATE (cloudAtlas:Movie {title: 'Cloud Atlas', released: 2012})
CREATE (forrestGump:Movie {title: 'Forrest Gump', released: 1994})
CREATE (larryCrowne:Movie {title: 'Larry Crowne', released: 2011})
CREATE (keanu:Person {name: 'Keanu Reeves', born: 1964})
CREATE (robert:Person {name: 'Robert Zemeckis', born: 1951})
CREATE (tom:Person {name: 'Tom Hanks', born: 1956})
CREATE (tom)-[:ACTED_IN {roles: ['Forrest']}]->(forrestGump)
CREATE (tom)-[:ACTED_IN {roles: ['Zachry']}]->(cloudAtlas)
CREATE (tom)-[:ACTED_IN {roles: ['Larry Crowne']}]->(larryCrowne)
CREATE (tom)-[:DIRECTED]->(larryCrowne)
CREATE (robert)-[:DIRECTED]->(forrestGump)
CREATE (diana:Person {name: "Diana"})
CREATE (melissa:Person {name: "Melissa", twitter: "@melissa"})
CREATE (dan:Person {name: "Dan", twitter: "@dan", yearsExperience: 6})
CREATE (sally:Person {name: "Sally", yearsExperience: 4})
CREATE (john:Person {name: "John", yearsExperience: 5})
CREATE (jennifer:Person {name: "Jennifer", twitter: "@jennifer", yearsExperience: 5})
CREATE (joe:Person {name: "Joe"})
CREATE (mark:Person {name: "Mark", twitter: "@mark"})
CREATE (ann:Person {name: "Ann"})
CREATE (xyz:Company {name: "XYZ"})
CREATE (x:Company {name: "Company X"})
CREATE (a:Company {name: "Company A"})
CREATE (Neo4j:Company {name: "Neo4j"})
CREATE (abc:Company {name: "ABC"})
CREATE (query:Technology {type: "Query Languages"})
CREATE (etl:Technology {type: "Data ETL"})
CREATE (integrations:Technology {type: "Integrations"})
CREATE (graphs:Technology {type: "Graphs"})
CREATE (dev:Technology {type: "Application Development"})
CREATE (java:Technology {type: "Java"})
CREATE (diana)-[:LIKES]->(query)
CREATE (melissa)-[:LIKES]->(query)
CREATE (dan)-[:LIKES]->(etl)<-[:LIKES]-(melissa)
CREATE (xyz)<-[:WORKS_FOR]-(sally)-[:LIKES]->(integrations)<-[:LIKES]-(dan)
CREATE (sally)<-[:IS_FRIENDS_WITH]-(john)-[:LIKES]->(java)
CREATE (john)<-[:IS_FRIENDS_WITH]-(jennifer)-[:LIKES]->(java)
CREATE (john)-[:WORKS_FOR]->(xyz)
CREATE (sally)<-[:IS_FRIENDS_WITH]-(jennifer)-[:IS_FRIENDS_WITH]->(melissa)
CREATE (joe)-[:LIKES]->(query)
CREATE (x)<-[:WORKS_FOR]-(diana)<-[:IS_FRIENDS_WITH]-(joe)-[:IS_FRIENDS_WITH]->(mark)-[:LIKES]->(graphs)<-[:LIKES]-(jennifer)-[:WORKS_FOR {startYear: 2017}]->(Neo4j)
CREATE (ann)<-[:IS_FRIENDS_WITH]-(jennifer)-[:IS_FRIENDS_WITH]->(mark)
CREATE (john)-[:LIKES]->(dev)<-[:LIKES]-(ann)-[:IS_FRIENDS_WITH]->(dan)-[:WORKS_FOR]->(abc)
CREATE (ann)-[:WORKS_FOR]->(abc)
CREATE (a)<-[:WORKS_FOR]-(melissa)-[:LIKES]->(graphs)<-[:LIKES]-(diana)
Filter
You can filter results and return only a subset of data by using the WHERE subclause or the FILTER (available only in Cypher 25) clause.
Property
You can filter results by property. For example, to find which people in the example dataset have Twitter handles, you can use the following query:
MATCH (p:Person)
RETURN p.name, p.twitter
Since you MATCH all Person nodes, they are all returned, regardless of whether they have a twitter property or not.
Out of the 12 Person nodes in the graph, only four (Melissa, Dan, Jennifer, Mark) contain a value for the twitter property.
Since the query matches and returns all Person nodes, you get another eight null results.
In Neo4j, properties only exist (are stored) if they have a value.
A null property value is not stored.
This ensures that only valuable, necessary information is retained for your nodes and relationships.
To return only results that contain a value, you can use the WHERE subclause and specify that it returns only results that are not null:
MATCH (p:Person)
WHERE p.twitter IS NOT NULL
RETURN p.name, p.twitter
Or, alternatively, use FILTER with Cypher 25:
CYPHER 25
MATCH (p:Person)
FILTER p.twitter IS NOT NULL
RETURN p.name, p.twitter
|
If you are using any other Cypher version, you need to add To see a comparison between |
The result is:
| p.name | p.twitter |
|---|---|
|
|
|
|
|
|
|
|
Rows: 4 |
|
Strings and partial values
You can filter results based on parts of the value stored in a node or relationship using specific predicates together with WHERE:
STARTS WITH
The STARTS WITH operator searches for property values that begin with a string you specify, for example, the letter "M":
MATCH (p:Person)
WHERE p.name STARTS WITH 'M'
RETURN p.name;
The result is:
| p.name |
|---|
|
|
Rows: 2 |
CONTAINS
The CONTAINS operator checks if a specified string is part of a property value, for example, if it contains the letter "a":
MATCH (p:Person)
WHERE p.name CONTAINS 'a'
RETURN p.name;
The result is:
| p.name |
|---|
|
|
|
|
|
|
|
Rows: 7 |
ENDS_WITH
The ENDS_WITH operator searches for a specified string at the end of a property value, for example, a name that ends with "n":
MATCH (p:Person)
WHERE p.name ENDS WITH 'n'
RETURN p.name;
The result is:
| p.name |
|---|
|
|
|
Rows: 3 |
Alternatively, you can use regular expressions, without string operators, to find a property value. For example, if you want to find any names that include "Jo" while respecting case-sensitivity (queries are case sensitive by default):
MATCH (p:Person)
WHERE p.name =~ 'Jo.*'
RETURN p.name
| p.name |
|---|
|
|
Rows: 2 |
If you don’t want your query to be case sensitive, you can use toLower() with CONTAINS:
MATCH (p:Person)
WHERE toLower(p.name) CONTAINS "jo"
RETURN p.name
Or, alternatively, use toUpper():
MATCH (p:Person)
WHERE toUpper(p.name) CONTAINS "JO"
RETURN p.name
Both options return all name property values that contain "jo" regardless of the letter case.
IN
You can use the IN operator to test whether a property exists in a given list of values.
For example, if any Person node in the graph has "1", "5" or "6" values for yearsExperience:
MATCH (p:Person)
WHERE p.yearsExperience IN [1, 5, 6]
RETURN p.name, p.yearsExperience
The result shows that no one has 1 year of experience, but three people have 5 or 6:
| p.name | p.yearsExperience |
|---|---|
|
|
|
|
|
|
Rows: 3 |
|
Patterns
For example, to find people who are friends with someone who works for Neo4j, you can filter on whether there exists a [:WORKS_FOR]→(:Company {name:'Neo4j'}) relationship from the p:Person in the friendship:
MATCH (p:Person)-[r:IS_FRIENDS_WITH]->(friend:Person)
WHERE exists((p)-[:WORKS_FOR]->(:Company {name: 'Neo4j'}))
RETURN p, r, friend;
Which gives you the result:
Alternatively, you can also filter the result by excluding certain patterns from the results: For example, if you want to find which of Jennifer’s friends do not work for any company, use the following query:
MATCH (p:Person)-[r:IS_FRIENDS_WITH]->(friend:Person)
WHERE p.name = 'Jennifer'
AND NOT exists((friend)-[:WORKS_FOR]->(:Company))
RETURN friend.name;
You get only "Mark" as the result.
OPTIONAL patterns
Sometimes you want to return results from queries even if they don’t meet all the criteria.
For that, you can use an OPTIONAL MATCH which returns null values where there is no match for the optional clause.
For example, if you search for people whose name starts with a "J" and who optionally work for a company:
MATCH (p:Person)
WHERE p.name STARTS WITH 'J'
OPTIONAL MATCH (p)-[:WORKS_FOR]-(other:Company)
RETURN p.name, other.name;
You get the following result:
| p.name | other.name |
|---|---|
|
|
|
|
|
|
Rows: 3 |
|
Notice that Joe is returned because his name starts with the letter 'J', but his company’s name is null.
This is because he does not have a WORKS_FOR relationship to a COMPANY node.
Since you used OPTIONAL MATCH, his Person node is still returned from the first match, but the optional match is not found, so a null result is returned for other.name.
To see the difference, try running the query without the OPTIONAL in front of the second match.
Joe’s row is no longer returned in the result.
This is because Cypher reads the statement with an AND match, so the person must match the first criterium (name starts with "J") and the second criterium (person works for a company).
More complex patterns
You can query for patterns that have more than one relationship. For example, if you want to know who works for the same company and are also friends:
MATCH (p1:Person)-[:WORKS_FOR]->(company:Company)<-[:WORKS_FOR]-(p2:Person),
(p1)-[:IS_FRIENDS_WITH]->(p2)
RETURN p1.name AS people1, p2.name AS people2, company.name AS company
The result is:
| people1 | people2 | company |
|---|---|---|
|
|
|
|
|
|
Rows: 2 |
||
Note that the query has a comma at the end of the first line, and another pattern is added to MATCH on the next line.
This allows you to chain different patterns together, similar to when you used WHERE exists(<pattern>) previously.
With this structure, you can add multiple different patterns and link them together, allowing you to traverse various pieces of the graph with certain patterns.
For more information on how to query for patterns, see Cypher → Patterns.
Compare values
Equality
You can use comparisons to filter values, starting with equality:
MATCH (m:Movie)
WHERE m.title = 'The Matrix'
RETURN m
The result is:
| m |
|---|
|
Rows: 1 |
Numerical
Another option is the numerical comparison to confirm the existence of values within a list.
The WHERE clause in the following example includes a greater-than comparison to find which movies were released after the year "2000":
MATCH (m:Movie)
WHERE m.released > 2000
RETURN m.title, m.released
The result is:
| m | m.release |
|---|---|
|
|
|
|
Rows: 2 |
|
Cypher has other mathematical operators that allow for other types of numerical comparison. See Cypher → Mathematical operators for the complete list of operators and examples.
Boolean
Cypher’s boolean operators allow you to combine or evaluate logical conditions.
For example, to find who in the graph has directed a movie but didn’t act in it:
MATCH (p:Person)-[:DIRECTED]->(m)
WHERE NOT (p)-[:ACTED_IN]->()
RETURN p, m
The result is:
This is because the query looks for Person nodes that are connected to Movie nodes through the DIRECTED relationship only and not ACTED_IN.
If the ACTED_IN relationship wasn’t a restriction anymore:
MATCH (p:Person)-[:DIRECTED]->(m)
RETURN p, m
Then the result includes Tom Hanks and the movie "Larry Crowne" because Tom Hanks both directed and acted in the movie:
See Cypher → Boolean operators for the complete list of operators and examples.
Range of values
You can query data within a certain range in number or date, for example, if you want to find events within a certain timeline, age values, etc.
If you want to know who has experience within the range of three to seven years:
MATCH (p:Person)
WHERE 3 <= p.yearsExperience <= 7
RETURN p
The result is:
Aliasing results
By default, Cypher returns a label as the name of the column in the tabular results.
However, you can use an alias to make the results more understandable by using AS.
For example, if you want to have the tabular result with a column titled name instead of the label p.name:
MATCH (p:Person)
RETURN p.name AS name
The result is:
| name |
|---|
|
|
|
|
|
|
|
|
|
|
|
|
Rows: 12 |
Avoiding duplication
For example, to find which people are connected through a IS_FRIENDS_WITH relationship:
MATCH (p:Person)-[:IS_FRIENDS_WITH]-(friend:Person)
RETURN friend.name;
The result is:
| friend.name |
|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rows: 18 |
Note that some names appear more than once.
For example, "Sally" appears twice because she is connected to two people through two IS_FRIENDS_WITH relationships.
To avoid this, add DISTINCT after RETURN:
MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN DISTINCT friend.name;
The result is now:
| friend.name |
|---|
|
|
|
|
|
|
|
Rows: 7 |
Order
If you want to have the results ordered, you can use the clause ORDER BY after the RETURN clause.
For example, to list how many years of experience each person has in descending order:
MATCH (p:Person)
RETURN p.yearsExperience AS yearsExperience
ORDER BY yearsExperience DESC
This is the result:
| yearsExperience |
|---|
|
|
|
|
|
|
|
|
|
|
|
|
Rows: 12 |
Note that the first 8 rows contain null as result.
This means that not all returned Person nodes have the property yearsExperience.
Since null is considered the highest possible value when sorting in descending order, they are returned before the numerical values.
If the order was ascending, null values would appear last.
Optionally, you can avoid returning null values by using WHERE and IS NOT NULL:
MATCH (p:Person)
WHERE p.yearsExperience IS NOT NULL
RETURN p.yearsExperience AS yearsExperience
ORDER BY yearsExperience DESC
Limit
You can use LIMIT when you want to restrict the number of results returned.
For example, the current example graph has twelve Person nodes and if you want to retrieve only three:
MATCH (p:Person)
RETURN p
LIMIT 3
The result is:
| p |
|---|
|
|
|
Rows: 3 |
Since you didn’t use ORDER BY to specify an order to the results, the query retrieves the first three Person nodes that were added to the graph.
To get random results, you can add ORDER BY rand() to the query:
MATCH (p:Person)
RETURN p
ORDER BY rand()
LIMIT 3
Now every time you run this query, you get three random Person nodes.
Aggregate
Cypher provides several aggregating functions that let you calculate a single value from a set of values.
You can use them with the RETURN or the WITH clauses to aggregate results.
| Function | Description |
|---|---|
Returns the average of a set of |
|
Returns a list containing the values returned by an expression. |
|
Returns the number of values or rows. |
|
Returns the maximum value in a set of values. |
|
Returns the minimum value in a set of values. |
|
Returns the percentile of a value over a group using linear interpolation. |
|
Returns the nearest |
|
Returns the standard deviation for the given value over a group for a sample of a population. |
|
Returns the standard deviation for the given value over a group for an entire population. |
|
Returns the sum of a set of INTEGER, FLOAT or DURATION values. |
Refer to Cypher → Aggregating for the complete list of all aggregating functions and examples.
Unwind
If you have a list that you want to inspect or separate the values, Cypher offers the UNWIND function for that.
For example, if you want to see which people in the example graph like "Graphs" and/or "Query Languages":
WITH ['Graphs','Query Languages'] AS likedTech
UNWIND likedTech AS technology
MATCH (p:Person)-[r:LIKES]-(t:Technology {type: technology})
RETURN t.type, p.name AS people;
The result is:
| t.type | people |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rows: 7 |
|
You can use the collect() function to group results together.
If you rewrite the query like this:
WITH ['Graphs','Query Languages'] AS likedTech
UNWIND likedTech AS technology
MATCH (p:Person)-[r:LIKES]-(t:Technology {type: technology})
RETURN t.type, collect(p.name) AS people;
The result looks tidier:
| t.type | people |
|---|---|
|
|
|
|
Rows: 2 |
|
Number of items
If you have a list of elements, you can also return the number of elements in that list using the size() function.
For example, if you want to know how many friends each person in the graph has:
MATCH (p:Person)-[:IS_FRIENDS_WITH]-(friend:Person)
RETURN p.name, size(collect(friend.name)) AS numberOfFriends;
Note that the collect() function is used inside the size() function.
If you didn’t have it, that is, only size(friend.name), what the function would do is to count how many characters the string values of the friend.name properties have.
Since you want to know how many friends each person has, then collect() works as an aggregating function that aggregates all friend.name values and size() returns the amount in number.
The result is:
| p.name | numberOfFriends |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Rows: 9 |
|
Another point to note is that if you add a direction to the relationship:
MATCH (p:Person)-[:IS_FRIENDS_WITH]->(friend:Person)
RETURN p.name, size(collect(friend.name)) AS numberOfFriends;
You get a different result:
| p.name | numberOfFriends |
|---|---|
|
|
|
|
|
|
|
|
Rows:4 |
|
This is because the IS_FRIENDS_WITH relationship is directed.
For example, Jennifer is connected to Mark through an outgoing IS_FRIENDS_WITH relationship from Jennifer to Mark (i.e. (jennifer)-[:IS_FRIENDS_WITH]→(mark)), not (mark)-[:IS_FRIENDS_WITH]→(jennifer).
In practice, the fact that Jennifer and Mark are friends is what matters, so the relationship’s direction is irrelevant. However, when adding this information to the graph, the relationship must have a direction. On the other hand, querying for it doesn’t require direction.
Creating a new pattern with the opposite direction (i.e. (mark)-[:IS_FRIENDS_WITH]→(jennifer)) wouldn’t be incorrect from a syntax perspective, but it’s duplicate information when the direction is irrelevant to answer the question "who is friends with whom?".
This is why a more accurate result is achieved by using a query with an undirected relationship.
Keep learning
Cypher has many other resources for refining query results. If you want to keep learning, see the following pages:
-
Composed queries → See how to use
UNION,WHEN, andNEXTto combine queries, create linear compositions and different query branches. -
SKIP→ Defines from which row to start including the rows in the output. -
Expressions → Operators and expressions that can be used to evaluate values.
-
Functions → Full list of existing Cypher functions.
-
Indexes → Strategies for speeding up data retrieval.