Skip to main content

Joins

Learn how to join data.

A collection can be joined with another collection using Collection.Join or Collection.EquiJoin. A list can be joined with another list using List.Join or List.EquiJoin.

Join vs EquiJoin

An EquiJoin is a special type of join where the matching condition must be an equality, i.e. when the two lists or collections being joined should be matched using an equality. EquiJoins offer better performance and should be used when possible.

Example

For this example, we use two datasets: a CSV file with information on airports, and a JSON file with information on trips.

The CSV file with the dataset on airports looks like:

"AirportID","Name","City","Country","IATA_FAA","ICAO","Latitude","Longitude","Altitude","Timezone","DST","TZ"
1,"Goroka","Goroka","Papua New Guinea","GKA","AYGA",-6.081689,145.391881,5282,10,"U","Pacific/Port_Moresby"
2,"Madang","Madang","Papua New Guinea","MAG","AYMD",-5.207083,145.7887,20,10,"U","Pacific/Port_Moresby"
3,"Mount Hagen","Mount Hagen","Papua New Guinea","HGU","AYMH",-5.826789,144.295861,5388,10,"U","Pacific/Port_Moresby"
4,"Nadzab","Nadzab","Papua New Guinea","LAE","AYNZ",-6.569828,146.726242,239,10,"U","Pacific/Port_Moresby"
...

The JSON file with a information on trips looks like:

[
{ "reason": "Holidays", "origin": "GVA", "destination": "KEF", "dates": ... }
{ "reason": "Work", "origin": "GVA", "destination": "SFO", "dates": ... }
{ "reason": "Holidays", "origin": "GVA", "destination": "LCY", "dates": ... }
]

We want to join both collections using the IATA_FAA field of the CSV file with the destination field of the JSON file. Since we want an equality, we will use an equijoin. The Collection.EquiJoin function receives both collections to join, then a function per collection to obtain the join key, as shown:

let trips = Json.InferAndRead("https://raw-tutorial.s3.eu-west-1.amazonaws.com/trips.json"),
airports = Csv.InferAndRead("https://raw-tutorial.s3.eu-west-1.amazonaws.com/airports.csv")
in Collection.EquiJoin(trips, airports, trip -> trip.destination, airport -> airport.IATA_FAA)

We can also express the same join condition but using Collection.Join instead. The Collection.Join function receives both collections to join, then a function with elements from both collection that defines the join condition, as shown:

let airports = Csv.InferAndRead("https://raw-tutorial.s3.eu-west-1.amazonaws.com/airports.csv"),
trips = Json.InferAndRead("https://raw-tutorial.s3.eu-west-1.amazonaws.com/trips.json")
in Collection.Join(trips, airports, (trip, airport) -> trip.destination == airport.IATA_FAA)
info

If you want to try an example using joins, you can deploy the following endpoint:

Tutorial 3 - Join two collections
Learn how to join data from two datasets.

Sample usage:

/tutorial/3-join