Complex Datasets
Learn how to handle complex data.
Let's assume a JSON document, which contains product costs by country. This document contains nested collections: there is a list of country/products entries, and then within each products field, there is a list of category/cost entries. This is a typical example of nested data, where collections can contain other collections nested inside them.
[
{
"country": "CH",
"products": [ { "category": "Keyboard", "cost": 50 }, { "category": "Keyboard", "cost": 70 }, { "category": "Monitor", "cost": 450 } ]
},
{
"country": "US",
"products": [ { "category": "Keyboard", "cost": 20 }, { "category": "Monitor", "cost": 200 } ]
}
]
Sometimes we need to bring items of a nested collection to the top. This is called unnesting a collection.
If you want to try this example, you can deploy the following endpoint:
Tutorial 9 - Handling nested data in a collection
- Overview
- Code
Sample usage:
/tutorial/9-explode?maxCost=<int>
For instance, to obtain the list products that cost less than 300:
/tutorial/9-explode?maxCost=300
main(maxCost: int = 300) =
let
sales = Json.InferAndRead("https://raw-tutorial.s3.eu-west-1.amazonaws.com/sales.json"),
explodedSales = Collection.Explode(
sales,
sale -> Collection.Filter(sale.products, product -> product.cost <= maxCost)
)
in
Collection.Transform(explodedSales, sale -> {country: sale.country, category: sale.category, cost: sale.cost})
For instance, suppose we want to "move" the fields category
and cost
of the inner collection items to the parent.
Doing so would lead to a collection of flat records, such as:
country | category | cost |
---|---|---|
CH | Keyboard | 50 |
CH | Keyboard | 70 |
CH | Monitor | 450 |
US | Keyboard | 20 |
US | Monitor | 200 |
To accomplish this, use Collection.Explode
for collections, or List.Explode
for list.
These functions receives two arguments: the collection or list to process, and a function with the inner collection to "explode" to the top-level.
For instance, assume the following code:
let items = Json.InferAndRead("https://raw-tutorial.s3.eu-west-1.amazonaws.com/sales.json")
in Collection.Explode(items, i -> i.products)
This leads to the original dataset augmented with the columns of the nested collection (category, cost):
[
{
"country": "CH",
"products": [ { "category": "Keyboard", "cost": 50 }, { "category": "Keyboard", "cost": 70 }, { "category": "Monitor", "cost": 450 } ],
"category": "Keyboard",
"cost": 50
},
{
"country": "CH",
"products": [ { "category": "Keyboard", "cost": 50 }, { "category": "Keyboard", "cost": 70 }, { "category": "Monitor", "cost": 450 } ],
"category": "Keyboard",
"cost": 70
},
{
"country": "CH",
"products": [ { "category": "Keyboard", "cost": 50 }, { "category": "Keyboard", "cost": 70 }, { "category": "Monitor", "cost": 450 } ],
"category": "Monitor",
"cost": 450
},
{
"country": "US",
"products": [ { "category": "Keyboard", "cost": 20 }, { "category": "Monitor", "cost": 200 } ],
"category": "Keyboard",
"cost": 20
},
{
"country": "US",
"products": [ { "category": "Keyboard", "cost": 20 }, { "category": "Monitor", "cost": 200 } ],
"category": "Monitor",
"cost": 200
}
]
Note that category and cost are now part of the top-level collection.
The function passed as an argument to Collection.Explode
can apply arbitary computations.
For instance, in the following example the products inner collections are filtered by item price:
main(maxCost: int) =
let items = Json.InferAndRead("https://raw-tutorial.s3.eu-west-1.amazonaws.com/sales.json"),
unnested = Collection.Explode(items, i -> Collection.Filter(i.products, p -> p.cost <= maxCost))
in Collection.Transform(unnested, i -> { i.country, i.category, i.cost })