
Example: Integrating data across multiple S3 accounts

This example shows how to create an API that merges data from two S3 buckets, each owned by a different AWS account.


If you are not familiar with RAW, we recommend reading our Getting started guide first. To use RAW, you need an account, which you can create and use for free here.

We have been given two sets of credentials for two different S3 buckets, each of which contains JSON files of a specific format.


If you want to try this example, you can deploy the following endpoint:

Read data from S3 buckets across separate AWS accounts
Learn how to serve data live from two S3 buckets across two separate AWS accounts.



For example, here is a file found in the bucket s3://log-server-a.

{
  "creation_date": "2022-04-01",
  "entries": [{"hostname": "host01"}, {"hostname": "host02"}]
}

Here's a file found in the second bucket s3://log-server-b.

{
  "creation_date": "2022-04-03",
  "entries": [{"hostname": "host95"}, {"hostname": "host96"}, {"hostname": "host97"}]
}

We're interested in the content of the entries field of these JSON files. Our goal is to read every JSON file across both buckets and merge their entries lists into a single one.

["host01","host02", ...., "host95","host96","host97"]

The code executed by the REST API works as follows.

The read_logs function computes the list of all hostnames found in a given bucket.

read_logs(path: string, aws_config: record(region: string, accessKey: string, secret: string)) =
    let
        // list all files of the bucket path
        bucket = S3.Build(path, region=aws_config.region, accessKey=aws_config.accessKey, secretKey=aws_config.secret),
        files = Location.Ls(bucket),
        // open each file as JSON (the definition of json_type is not shown in this excerpt)
        contents = List.Transform(files, f -> Json.Read(f, json_type)),
        // `Explode` the entries field
        entries = List.Explode(contents, c -> c.entries)
    in
        // project only the 'hostname' column to obtain the expected list of strings
        List.Transform(entries, e -> e.hostname)
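
To make the data flow of read_logs easier to follow, here is a minimal Python sketch of the same steps (read each file as JSON, flatten the entries lists, keep only the hostname). It is an illustration, not the RAW implementation: the two dictionaries are hypothetical in-memory stand-ins for the buckets, the file names are made up, no S3 access or credentials are involved, and plain list concatenation stands in for the final union.

```python
import json

# Hypothetical in-memory stand-ins for the two buckets: file name -> JSON text.
# The file names are invented for this sketch; the contents are the sample files above.
BUCKET_A = {
    "2022-04-01.json": '{"creation_date": "2022-04-01", '
                       '"entries": [{"hostname": "host01"}, {"hostname": "host02"}]}',
}
BUCKET_B = {
    "2022-04-03.json": '{"creation_date": "2022-04-03", '
                       '"entries": [{"hostname": "host95"}, {"hostname": "host96"}, '
                       '{"hostname": "host97"}]}',
}


def read_logs(bucket):
    """Return every hostname found in the JSON files of one (in-memory) bucket."""
    # open each file as JSON, in file-name order
    contents = [json.loads(text) for _, text in sorted(bucket.items())]
    # "explode": flatten the per-file 'entries' lists into one list of records
    entries = [entry for doc in contents for entry in doc["entries"]]
    # project only the 'hostname' field to obtain a list of strings
    return [entry["hostname"] for entry in entries]


# Merge the results of both buckets (concatenation is an assumption of this sketch).
hostnames = read_logs(BUCKET_A) + read_logs(BUCKET_B)
print(hostnames)  # ['host01', 'host02', 'host95', 'host96', 'host97']
```

In the real endpoint, each call would additionally receive the credentials of the account that owns the bucket, as described next.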

read_logs is called on both s3://log-server-a and s3://log-server-b with the corresponding set of credentials.

let
    awsAccountA = {region: "eu-west-1", accessKey: "<access-key-for-a>", secret: "<secret-for-a>"},
    awsAccountB = {region: "eu-west-1", accessKey: "<access-key-for-b>", secret: "<secret-for-b>"}
// Union the lists returned by `read_logs` for both buckets/accounts.
in List.Union(
    read_logs("s3://log-server-a/*.json", awsAccountA),
    read_logs("s3://log-server-b/*.json", awsAccountB)
)
Never store sensitive information, such as the access keys above, as clear text in the code. Instead, use secrets: key/value pairs that are stored securely outside of the source code. Secrets can be accessed using the built-in function Environment.Secret.
