Accessing data on files

Learn how to access data stored in files.

If you want to try these examples, you can deploy the following endpoints:

API over CSV file
Find airports by location, optionally filtering by country, city and/or IATA code.

Sample usage:

/api/csv[?country=<string>&city=<string>&iata=<string>]

For instance, to ask for airports in Portugal and in the city of Lisbon, use:

/api/csv?country=Portugal&city=Lisbon

API over JSON file
Search medical information on patients, optionally filtering by country, age or diagnosis information.

Sample usage:

/api/json[?country=<string>&minYear=<int>&maxYear=<int>&code=<string>]

The following URL returns the patients born between 1990 and 1995 that were diagnosed with L53.3.

/api/json?minYear=1990&maxYear=1995&code=L53.3

API over XML file
Search for a person by name.

Sample usage:

/api/xml[?name=<person_name>]

For instance, to get the information about bob use:

/api/xml?name=bob
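
As a minimal client-side sketch, the query strings above can be assembled in Python with the standard library. The base URL `https://example.org` and the helper `build_url` are hypothetical and are not part of the endpoints themselves; replace the host with wherever the endpoint is actually deployed.

```python
from urllib.parse import urlencode

# Hypothetical host: an assumption for illustration, not taken from the docs.
BASE_URL = "https://example.org"


def build_url(path, **filters):
    """Return BASE_URL + path, adding only the filters that are not None."""
    params = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(params)
    return f"{BASE_URL}{path}?{query}" if query else f"{BASE_URL}{path}"


# The three sample requests shown above.
csv_url = build_url("/api/csv", country="Portugal", city="Lisbon")
json_url = build_url("/api/json", minYear=1990, maxYear=1995, code="L53.3")
xml_url = build_url("/api/xml", name="bob")
```

Because every filter is optional, omitting all of them yields the bare path, which matches the bracketed `[?...]` notation in the sample usage lines.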

Quick Summary

  • Use the package corresponding to the desired data format. Refer to the CSV, JSON, XML, text files and binary data packages.
  • For CSV, JSON and XML, use InferAndRead when the schema should be detected automatically, or Read when you specify the schema yourself.
  • For locations that have specific properties or credentials, use the corresponding function, such as Http.Get or S3.Build.

Introduction

RAW includes built-in support for reading CSV, JSON, XML, text and binary files.

For instance, the following code reads a CSV file from an HTTP URL:

Csv.InferAndRead("http://example.org/data.csv")

There are specific functions for the various formats supported. These functions follow the same pattern: they receive a location and, when applicable, other format-specific optional arguments (e.g. the delimiter of a CSV file).

The location is specified either as a string with the URL or as the result of another function call, such as Http.Get or S3.Build, when location-specific properties need to be passed. Refer to the Examples below for more details.

Schema Detection

For formats like CSV, JSON and XML, either the system auto-detects the schema of the data or the user specifies it manually as an argument. The functions that auto-detect the schema are named InferAndRead (e.g. Csv.InferAndRead), while the functions that receive the type from the user are named Read (e.g. Csv.Read).

Schema detection occurs before the program executes. The program is therefore slightly slower to run, because a separate execution happens in advance to detect the schema. Moreover, if the location cannot be determined before execution (e.g. the location is not statically defined in the program but is received as an external argument), then auto-detection of the schema cannot be used and the user must specify the type of the data.
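
The trade-off can be illustrated outside RAW with a small Python analogy. This is only a sketch of the general idea (an extra pass over the data to guess column types, versus a single pass with a type supplied by the caller). It is not how RAW implements detection, and the names `infer_schema`, `read_with_schema` and `SAMPLE` are hypothetical.

```python
import csv
import io

# Hypothetical in-memory CSV standing in for a remote file.
SAMPLE = "name,age\nAlice,30\nBob,41\n"


def infer_schema(text):
    """Extra pass over the data: guess int or str for each column.

    Assumes the data has a header and at least one row.
    """
    rows = list(csv.DictReader(io.StringIO(text)))
    schema = {}
    for column in rows[0]:
        all_ints = all(row[column].lstrip("-").isdigit() for row in rows)
        schema[column] = int if all_ints else str
    return schema


def read_with_schema(text, schema):
    """Single pass: convert each field using the caller-supplied column types."""
    return [
        {column: schema[column](value) for column, value in row.items()}
        for row in csv.DictReader(io.StringIO(text))
    ]


# Analogous to the detection step of InferAndRead: needs the data in advance.
inferred = infer_schema(SAMPLE)

# Analogous to Read: the type is given up front, so no detection pass is needed.
explicit = {"name": str, "age": int}
records = read_with_schema(SAMPLE, explicit)
```

The inference function needs access to the data before reading can start, which mirrors why a location that is only known at run time rules out automatic detection.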

Examples

This example reads CSV data from a given URL. The type of the data in the CSV file is not specified, so we use Csv.InferAndRead:

Csv.InferAndRead("http://example.org/data.csv")

If the data type is known or cannot be inferred, we use Csv.Read instead and pass the type explicitly:

Csv.Read("http://example.org/data.csv", type collection(record(name: string, age: int)))

If the URL requires credentials, we must specify them as part of the location. For instance, to read the file from a private S3 bucket we use S3.Build. This function lets the user specify location-specific properties and returns a location, which can then be passed to the read function:

let
    location = S3.Build("s3://my-bucket/data.csv", accessKey = "<AWS ACCESS KEY>", secretKey = "<AWS SECRET KEY>")
in
    Csv.InferAndRead(location)