How to build and host (for free!) an API that shares data from S3 using RAW

In this example, we are going to build an API that reads data from S3 and serves it as a REST API. For this, we are going to use RAW, a platform to quickly build and host APIs. To follow along, you will need an account, which you can create and use for free if you don't have one already.

Let's get started!

If you are familiar with RAW and want to deploy this endpoint on your account, click below:

Parsing logs
Parse a log file using regexes.

Usage:

/sharing-logs?start=<timestamp>&end=<timestamp>

For instance, to ask for logs from the 3rd January 2015 until the 4th January 2015, use:

/sharing-logs?start=2015-01-03T00:00&end=2015-01-04T00:00
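
As a rough illustration of how a client might assemble this URL, here is a minimal Python sketch. The host name is a placeholder (an assumption, not something RAW provides), and the helper name `build_logs_url` is hypothetical. Only the `/sharing-logs` path and the `start` and `end` parameters come from the usage above.

```python
from datetime import datetime
from urllib.parse import urlencode

# Placeholder host (assumption): replace with the base URL of your own deployment.
BASE_URL = "https://example.invalid"


def build_logs_url(start: datetime, end: datetime, base_url: str = BASE_URL) -> str:
    """Return the /sharing-logs URL for the given time window."""
    fmt = "%Y-%m-%dT%H:%M"  # matches the timestamp format shown in the usage example
    # safe=":" keeps the colons readable instead of percent-encoding them.
    query = urlencode(
        {"start": start.strftime(fmt), "end": end.strftime(fmt)},
        safe=":",
    )
    return f"{base_url}/sharing-logs?{query}"


if __name__ == "__main__":
    # Logs from the 3rd January 2015 until the 4th January 2015.
    print(build_logs_url(datetime(2015, 1, 3), datetime(2015, 1, 4)))
```

Building the query string with `urlencode` rather than string concatenation keeps the parameters correctly escaped if other values are added later.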

Create a blank API endpoint in RAW

First, we should create a free account:


Once you log in to RAW, head to the 'Workspace' section as shown below.

Choose the workspace

Then, click on 'Add +' to create a new endpoint and choose a new 'Blank Endpoint' as shown below. You could also choose an existing template, but for this example, let's start blank and write the code ourselves.

Create blank endpoint

Write the endpoint code

Now that you have a blank endpoint, let's start writing some code. Before we start, here's an overview of the RAW Workspace.

Overview of the workspace

Now let's go through the steps:

Step 1: Write the code

In RAW, endpoints are written in Snapi, a simple-to-use programming language created specifically for building APIs. As you will see, writing an endpoint takes very little code.

Let's copy/paste the following code for our endpoint (see figure, as step '1'):

main(start: timestamp, end: timestamp) =
    let
        lines = String.ReadLines(
            "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
        ),
        parsed = Collection.Transform(lines, l ->
            let
                groups = Regex.Groups(l, """(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
                timestamp = Timestamp.Parse(List.Get(groups, 0), "yyyy-M-d\'T\'H:m:s"),
                level = List.Get(groups, 1),
                message = List.Get(groups, 2)
            in
                { timestamp: timestamp, level: level, message: message }
        )
    in
        Collection.Filter(parsed,
            l -> l.timestamp > start and l.timestamp < end
        )

// The following test will run if you press the play button.
main(Timestamp.Build(2015, 1, 4, 0, 0), Timestamp.Build(2015, 1, 5, 0, 0))

Don't worry if you don't follow all the code just yet. This will be explained in detail below.

Step 2: Test the code

Next, let's test the code. Click on the play button (shown in the figure as step '2') and you will have a live preview of the result of calling the last line of the code.

Step 3: Choose the final URL

You can choose the exact path where your API will be hosted. This is shown in the figure as step '3'.

Step 4: Edit the metadata

Optionally, you can edit the metadata (shown in the figure as step '4'). The metadata is important since RAW includes a built-in API Catalog that helps you and your users find API endpoints later.

Step 5: Deploy the endpoint live!

We are almost done. Now click to deploy your endpoint (shown in the figure as step '5').

Congratulations, your API is now published! It will be served right away and visible in the API Catalog as well.

How does the code work?

Let's look closer at how the Snapi code works!

main(start: timestamp, end: timestamp) =
    let
        lines = String.ReadLines(
            "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
        ),
        parsed = Collection.Transform(lines, l ->
            let
                groups = Regex.Groups(l, """(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
                timestamp = Timestamp.Parse(List.Get(groups, 0), "yyyy-M-d\'T\'H:m:s"),
                level = List.Get(groups, 1),
                message = List.Get(groups, 2)
            in
                { timestamp: timestamp, level: level, message: message }
        )
    in
        Collection.Filter(parsed,
            l -> l.timestamp > start and l.timestamp < end
        )

// The following test will run if you press the play button.
main(Timestamp.Build(2015, 1, 4, 0, 0), Timestamp.Build(2015, 1, 5, 0, 0))

  • Line 1 defines the main method. Its arguments become query parameters in the URL call. In this case, the arguments are the start and end timestamps for the logs to be queried from S3.
  • Lines 3-5 define the file that contains the data. It is a log file stored on S3, read line by line.
  • Line 6 applies a transformation to every line of the log file. The transformation matches the line against a regular expression (Line 8) and extracts three groups, which Lines 9-11 assign to separate identifiers; Line 9 also parses the first group into a timestamp. Line 13 builds a new record as the output of the transformation for each row, with the log event timestamp, log level and log message.
  • Lines 16-18 apply a filter so that only rows whose timestamp falls strictly between the start and end values passed by the user as query parameters are returned.
  • Line 21 defines the test to run when the play button is pressed.
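
For readers who are not yet familiar with Snapi, the same parse-and-filter logic can be sketched in Python. This is only an analogy under stated assumptions, not how RAW executes the endpoint. The sample log lines are invented for illustration, the function names are hypothetical, and the sketch skips lines that do not match the pattern, which the Snapi code above does not do explicitly.

```python
import re
from datetime import datetime

# Same regular expression as in the Snapi code: timestamp, level, message.
LOG_PATTERN = re.compile(r"(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)")


def parse_line(line):
    """Turn one log line into a record, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None  # assumption: silently skip malformed lines
    raw_timestamp, level, message = match.groups()
    return {
        "timestamp": datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%S"),
        "level": level,
        "message": message,
    }


def filter_logs(lines, start, end):
    """Keep records whose timestamp is strictly between start and end."""
    records = (parse_line(line) for line in lines)
    return [r for r in records if r is not None and start < r["timestamp"] < end]


# Invented sample data, standing in for the log file stored on S3.
SAMPLE_LINES = [
    "2015-01-03T23:59:59 INFO machine started",
    "2015-01-04T08:15:00 WARN temperature rising",
    "2015-01-05T00:00:00 ERROR shutdown",
]

if __name__ == "__main__":
    for record in filter_logs(SAMPLE_LINES, datetime(2015, 1, 4), datetime(2015, 1, 5)):
        print(record["timestamp"].isoformat(), record["level"], record["message"])
```

Because both comparisons are strict, as in the Snapi filter, a log event stamped exactly at the start or end of the window is excluded.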

Let's improve this API!

Now that you understand the basic concepts, there are many improvements that can be made. Below is a list of pointers:

What's next!

Take a look at other examples, or join us on Discord to learn more!

Ready to try it out? Register for free and start building today!

Otherwise, if you have questions/comments, join us on Discord!