How to build and host (for free!) an API that shares data from S3 using RAW

In this example, we are going to build an API that reads data from S3. This data will then be hosted as a REST API. For this, we are going to be using RAW, a platform to quickly build and host APIs.

Let's get started!

If you are familiar with RAW and want to deploy this endpoint on your account, click below:

Changing output of data depending on user scopes
Parse a log file and show a subset of the data depending on user scopes.

This example illustrates how to create a REST API where the data displayed will depend on the scopes/permissions of the user calling the endpoint.

For this example we will read a log file which has 3 levels of log statements: INFO, WARN and ERROR. The data access rules are:

  • Users without relevant scopes can only see INFO statements.
  • Users with the scope monitoring can see INFO and WARN statements.
  • Users with the scope admin can see all statements.
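
To make the three rules above concrete, here is a minimal Python sketch of the same decision logic. It is only an illustration of the rules, not how RAW evaluates scopes; the function names `visible_levels` and `filter_by_scopes` are hypothetical and do not come from the RAW platform.

```python
# Hypothetical sketch of the access rules; not RAW's implementation.

def visible_levels(scopes):
    """Return the set of log levels a caller with these scopes may see."""
    levels = {"INFO"}                  # every caller sees INFO statements
    if "monitoring" in scopes:
        levels.add("WARN")             # monitoring adds WARN statements
    if "admin" in scopes:
        levels |= {"WARN", "ERROR"}    # admin sees all statements
    return levels


def filter_by_scopes(records, scopes):
    """Keep only the records whose level is visible to the caller."""
    allowed = visible_levels(scopes)
    return [r for r in records if r["level"] in allowed]


# Levels taken from the log excerpt in this example.
sample = [{"level": "WARN"}, {"level": "INFO"}, {"level": "ERROR"}]
print(filter_by_scopes(sample, ["monitoring"]))
```

With the `monitoring` scope only, the ERROR record is dropped and the WARN and INFO records are kept.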

Here's an excerpt of the log file:

2015-01-01T05:54:15 WARN vibration close to treshold, check instrumentation panel ASAP.
2015-01-01T05:54:58 INFO calibration at 100%, checking inner sub-systems.
2015-01-01T05:55:41 ERROR voltage not measured for more than 25 seconds, reboot machine.
2015-01-01T05:56:24 INFO cleaning procedure schedulled soon, performing sub task 111.
2015-01-01T05:57:07 INFO task 155 schedulled soon, preparing next task.

Parsing the file

We can use String.ReadLines and Regex.Groups to extract the timestamp, level and message from each line of text:

parse() =
  let
    lines = String.ReadLines(
      "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
    ),
    parsed = Collection.Transform(lines, l ->
      let
        groups = Regex.Groups(l, """(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
        timestamp = Timestamp.Parse(List.Get(groups, 0), "yyyy-M-d\'T\'H:m:s"),
        level = List.Get(groups, 1),
        message = List.Get(groups, 2)
      in
        {timestamp: timestamp, level: level, message: message}
    )
  in
    parsed

The output of this function looks like this:

[
  {
    "timestamp": "2015-01-01T05:54:15.000",
    "level": "WARN",
    "message": "vibration close to treshold, check instrumentation panel ASAP."
  },
  {
    "timestamp": "2015-01-01T05:54:58.000",
    "level": "INFO",
    "message": "calibration at 100%, checking inner sub-systems."
  },
  {
    "timestamp": "2015-01-01T05:55:41.000",
    "level": "ERROR",
    "message": "voltage not measured for more than 25 seconds, reboot machine."
  },
  {
    "timestamp": "2015-01-01T05:56:24.000",
    "level": "INFO",
    "message": "cleaning procedure schedulled soon, performing sub task 111."
  },
  {
    "timestamp": "2015-01-01T05:57:07.000",
    "level": "INFO",
    "message": "task 155 schedulled soon, preparing next task."
  }
]
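
If Snapi is new to you, it may help to see what the regular expression captures using a more familiar tool. The following Python sketch applies the same pattern to one line of the excerpt. It is an analogy only: `parse_line` is a hypothetical helper, and Python's `re` and `datetime` modules stand in for Regex.Groups and Timestamp.Parse.

```python
import re
from datetime import datetime

# Same pattern as in the Snapi code: timestamp, level, message.
LOG_LINE = re.compile(r"(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)")


def parse_line(line):
    """Split one log line into a record, or return None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    raw_timestamp, level, message = match.groups()
    return {
        "timestamp": datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%S"),
        "level": level,
        "message": message,
    }


record = parse_line(
    "2015-01-01T05:54:58 INFO calibration at 100%, checking inner sub-systems."
)
print(record["level"], "|", record["message"])
```

The three capture groups map one-to-one onto the timestamp, level and message fields of the output shown above.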

Usage

/aws/s3/logs-scopes

Create a blank API endpoint in RAW

Once you log in to RAW, head to the 'Workspace' section as shown below.

Choose the workspace

Then, click on 'Add Endpoint' to create a new endpoint and choose a new 'Snapi' as shown below. You could also choose an existing template, but for this example, let's start blank and write the code ourselves.

Create blank endpoint

Write the endpoint code

Now that you have a blank endpoint, let's start writing some code. Before we start, here's an overview of the RAW Workspace.

Overview of the workspace

Now let's go through the steps:

Step 1: Write the code

In RAW, endpoints are written in Snapi, a simple-to-use programming language created specifically for building APIs. As you will see, an endpoint takes very little code.

Let's copy/paste the following code for our endpoint (see figure, as step '1'):

main(start: timestamp, end: timestamp) =
  let
    lines = String.ReadLines(
      "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
    ),
    parsed = Collection.Transform(lines, l ->
      let
        groups = Regex.Groups(l, """(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
        timestamp = Timestamp.Parse(List.Get(groups, 0), "yyyy-M-d\'T\'H:m:s"),
        level = List.Get(groups, 1),
        message = List.Get(groups, 2)
      in
        { timestamp: timestamp, level: level, message: message }
    )
  in
    Collection.Filter(parsed,
      l -> l.timestamp > start and l.timestamp < end
    )

// The following test will run if you press the play button.
main(Timestamp.Build(2015, 1, 4, 0, 0), Timestamp.Build(2015, 1, 5, 0, 0))

Don't worry if you don't follow all the code just yet. This will be explained in detail below.

Step 2: Test the code

Next, let's test the code. Click on the play button (shown in the figure as step '2') and you will have a live preview of the result of calling the last line of the code.

Step 3: Choose the final URL

You can choose the exact path where your API will be hosted. This is shown in the figure as step '3'.

Step 4: Edit the metadata

Optionally, you can edit the metadata (shown in the figure as step '4'). The metadata is important since RAW includes a built-in API Catalog that helps you and your users find API endpoints later.

Step 5: Deploy the endpoint live!

We are almost done. Now click to deploy your endpoint (shown in the figure as step '5').

Congratulations, your API is now published! It will be served right away and visible in the API Catalog as well.
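
Once the endpoint is live, a client calls it by passing start and end as query parameters. The Python sketch below only builds such a request URL. The host name is a placeholder, the path is the one listed under Usage (yours is whatever you chose in step 3), and the ISO-style timestamp format is an assumption rather than something confirmed by RAW. Any authentication your endpoint requires is not shown.

```python
from urllib.parse import urlencode

# Assumptions: placeholder host, and the path listed under "Usage".
BASE_URL = "https://example.invalid"
ENDPOINT_PATH = "/aws/s3/logs-scopes"


def build_url(start, end):
    """Build the request URL with start and end as query parameters."""
    query = urlencode({"start": start, "end": end})
    return f"{BASE_URL}{ENDPOINT_PATH}?{query}"


# Timestamp format is assumed; check what your endpoint accepts.
url = build_url("2015-01-04T00:00:00", "2015-01-05T00:00:00")
print(url)
```

Note that `urlencode` percent-encodes the colons in the timestamps, so they appear as `%3A` in the final URL.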

How does the code work?

Let's look closer at how the Snapi code works!

main(start: timestamp, end: timestamp) =
  let
    lines = String.ReadLines(
      "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
    ),
    parsed = Collection.Transform(lines, l ->
      let
        groups = Regex.Groups(l, """(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
        timestamp = Timestamp.Parse(List.Get(groups, 0), "yyyy-M-d\'T\'H:m:s"),
        level = List.Get(groups, 1),
        message = List.Get(groups, 2)
      in
        { timestamp: timestamp, level: level, message: message }
    )
  in
    Collection.Filter(parsed,
      l -> l.timestamp > start and l.timestamp < end
    )

// The following test will run if you press the play button.
main(Timestamp.Build(2015, 1, 4, 0, 0), Timestamp.Build(2015, 1, 5, 0, 0))

  • Line 1 defines the main method. Its arguments will become query parameters in the URL call. In this case, the arguments are the start and end timestamps for the logs to be queried from S3.
  • Lines 3-5 define the file that contains the data. It is a log file stored on S3.
  • Line 6 applies a transformation to every line of the log file. Line 8 matches a regular expression against the line and extracts 3 groups, which Lines 9-11 assign to separate identifiers. Line 13 builds the output record for each line, with the log event timestamp, log level and log message.
  • Line 16 applies a filter so that only rows within the given timestamps passed by the user as query parameters are returned.
  • Line 21 defines the test to run when the play button is pressed.
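
One detail of the filter is easy to miss: both comparisons are strict, so a record stamped exactly at start or end is excluded. The Python sketch below mirrors that condition. `in_window` is a hypothetical helper, the first record uses a timestamp from the excerpt, and the other two are made-up sample data.

```python
from datetime import datetime


def in_window(records, start, end):
    """Keep records strictly after `start` and strictly before `end`."""
    return [r for r in records if start < r["timestamp"] < end]


records = [
    {"timestamp": datetime(2015, 1, 1, 5, 54, 15), "level": "WARN"},   # from the excerpt
    {"timestamp": datetime(2015, 1, 4, 0, 0, 0), "level": "INFO"},     # made up: exactly at start
    {"timestamp": datetime(2015, 1, 4, 12, 0, 0), "level": "ERROR"},   # made up: inside the window
]

# Same window as the test call in the Snapi code.
selected = in_window(records, datetime(2015, 1, 4), datetime(2015, 1, 5))
print([r["level"] for r in selected])
```

Only the ERROR record is selected. The lines in the excerpt shown earlier are all dated 2015-01-01, so they fall outside the window used by the test call.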

Let's improve this API!

Now that you understand the basic concepts, there are many improvements that can be made. Below is a list of pointers:

What's next!

Take a look at other examples, or join us in our Community to learn more!
