How to build and host (for free!) an API that shares data from S3 using RAW
In this example, we are going to build an API that reads data from S3 and serves it as a REST API. We will use RAW, a platform to quickly build and host APIs. To follow along, you will need a RAW account, which you can create and use for free if you don't have one already.
Let's get started!
If you are familiar with RAW and want to deploy this endpoint on your account, click below:
Changing the output of data depending on user scopes
This example illustrates how to create a REST API where the data displayed will depend on the scopes/permissions of the user calling the endpoint.
For this example we will read a log file which has 3 levels of log statements: INFO, WARN and ERROR.
The data access rules are:
- Users without relevant scopes can only see INFO statements.
- Users with the scope monitoring can see INFO and WARN statements.
- Users with the scope admin can see all statements.
Here's an excerpt of the log file:
2015-01-01T05:54:15 WARN vibration close to treshold, check instrumentation panel ASAP.
2015-01-01T05:54:58 INFO calibration at 100%, checking inner sub-systems.
2015-01-01T05:55:41 ERROR voltage not measured for more than 25 seconds, reboot machine.
2015-01-01T05:56:24 INFO cleaning procedure schedulled soon, performing sub task 111.
2015-01-01T05:57:07 INFO task 155 schedulled soon, preparing next task.
Parsing the file
We can use String.ReadLines and Regex.Groups to extract the timestamp, level and message from each line of text:
parse() =
    let
        lines = String.ReadLines(
            "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
        ),
        parsed = Collection.Transform(lines, l ->
            let
                groups = Regex.Groups(l,"""(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
                timestamp = Timestamp.Parse(List.Get(groups, 0),"yyyy-M-d\'T\'H:m:s"),
                level = List.Get(groups, 1),
                message = List.Get(groups, 2)
            in
                {timestamp: timestamp, level: level, message: message}
        )
    in
        parsed
The output of this function looks like this:
[
    {
        "timestamp": "2015-01-01T05:54:15.000",
        "level": "WARN",
        "message": "vibration close to treshold, check instrumentation panel ASAP."
    },
    {
        "timestamp": "2015-01-01T05:54:58.000",
        "level": "INFO",
        "message": "calibration at 100%, checking inner sub-systems."
    },
    {
        "timestamp": "2015-01-01T05:55:41.000",
        "level": "ERROR",
        "message": "voltage not measured for more than 25 seconds, reboot machine."
    },
    {
        "timestamp": "2015-01-01T05:56:24.000",
        "level": "INFO",
        "message": "cleaning procedure schedulled soon, performing sub task 111."
    },
    {
        "timestamp": "2015-01-01T05:57:07.000",
        "level": "INFO",
        "message": "task 155 schedulled soon, preparing next task."
    }
]
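For readers more familiar with Python, the same parsing step can be sketched with the standard library. This is only an illustration of the technique, not RAW or Snapi code; the names `LOG_LINE` and `parse_line` are made up for this example, and the regular expression is copied from the Snapi function above.

```python
import re
from datetime import datetime

# Same pattern as the Snapi example: timestamp, level, free-text message.
LOG_LINE = re.compile(r"(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)")


def parse_line(line):
    """Split one log line into a record; raise ValueError if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    raw_timestamp, level, message = match.groups()
    return {
        "timestamp": datetime.strptime(raw_timestamp, "%Y-%m-%dT%H:%M:%S"),
        "level": level,
        "message": message,
    }


# One line taken from the log excerpt shown earlier.
record = parse_line(
    "2015-01-01T05:54:58 INFO calibration at 100%, checking inner sub-systems."
)
print(record["level"], record["timestamp"].isoformat())
```

Unlike the Snapi version, this sketch raises an error on lines that do not match the pattern; how the RAW function behaves on malformed lines is not covered here.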
Usage
/aws/s3/logs-scopes
main() =
    let
        lines = String.ReadLines("s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"),
        parsed = Collection.Transform(
            lines,
            (l) ->
                let
                    groups = Regex.Groups(l, "(\\d+-\\d+-\\d+T\\d+:\\d+:\\d+) (\\w+) (.*)"),
                    timestamp = Timestamp.Parse(List.Get(groups, 0), "yyyy-M-d\'T\'H:m:s"),
                    level = List.Get(groups, 1),
                    message = List.Get(groups, 2)
                in
                    {timestamp: timestamp, level: level, message: message}
        )
    in
        Collection.Filter(
            parsed,
            (x) ->
                if List.Contains(Environment.Scopes(), "admin") then
                    true
                else
                    if List.Contains(Environment.Scopes(), "monitoring") then
                        x.level == "WARN" or x.level == "INFO"
                    else
                        x.level == "INFO"
        )
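The order of the checks in the filter matters: admin is tested first, then monitoring, and everything else falls back to INFO only. As a rough illustration, the same decision can be restated in Python. This is a sketch, not Snapi; `can_see` is a hypothetical helper, and the `scopes` argument stands in for whatever Environment.Scopes() returns for the caller.

```python
def can_see(level, scopes):
    """Mirror the endpoint's filter: broadest scope first, INFO-only fallback."""
    if "admin" in scopes:
        return True
    if "monitoring" in scopes:
        return level in ("WARN", "INFO")
    return level == "INFO"


# Levels of the five lines in the log excerpt, in order.
levels = ["WARN", "INFO", "ERROR", "INFO", "INFO"]

for scopes in ([], ["monitoring"], ["admin"]):
    visible = [level for level in levels if can_see(level, scopes)]
    print(scopes, visible)
```

A caller holding both scopes is treated as an admin, because that branch is evaluated first.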
Create a blank API endpoint in RAW
First, we should create a free account:
Once you log in to RAW, head to the 'Workspace' section as shown below.
Then, click on 'Add +' to create a new endpoint and choose a new 'Blank Endpoint' as shown below. You could also choose an existing template, but for this example, let's start blank and write the code ourselves.
Write the endpoint code
Now that you have a blank endpoint, let's start writing some code. Before we start, here's an overview of the RAW Workspace.
Now let's follow in steps:
Step 1: Write the code
In RAW, endpoints are written in Snapi, a simple-to-use programming language created specifically for building APIs. As you will see, an endpoint takes very little code to write.
Let's copy/paste the following code for our endpoint (see figure, as step '1'):
main(start: timestamp, end: timestamp) =
    let
        lines = String.ReadLines(
            "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
        ),
        parsed = Collection.Transform(lines, l ->
            let
                groups = Regex.Groups(l, """(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
                timestamp = Timestamp.Parse(List.Get(groups, 0),"yyyy-M-d\'T\'H:m:s"),
                level = List.Get(groups, 1),
                message = List.Get(groups, 2)
            in
                { timestamp: timestamp, level: level, message: message }
        )
    in
        Collection.Filter(parsed,
            l -> l.timestamp > start and l.timestamp < end
        )

// The following test will run if you press the play button.
main(Timestamp.Build(2015, 1, 4, 0, 0), Timestamp.Build(2015, 1, 5, 0, 0))
Don't worry if you don't follow all the code just yet. This will be explained in detail below.
Step 2: Test the code
Next, let's test the code. Click on the play button (shown in the figure as step '2') and you will have a live preview of the result of calling the last line of the code.
Step 3: Choose the final URL
You can choose the exact path where your API will be hosted. This is shown in the figure as step '3'.
Step 4: Edit the metadata
Optionally, you can edit the metadata (shown in the figure as step '4'). The metadata is important since RAW includes a built-in API Catalog that helps you and your users find API endpoints later.
Step 5: Deploy the endpoint live!
We are almost done. Now click to deploy your endpoint (shown in the figure as step '5').
Congratulations, your API is now published! It will be served right away and visible in the API Catalog as well.
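Once published, the endpoint is called over HTTP with the arguments of main passed as query parameters. The sketch below only builds such a URL in Python. The host, the path and the ISO 8601 timestamp format are all assumptions made for illustration: use the URL shown in your own workspace, and check the RAW documentation for the timestamp format it accepts.

```python
from datetime import datetime
from urllib.parse import urlencode

# Hypothetical values: replace with the host and path of your own endpoint.
BASE_URL = "https://example.invalid"
ENDPOINT_PATH = "/logs"


def build_url(start, end):
    """Encode the start and end arguments of main() as query parameters."""
    # Assumption: timestamps are sent in ISO 8601 form.
    query = urlencode({"start": start.isoformat(), "end": end.isoformat()})
    return f"{BASE_URL}{ENDPOINT_PATH}?{query}"


url = build_url(datetime(2015, 1, 4), datetime(2015, 1, 5))
print(url)
```

The parameter names start and end come from the signature of main; everything else in the URL depends on your account and the path chosen in step 3.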
How does the code work?
Let's look closer at how the Snapi code works!
main(start: timestamp, end: timestamp) =
    let
        lines = String.ReadLines(
            "s3://raw-tutorial/ipython-demos/predictive-maintenance/machine_logs.log"
        ),
        parsed = Collection.Transform(lines, l ->
            let
                groups = Regex.Groups(l, """(\d+-\d+-\d+T\d+:\d+:\d+) (\w+) (.*)"""),
                timestamp = Timestamp.Parse(List.Get(groups, 0), "yyyy-M-d\'T\'H:m:s"),
                level = List.Get(groups, 1),
                message = List.Get(groups, 2)
            in
                { timestamp: timestamp, level: level, message: message }
        )
    in
        Collection.Filter(parsed,
            l -> l.timestamp > start and l.timestamp < end
        )

// The following test will run if you press the play button.
main(Timestamp.Build(2015, 1, 4, 0, 0), Timestamp.Build(2015, 1, 5, 0, 0))
- Line 1 defines the main method. Its arguments will become query parameters in the URL call. In this case, the arguments are the start and end timestamps for the logs to be queried from S3.
- Lines 3-5 define the file that contains the data. It is a log file stored on S3.
- Line 6 applies a transformation to every row of the data. This transformation applies a regular expression to each line of the log file. It then extracts 3 groups and assigns them to different identifiers in Lines 9-11. Line 13 builds a new record as the output of the transformation for each row, with the log event timestamp, log level and log message.
- Line 16 applies a filter so that only rows whose timestamps fall between the start and end values passed by the user as query parameters are returned.
- The last line defines the test to run when the play button is pressed.
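One detail of the filter is easy to miss: it uses > and <, so a log entry stamped exactly at start or exactly at end is left out. The Python sketch below illustrates that behaviour; it is not Snapi, `in_window` is a made-up helper, and the three records are invented for the example rather than taken from the log file.

```python
from datetime import datetime


def in_window(records, start, end):
    """Keep records strictly inside (start, end), like the filter's > and <."""
    return [r for r in records if start < r["timestamp"] < end]


# Hypothetical records placed on and inside the window boundaries.
records = [
    {"timestamp": datetime(2015, 1, 4, 0, 0, 0), "level": "INFO"},   # exactly at start
    {"timestamp": datetime(2015, 1, 4, 12, 0, 0), "level": "WARN"},  # inside the window
    {"timestamp": datetime(2015, 1, 5, 0, 0, 0), "level": "ERROR"},  # exactly at end
]

kept = in_window(records, datetime(2015, 1, 4), datetime(2015, 1, 5))
print([r["level"] for r in kept])  # prints ['WARN']
```

If boundary entries should be included, the comparison operators would need to become >= and <=.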
Let's improve this API!
Now that you understand the basic concepts, there are many improvements you can make. Below is a list of pointers:
- To secure the endpoint, read this guide.
- To improve performance with caching, read this guide.
- Learn how to invite users.
- To create API keys, read this guide.
- If you prefer, paid plans have access to GitHub integration, which allows teams to build code collaboratively, write test suites, and set up a complete CI/CD flow.
What's next!
Take a look at other examples, or join us on Discord to learn more!