Parsing Machine Logs

This example shows how to transform an unstructured machine log into a form that can be queried.

We use the file machine_log.log, which can be found in the S3 tutorial bucket raw-tutorial.

Then execute the query:

machine_log := READ_LINES("s3://raw-tutorial/machine_log.log");

SELECT * FROM machine_log
PARSE AS r"(\\d+/\\d+/\\d+ \\d+:\\d+:\\d+) (\\w+):? (.*)"
INTO (timestamp: _1, debug_level: _2, message: _3)

Note that:

  • PARSE AS takes a single line of unstructured text and splits it into multiple fields, one per capture group of the regular expression;

  • INTO renames the default field names of the regular expression's groups (_1, _2, etc.) to more meaningful names.
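
To make the two steps above concrete, here is a minimal Python sketch of the same idea: apply the regular expression to each line, then give the numbered groups meaningful names. It is an illustration only, not how the query engine is implemented. The sample lines are invented for the example (they are not taken from machine_log.log), and the helper name parse_line is hypothetical. The pattern is written with single backslashes, assuming the doubled backslashes in the query are string escapes. Lines that do not match are skipped here, which may differ from what the query does.

```python
import re

# Same pattern as in the query, written as a Python raw string.
# Group 1: date and time, group 2: level (optionally followed by ':'),
# group 3: the rest of the line.
LOG_PATTERN = re.compile(r"(\d+/\d+/\d+ \d+:\d+:\d+) (\w+):? (.*)")

# Names playing the role of the INTO clause: _1, _2, _3 in order.
FIELD_NAMES = ("timestamp", "debug_level", "message")


def parse_line(line):
    """Return a dict of named fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.rstrip("\n"))
    if match is None:
        return None
    return dict(zip(FIELD_NAMES, match.groups()))


# Invented sample lines, for illustration only.
sample_lines = [
    "01/02/2020 10:11:12 ERROR: disk failure on node 3",
    "01/02/2020 10:11:13 INFO restart complete",
    "not a log line",
]

# Keep only the lines that matched the pattern.
records = [r for r in map(parse_line, sample_lines) if r is not None]

for record in records:
    print(record)
```

Each resulting record has the keys timestamp, debug_level and message, mirroring the renamed fields in the query.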