First Query

Let's write our first queries in RAW and along the way introduce some basic concepts in the system.

We start by initializing the RAW magic (the RAW client) for Jupyter.

Our first query uses public data stored on S3, available through https.

Note the use of %%rql, which is the RAW magic command to run an RQL query; RQL stands for the RAW Query Language

This query is reading a CSV file, projecting all columns in the file, and returning the first 5 rows.

The READ keyword tells RAW to read data from the given URL. This is common in RAW: data sources are specified in the query.

As the query starts to execute, RAW will infer the format and structure of the data. In this case, the data is a CSV file and RAW makes it available as expected: as a table. The column names are retrieved from the CSV file, which contains a header in the first line.

We can execute more complex queries using regular SQL language features. For example:

Here we see countries with most airports.

With the exception of the READ keyword, this query looks like normal SQL. The READ keyword, however, means users do not have to create tables and load data into them, or discover the schema. This is done automatically by RAW.

For performance, however, RAW will create caches of the data. We will discuss caching in RAW in more details later in this tutorial.

Next: Data Discovery