So far we have run queries using the %%rql magic.

As data analysis gets more complex, it is difficult to express all transformations in a single query.

Views are a mechanism in RAW to store and reference queries you’ve done before.

They are conceptually similar to virtual views in classical databases.

%load_ext raw_magic

To define a view, we use the %%view magic, as:

%%view <view name>


Let’s define our first view, which simply list all the contents of a file.

%%view airports

SELECT * FROM READ("https://raw-tutorial.s3.amazonaws.com/airports.csv")

This creates a view named airports, which consists of reading the CSV file.

To use this view, we can just reference it in a query.


SELECT * FROM airports LIMIT 2
1GorokaGorokaPapua New GuineaGKAAYGA-6.081689145.391881528210.0UPacific/Port_Moresby
2MadangMadangPapua New GuineaMAGAYMD-5.207083145.78872010.0UPacific/Port_Moresby

This query now resembles even more traditional SQL.

In fact, it achieved a form of “virtualization” by hiding the exact location of the original data, and presenting it to the consumer under a name.

If the location of the file were changed, or its query changed to clean some data for instance, the view name would be kept stable.

We can create other views similarly:

%%view airports_per_country

SELECT Country, COUNT(*) AS Number_Airports
FROM READ("https://raw-tutorial.s3.amazonaws.com/airports.csv")
GROUP BY Country
ORDER BY Number_Airports DESC
View "airports_per_country" was replaced

SELECT * FROM airports_per_country ORDER BY Number_Airports DESC LIMIT 5
United States1697

We can also list the views available using a Jupyter magic: