15. Output Formats

We have seen how queries in RAW can read data from multiple input formats.

But RAW can also output data in multiple formats. That is, a RAW query can output data directly in a given format. That’s because the output in RAW is a stream of data.

For this, however, we cannot use the Jupyter Magic, since that is configured to output data in a format specific for Python and Jupyter notebooks. If we use instead the Python CLI API (or the command-line tools), we can see this behaviour.

We start by initializing the RAW Python API:

[2]:
from rawapi import new_raw_client

cli = new_raw_client()

Then let’s do a query and ask for its output in the CSV format. For this we use the RAW Python API query_as method, which receives the query and the output format as arguments.

[12]:
query = """
SELECT *
FROM READ("https://raw-tutorial.s3.amazonaws.com/airports.csv")
LIMIT 10
"""

output_format = "csv"

input_stream = cli.query_as(query, output_format)

print(input_stream.read().decode("utf-8"))
AirportID,Name,City,Country,IATA_FAA,ICAO,Latitude,Longitude,Altitude,Timezone,DST,TZ
1,Goroka,Goroka,Papua New Guinea,GKA,AYGA,-6.081689,145.391881,5282,10.0,U,Pacific/Port_Moresby
2,Madang,Madang,Papua New Guinea,MAG,AYMD,-5.207083,145.7887,20,10.0,U,Pacific/Port_Moresby
3,Mount Hagen,Mount Hagen,Papua New Guinea,HGU,AYMH,-5.826789,144.295861,5388,10.0,U,Pacific/Port_Moresby
4,Nadzab,Nadzab,Papua New Guinea,LAE,AYNZ,-6.569828,146.726242,239,10.0,U,Pacific/Port_Moresby
5,Port Moresby Jacksons Intl,Port Moresby,Papua New Guinea,POM,AYPY,-9.443383,147.22005,146,10.0,U,Pacific/Port_Moresby
6,Wewak Intl,Wewak,Papua New Guinea,WWK,AYWK,-3.583828,143.669186,19,10.0,U,Pacific/Port_Moresby
7,Narsarsuaq,Narssarssuaq,Greenland,UAK,BGBW,61.160517,-45.425978,112,-3.0,E,America/Godthab
8,Nuuk,Godthaab,Greenland,GOH,BGGH,64.190922,-51.678064,283,-3.0,E,America/Godthab
9,Sondre Stromfjord,Sondrestrom,Greenland,SFJ,BGSF,67.016969,-50.689325,165,-3.0,E,America/Godthab
10,Thule Air Base,Thule,Greenland,THU,BGTL,76.531203,-68.703161,251,-4.0,E,America/Thule

Let’s now re-do the same query, requesting a JSON output.

[13]:
query = """
SELECT *
FROM READ("https://raw-tutorial.s3.amazonaws.com/airports.csv")
LIMIT 10
"""

output_format = "json"

input_stream = cli.query_as(query, output_format)

print(input_stream.read().decode("utf-8"))
[{"AirportID":1,"Name":"Goroka","City":"Goroka","Country":"Papua New Guinea","IATA_FAA":"GKA","ICAO":"AYGA","Latitude":-6.081689,"Longitude":145.391881,"Altitude":5282,"Timezone":10.0,"DST":"U","TZ":"Pacific/Port_Moresby"},{"AirportID":2,"Name":"Madang","City":"Madang","Country":"Papua New Guinea","IATA_FAA":"MAG","ICAO":"AYMD","Latitude":-5.207083,"Longitude":145.7887,"Altitude":20,"Timezone":10.0,"DST":"U","TZ":"Pacific/Port_Moresby"},{"AirportID":3,"Name":"Mount Hagen","City":"Mount Hagen","Country":"Papua New Guinea","IATA_FAA":"HGU","ICAO":"AYMH","Latitude":-5.826789,"Longitude":144.295861,"Altitude":5388,"Timezone":10.0,"DST":"U","TZ":"Pacific/Port_Moresby"},{"AirportID":4,"Name":"Nadzab","City":"Nadzab","Country":"Papua New Guinea","IATA_FAA":"LAE","ICAO":"AYNZ","Latitude":-6.569828,"Longitude":146.726242,"Altitude":239,"Timezone":10.0,"DST":"U","TZ":"Pacific/Port_Moresby"},{"AirportID":5,"Name":"Port Moresby Jacksons Intl","City":"Port Moresby","Country":"Papua New Guinea","IATA_FAA":"POM","ICAO":"AYPY","Latitude":-9.443383,"Longitude":147.22005,"Altitude":146,"Timezone":10.0,"DST":"U","TZ":"Pacific/Port_Moresby"},{"AirportID":6,"Name":"Wewak Intl","City":"Wewak","Country":"Papua New Guinea","IATA_FAA":"WWK","ICAO":"AYWK","Latitude":-3.583828,"Longitude":143.669186,"Altitude":19,"Timezone":10.0,"DST":"U","TZ":"Pacific/Port_Moresby"},{"AirportID":7,"Name":"Narsarsuaq","City":"Narssarssuaq","Country":"Greenland","IATA_FAA":"UAK","ICAO":"BGBW","Latitude":61.160517,"Longitude":-45.425978,"Altitude":112,"Timezone":-3.0,"DST":"E","TZ":"America/Godthab"},{"AirportID":8,"Name":"Nuuk","City":"Godthaab","Country":"Greenland","IATA_FAA":"GOH","ICAO":"BGGH","Latitude":64.190922,"Longitude":-51.678064,"Altitude":283,"Timezone":-3.0,"DST":"E","TZ":"America/Godthab"},{"AirportID":9,"Name":"Sondre Stromfjord","City":"Sondrestrom","Country":"Greenland","IATA_FAA":"SFJ","ICAO":"BGSF","Latitude":67.016969,"Longitude":-50.689325,"Altitude":165,"Timezone":-3.0,"DST":"E","TZ":"America/Godthab"},{"AirportID":10,"Name":"Thule Air Base","City":"Thule","Country":"Greenland","IATA_FAA":"THU","ICAO":"BGTL","Latitude":76.531203,"Longitude":-68.703161,"Altitude":251,"Timezone":-4.0,"DST":"E","TZ":"America/Thule"}]

This ability to generate output in multiple formats helps RAW integrate with external tools. The output format is always part of a query request: in the case of Jupyter notebooks, the user does not need to specify this since RAW’s Jupyter client specifies it internally, requesting output in a format convenient for use in Jupyter.