Livebook.dev The Livebook Blog logo

The Livebook Blog

Back to Homepage Subscribe to Updates

Labels

  • All Posts
  • releases
  • features
  • tutorials
  • announcements

Jump to Month

  • February 2023
  • January 2023
  • December 2022
  • October 2022
  • September 2022
  • August 2022
  • July 2022
  • May 2022
  • January 2022
  • April 2021
Powered️ byAnnounceKit

Create yours, for free!

tutorials
5 months ago

How to query and visualize data from Amazon Athena using Livebook

Livebook has built-in integrations with many data sources, including Amazon Athena.

In this blog post, you'll learn how to use Livebook to connect to Amazon Athena, execute a SQL query against it, and visualize the data.

You can also run this tutorial inside your Livebook instance by clicking the button below:

Run in Livebook

Connecting to Amazon Athena using the Database connection Smart cell

To connect to Amazon Athena, you'll need the following info from your AWS account:

  • AWS access key ID
  • AWS secret access key
  • Athena database name
  • S3 bucket to write query results to

Now, let's create an Amazon Athena connection using a Database connection Smart cell. Click the options "Smart > Database connection > Amazon Athena":

Once you've done that, you'll see a Smart cell with input fields to configure your Amazon Athena connection:

Fill in the following fields to configure your connection:

  • Add your AWS access key ID
  • Add your AWS secret access key
  • Add your Athena database name
  • Add your S3 bucket in the "Output Location" field

Now, click the "Evaluate" icon to run that Smart cell. Once you've done that, the Smart cell will configure the connection and assign it to a variable called conn.

Querying Amazon Athena using the SQL Query Smart cell

Before querying Athena, we need to have an Athena table to query from. So, let's create one.

We'll create an Athena table based on a public dataset published on AWS Open Data. We'll use the GHCN-Daily dataset, which contains climate records measured by thousands of climate stations worldwide. Let's create an Athena table called stations with metadata about those climate stations.

To do that, Add a new SQL Query Smart cell by clicking the options "Smart > SQL Query":

Copy and paste the SQL code below to the SQL Query cell:

CREATE EXTERNAL TABLE IF NOT EXISTS default.stations (
  station_id string, 
  latitude double, 
  longitude double, 
  elevation double, 
  name string)
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.RegexSerDe' 
WITH SERDEPROPERTIES ( 
  'input.regex'='([^ ]*) *([^ ]*) *([^ ]*) *([^ ]*) *(.+)$') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  's3://livebook-blog/amazon-athena-integration'
TBLPROPERTIES (
  'typeOfData'='file')

Now, click the "Evaluate" icon to run that Smart cell. Now we have an Athena table to query from.

Add a new SQL Query Smart cell and copy and paste the following SQL query to it:

select * from default.stations order by station_id

Evaluate that cell. It will query your Athena table and assign the results to a result2 variable. You'll see the result of that query in a tabular format like this:

Visualizing geographic coordinates data using the Map Smart cell

Notice that the table we created has each climate station's latitude and longitude information. We can visualize that data with a map visualization using the Map Smart cell. Let's do that.

Add a Map Smart cell by clicking the options "Smart > Map":


Your Map Smart cell will look something like this:

Use your Map Smart cell to configure:

  • the layer name
  • the data source
  • the coordinates format
  • the longitude field
  • the latitude field

Once you've configured the Smart cell, you can evaluate it, and it will build a map for you. It will look something like this:

That's it! Using Livebook Smart cells, you can connect to an Amazon Athena database, execute a SQL query against it and visualize the result.


tutorials
7 months ago

How to query and visualize data from Google BigQuery using Livebook

Querying and visualizing data from a database is a common and recurring task. That's the kind of thing you don't want to repeat yourself, writing the same code repeatedly. That's where Livebook Smart cells come in. It helps you to automate any workflow you want.

Livebook has built-in Smart cells that help you query and visualize data from multiple databases, like PostgreSQL, MySQL, Google BigQuery, AWS Athena, and SQLite.

This article explains how to use Livevook Smart cells to query and visualize data from a Google BigQuery dataset.

If you like video format, here's a video version of this article:

You can also run this tutorial inside your Livebook instance:

Run in Livebook

Connecting to Google BigQuery using the Database connection Smart cell

Before connecting to a Google BigQuery dataset, you need to create a Google Cloud service account and a service account key. Follow the steps in this guide and download the generated service account key JSON file. That file contains the info needed to configure a connection to Google BigQuery.

Now, create a new notebook. Then, let's create a Google BigQuery connection using a Database connection smart cell. Click the options "Smart > Database connection > Google BigQuery":

Once you've done that, you'll see a Smart cell with input fields to configure your Google BigQuery connection:

Configure your Google BigQuery connection by following the instructions inside the Smart cell:

  • Add your Google Cloud Project ID
  • Upload your JSON credentials file

Pro tip: if you have an authenticated gcloud CLI in the machine running your Livebook instance, then Livebook will automatically pick up your Google Cloud credentials and use it for access. No need to upload the JSON file. 😉

Now, click the "Evaluate" icon to run that Smart cell. Once you do that, the Smart cell will configure the connection and assign it to a variable called conn.

Querying Google BigQuery using the SQL Query smart cell

Now, let's use your Google BigQuery connection to query a Google BigQuery public dataset.

Add a new SQL Query smart cell by clicking the options "Smart > SQL Query":

Once done, you can write and execute a SQL query inside that cell. Let's query a public Google BigQuery dataset.

Copy the following query to the cell:

select t.year, t.country_name, t.midyear_population
from bigquery-public-data.census_bureau_international.midyear_population as t
where year < 2022
order by year

And execute the cell. The Smart cell will execute the query and assign its result to a variable called result. It also will show you the result of the query in a table format, like this:

Visualizing data from Google BigQuery using the Chart smart cell

Now we can visualize the result from that query using a Chart smart cell.

Add a new Chart smart cell by clicking the option "Smart > Chart":

Your newly created Chart smart cell will look something like this:

Now, let's use that cell to visualize the results from our query.

The query we wrote returned the population per year per country from the International Census Data, which is published as a Google BigQuery public dataset. We can use that data to visualize how the world population has been growing over the years.

Use your Chart smart cell to configure:

  • the chart's title
  • the chart's width
  • the chart's type
  • the x-axis and its type
  • the y-axis, its type, and aggregate

Once you've configured the cell, you can evaluate it, and it will build a chart for you. It will look something like this:


That's it! Using Livebook Smart cells, you can connect to a Google BigQuery dataset, execute a SQL query and visualize the results with a chart. And since this is quite a common task, Smart cells enable you to do that without writing a single line of code! 

And, if you need more customization in any part of the process, easy peasy. You can easily convert a Smart cell to a code cell and edit the code generated for you. To do that, click in a Smart cell, and then click the "Convert to Code cell" icon:

When you click the "Convert to Code cell" icon, it will transform your Smart cell into a Code cell. You'll be able to see the code that was running behind the Smart cell and edit it as you like: