Connecting Superset and Amazon Athena
Apache Superset is an open-source business intelligence tool that can connect to many different data sources. Amazon Athena is a query engine built on top of Presto that can be used to analyze data stored in S3 with SQL.
This post shows how to use Amazon Athena as a data source in Superset.
Install Superset
If you’re running Windows 10, I recommend installing Superset in Ubuntu on Windows.
Start by installing the prerequisites:
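On Ubuntu this is typically done with apt. A minimal sketch, assuming the package list from Superset's installation guide for Debian/Ubuntu, plus `python3-venv` (my addition) so that a virtual environment can be created afterwards; your distribution or Superset version may need a different set:

```shell
# Refresh the package index, then install build tools and the
# development headers Superset's Python dependencies compile against.
sudo apt-get update
sudo apt-get install -y build-essential libssl-dev libffi-dev \
    python3-dev python3-pip python3-venv libsasl2-dev libldap2-dev
```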
Create a virtual environment for the Superset installation:
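A minimal sketch using Python's built-in `venv` module. The directory name `venv` is an assumption; any name works:

```shell
# Create an isolated Python environment in ./venv (hypothetical name).
python3 -m venv venv

# Activate it for the current shell session.
. venv/bin/activate
```

While the environment is active, `pip` installs packages into `venv` rather than into the system Python, which keeps Superset's dependencies separate from everything else.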
This step may not be needed, but I had to change the versions of the following dependencies like so:
Now install Superset:
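A minimal sketch, assuming the virtual environment is active and a recent release, which is published on PyPI as `apache-superset` (older releases used the package name `superset`):

```shell
# Install Superset and its Python dependencies into the active environment.
pip install apache-superset
```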
To make Superset “talk” to Athena, install PyAthena, which is a Python client for Athena:
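A minimal sketch, assuming the same virtual environment is still active so that Superset can import the driver:

```shell
# Install the Athena client; it also provides the SQLAlchemy
# dialect that Superset uses for awsathena+rest:// connections.
pip install PyAthena
```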
Now initialize the database, create an admin user, and finally set up roles and permissions:
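These three steps can be sketched as follows, assuming a recent Superset release. Older releases created the admin user with `fabmanager create-admin --app superset` instead, so check the documentation for the version you installed:

```shell
# 1. Create (or migrate) Superset's own metadata database.
superset db upgrade

# 2. Create an admin user; the command prompts for username,
#    name, email, and password.
export FLASK_APP=superset
superset fab create-admin

# 3. Create the default roles and permissions.
superset init
```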
Now you’re ready to fire up Superset and go to http://localhost:8088.
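Starting the development server on that port could look like this, assuming a recent Superset release (older releases used `superset runserver` instead):

```shell
# Start Superset's development web server on port 8088.
superset run -p 8088 --with-threads --reload --debugger
```

This server is meant for local use; log in with the admin user created earlier.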
In the Superset web UI, select Databases and create a new database.
In the add database dialog, name your database and add a connection string with appropriate values:
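The connection string follows the SQLAlchemy URI format documented by PyAthena. The sketch below assembles one from placeholder values: the key, secret, region, schema, and bucket are all hypothetical, and the schema name is assumed to be the Athena (Glue) database you want to query. The secret key and the S3 staging directory usually contain characters such as `/`, `+`, and `:`, so they are percent-encoded here with Python's `urllib.parse.quote_plus`:

```shell
# Hypothetical placeholder values; replace with your own.
access_key="AKIAEXAMPLEKEY"
secret_key="abc/def+ghi"
region="us-east-1"
schema_name="default"
staging_dir="s3://my-athena-results/staging/"

# Percent-encode a value so it is safe inside a URI.
urlencode() {
  python3 -c 'import sys, urllib.parse; print(urllib.parse.quote_plus(sys.argv[1]))' "$1"
}

# awsathena+rest://{key}:{secret}@athena.{region}.amazonaws.com:443/{schema}?s3_staging_dir={dir}
ATHENA_URI="awsathena+rest://$(urlencode "$access_key"):$(urlencode "$secret_key")@athena.${region}.amazonaws.com:443/${schema_name}?s3_staging_dir=$(urlencode "$staging_dir")"

echo "$ATHENA_URI"
```

Paste the printed value into the SQLAlchemy URI field of the dialog. The staging directory is the S3 location where Athena writes query results, so the credentials need write access to it.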
Now open SQL Lab -> SQL Editor and you’re ready to query your Glue database in Superset!