Census ACS5 Data Pipeline

In this project I used Python, SQL, and AWS to design and build an ETL pipeline that extracts data from the official Census ACS5 Survey API and visualizes it. The project is an exercise in building data architectures that give easy access to real data, with the goal of surfacing insights that can help improve education outcomes in the US.

Overview

Extract, Transform, Load:

In this phase, I will write an AWS Lambda function that extracts multiple datasets from the API, processes them, and stores them as .csv files on Amazon S3.
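As a rough illustration of this phase, here is a minimal sketch of what such a Lambda function might look like. It is not the project's actual code: the event fields (`year`, `variables`, `bucket`), the S3 key layout, and the example variable `B15003_022E` are all assumptions made for the example. It relies on the Census API returning a JSON array of rows whose first row is the header.

```python
import csv
import io
import json
import urllib.parse
import urllib.request

# Public ACS 5-year endpoint; the year is filled in per request.
BASE_URL = "https://api.census.gov/data/{year}/acs/acs5"


def build_url(year, variables, geography="state:*"):
    """Build the request URL for a list of ACS variables and a geography."""
    query = urllib.parse.urlencode(
        {"get": ",".join(variables), "for": geography}, safe=",:*"
    )
    return BASE_URL.format(year=year) + "?" + query


def rows_to_csv(rows):
    """Convert the API's list-of-lists (header row first) to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


def lambda_handler(event, context):
    """Hypothetical handler: fetch one dataset and write it to S3 as .csv."""
    import boto3  # imported here so the helpers above work without it

    year = event.get("year", 2021)  # assumed event field
    variables = event.get("variables", ["NAME", "B15003_022E"])  # assumed
    bucket = event["bucket"]  # assumed event field

    with urllib.request.urlopen(build_url(year, variables), timeout=30) as resp:
        rows = json.load(resp)

    key = f"acs5/{year}/education.csv"  # assumed key layout
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=rows_to_csv(rows).encode("utf-8")
    )
    return {"bucket": bucket, "key": key, "row_count": len(rows) - 1}
```

Keeping the URL building and CSV conversion in small pure functions makes them easy to test locally, without AWS credentials or network access.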

Dimensional Modeling:

I will then write SQL queries in Amazon Athena to organize the data and create materialized views for each measure and dimension.
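To give a sense of how a dimension could be defined from Python, the sketch below builds a view statement and submits it to Athena with boto3. The view, table, column, database, and output-location names are hypothetical, and the sketch uses a plain `CREATE OR REPLACE VIEW` for simplicity, which is my assumption rather than the exact statement used in this project.

```python
def build_view_sql(view_name, source_table, columns):
    """Return SQL for a view holding the distinct values of the given columns."""
    column_list = ", ".join(columns)
    return (
        f"CREATE OR REPLACE VIEW {view_name} AS "
        f"SELECT DISTINCT {column_list} FROM {source_table}"
    )


def create_view(sql, database, output_location):
    """Submit the statement to Athena and return the query execution id."""
    import boto3  # imported here so build_view_sql works without it

    response = boto3.client("athena").start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


# Hypothetical names, for illustration only.
state_dimension_sql = build_view_sql(
    "dim_state", "acs5_education", ["state", "name"]
)
```

Calling `create_view(state_dimension_sql, "census", "s3://my-athena-results/")` would then register the view, assuming a database and results bucket with those names exist.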

Data Visualization:

Lastly, I will connect Amazon QuickSight to the modeled data and build visualizations from it.

