Python
-
Census ACS5 Data Pipeline
In this project, I used Python, SQL, and AWS to design and build an ETL pipeline that extracts data from the official Census ACS5 Survey API and visualizes it. The project is an exercise in building data architectures that make real data easy to access, so that it can yield insights for improving education outcomes in the US.
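As a rough illustration of the extract, transform, and load steps, here is a minimal sketch and not the project's actual code. It assumes the ACS5 endpoint at `api.census.gov`, which returns a JSON array of rows with the header in the first row. The variable `B15003_022E` (bachelor's degree count), the `acs5` table layout, and all function names are hypothetical choices made for this example, and SQLite stands in for the AWS-hosted database.

```python
import json
import sqlite3
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint pattern for the ACS 5-year dataset.
BASE_URL = "https://api.census.gov/data/{year}/acs/acs5"


def build_url(year, variables, geography="state:*"):
    """Build the request URL for the given variables and geography."""
    query = urlencode(
        {"get": ",".join(["NAME", *variables]), "for": geography},
        safe=",:*",  # keep the separators the API expects unescaped
    )
    return BASE_URL.format(year=year) + "?" + query


def fetch(url):
    """Extract: download the raw JSON payload (requires network access)."""
    with urlopen(url) as response:
        return json.load(response)


def rows_to_records(payload):
    """Transform: turn [header, row, row, ...] into a list of dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]


def load(records, variable, conn):
    """Load: upsert one variable per state into a hypothetical table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS acs5 "
        "(state_fips TEXT PRIMARY KEY, name TEXT, value INTEGER)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO acs5 VALUES (?, ?, ?)",
        [
            (
                r["state"],
                r["NAME"],
                # Values arrive as strings; missing ones are stored as NULL.
                None if r[variable] is None else int(r[variable]),
            )
            for r in records
        ],
    )
    conn.commit()
```

A run would chain the steps, for example `load(rows_to_records(fetch(build_url(2021, ["B15003_022E"]))), "B15003_022E", sqlite3.connect("acs5.db"))`, after which the table can be queried with plain SQL for visualization.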
-
IBM Data Engineering Capstone Project
In this IBM-sponsored project, I assumed the role of a junior data engineer who has recently joined a fictional online e-commerce company named SoftCart. Presented with real-world use cases, I was required to apply a number of industry-standard data engineering solutions.
-
LAB: Watch me use Apache Spark to analyze web search data and load a pre-trained sales forecasting machine learning model
In this lab, I analyzed e-commerce web search data using JupyterLab and PySpark, then loaded a pre-trained sales forecasting model to predict sales for 2023.