# data-ingestion

Here are 54 public repositories matching this topic...

End-to-end data engineering processes for the Nigeria Health Facility Registry (HFR). The project uses Selenium, Pandas, PySpark, PostgreSQL, and Airflow.

  • Updated Sep 1, 2022
  • Python

IDPS-ESCAPE (Intrusion Detection and Prevention Systems for Evading Supply Chain Attacks and Post-compromise Effects), part of the CyFORT project: open-source SOAR system powered by a dedicated ML-based anomaly detection toolbox (ADBox) integrated with open-source software such as Wazuh and Suricata.

  • Updated Jan 24, 2025
  • Python

This project involves analyzing AdventureWorks bike sales data to uncover key insights into sales performance by country, customer segments, and products. The findings informed strategies for targeted marketing, market expansion, promotional timing, and product quality improvements.

  • Updated Aug 24, 2024
  • Python

Develop a real-time data ingestion pipeline using Kafka and Spark. Collect minute-level stock data from Yahoo Finance, ingest it into Kafka, process it with Spark Streaming, and store the results in Cassandra. Orchestrate the workflow using Airflow deployed on Docker.

  • Updated Nov 29, 2024
  • Python

In this project, you will build a full AI pipeline for an image classification task using Convolutional Neural Networks (CNNs). The project covers data ingestion, preprocessing, model training, deployment, and CI/CD integration using GitHub Actions, Docker, and AWS.

  • Updated Oct 1, 2024
  • Python
