vish1108/DataWarehouse_Project

DataWarehouse_Project

Data is one of a company's most valuable assets, and large companies want access to the cleanest data available. Maintaining data at that scale is difficult, so these companies need people who can manage it effectively and put it to use for business needs.

Project Definition: A data warehouse project built with Python, MS-SQL Server, Talend (ETL), and other data warehouse concepts.

Project Diagram

Dataset

For this project, we use the Brazilian E-Commerce Public Dataset by Olist, which is available on Kaggle.
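Once the dataset is downloaded, a first pass usually reads one CSV file and checks where values are missing. The snippet below is a minimal sketch of that idea using only the Python standard library. The column names and the three sample rows are synthetic and only illustrate the shape of an orders file; they are assumptions, not values taken from this project or from the real dataset.

```python
import csv
import io
from collections import Counter

# Synthetic sample rows (assumption): column names and values are illustrative only.
SAMPLE_CSV = """order_id,customer_id,order_status,order_purchase_timestamp
o1,c1,delivered,2018-01-05 10:00:00
o2,c2,,2018-01-06 11:30:00
o3,,shipped,2018-01-07 09:15:00
"""


def load_rows(handle):
    """Read a CSV file object into a list of dicts, one dict per row."""
    return list(csv.DictReader(handle))


def count_missing(rows):
    """Count empty values per column so gaps are visible before cleaning."""
    missing = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() == "":
                missing[column] += 1
    return dict(missing)


rows = load_rows(io.StringIO(SAMPLE_CSV))
missing = count_missing(rows)
print(len(rows), missing)
```

To run the same check against a downloaded file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever the CSV was saved.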

Data Modeling

Data Model Diagram

Power BI Dashboard

Tech Stack

  1. Python
  2. MS-SQL Server
  3. Talend (ETL)
  4. Machine Learning
  5. Power BI

Project Steps

This project involves the following key steps:

  1. Data Loading in Python: Initial data loading and preprocessing using Python.

  2. Data Cleaning: Data is cleaned and prepared for further processing to ensure its quality.

  3. Data Loading in MS-SQL Server: Cleaned data is loaded into MS-SQL Server for storage and analysis.

  4. Creating ETL Jobs: Extract, Transform, Load (ETL) jobs are designed and implemented to facilitate data processing and integration.

  5. Data Modeling: Data modeling techniques are applied to structure and organize data effectively for analysis.

  6. Creating Star Schema: A star schema is designed to optimize query performance and simplify data retrieval.

  7. Building Two Data Marts: Two separate data marts are constructed, one intended for Machine Learning models and the other for Power BI reporting.

  8. Implementing Data Marts: The data marts are put to practical use, with one dedicated to Machine Learning model applications and the other for Power BI, enabling business intelligence reporting.
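The star schema and the two data marts from the steps above can be sketched in a few lines. This is an illustration only, not the project's actual schema: it uses Python's built-in `sqlite3` as a stand-in for MS-SQL Server so that it runs anywhere, and every table, view, and column name (`dim_customer`, `dim_product`, `fact_order_item`, `mart_reporting`, `mart_ml`) as well as the inserted rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MS-SQL Server (assumption).
conn = sqlite3.connect(":memory:")

# Star schema: one fact table referencing two dimension tables.
# All names below are hypothetical and chosen for illustration.
conn.executescript("""
CREATE TABLE dim_customer (
    customer_key INTEGER PRIMARY KEY,
    customer_id  TEXT UNIQUE,
    state        TEXT
);
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,
    product_id  TEXT UNIQUE,
    category    TEXT
);
CREATE TABLE fact_order_item (
    order_id     TEXT,
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    price        REAL
);

-- Reporting mart: revenue aggregated per product category.
CREATE VIEW mart_reporting AS
SELECT p.category, COUNT(*) AS items_sold, SUM(f.price) AS revenue
FROM fact_order_item AS f
JOIN dim_product AS p ON p.product_key = f.product_key
GROUP BY p.category;

-- Machine learning mart: one feature row per customer.
CREATE VIEW mart_ml AS
SELECT c.customer_id, c.state,
       COUNT(DISTINCT f.order_id) AS order_count,
       SUM(f.price) AS total_spent
FROM fact_order_item AS f
JOIN dim_customer AS c ON c.customer_key = f.customer_key
GROUP BY c.customer_id, c.state;
""")

# Synthetic rows (assumption) so the views return something.
conn.executemany("INSERT INTO dim_customer VALUES (?, ?, ?)",
                 [(1, "c1", "SP"), (2, "c2", "RJ")])
conn.executemany("INSERT INTO dim_product VALUES (?, ?, ?)",
                 [(1, "p1", "toys"), (2, "p2", "books")])
conn.executemany("INSERT INTO fact_order_item VALUES (?, ?, ?, ?)",
                 [("o1", 1, 1, 10.0), ("o1", 1, 2, 5.0),
                  ("o2", 2, 1, 20.0), ("o3", 1, 1, 10.0)])

reporting = conn.execute(
    "SELECT * FROM mart_reporting ORDER BY category").fetchall()
ml_features = conn.execute(
    "SELECT * FROM mart_ml ORDER BY customer_id").fetchall()
print(reporting)
print(ml_features)
```

Defining the marts as views over a shared fact table keeps both consumers on the same cleaned data. A real warehouse might instead materialize them as separate tables populated by ETL jobs.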
