Welcome to the dataset documentation for Roamify, our machine learning project designed to provide personalized travel recommendations. This document outlines the datasets we have collected and used for both our Recommender System and our NLP tasks.
Our recommender system leverages official tourism websites from various states and countries to gather detailed and accurate information about popular tourist attractions. Below are the primary sources we have used:
- Official Tourism Websites: Detailed information on attractions in Indian states and union territories, including history, visiting hours, and other relevant details.
  - Incredible India
  - Andhra Pradesh Tourism
  - Karnataka Tourism
  - Kerala Tourism
  - Tamil Nadu Tourism
  - Telangana Tourism
  - Maharashtra Tourism
  - Punjab Tourism
  - Haryana Tourism
  - Jammu & Kashmir Tourism
  - Goa Tourism
  - Gujarat Tourism
  - Rajasthan Tourism
  - Assam Tourism
  - Mizoram Tourism
  - Madhya Pradesh Tourism
  - Jharkhand Tourism
  - Odisha Tourism
  - Uttarakhand Tourism
  - Chhattisgarh Tourism
  - Bihar Tourism
  - Tripura Tourism
  - Meghalaya Tourism
  - Manipur Tourism
  - Nagaland Tourism
  - Arunachal Pradesh Tourism
  - Himachal Pradesh Tourism
  - Delhi Tourism
  - Sikkim Tourism
  - Andaman and Nicobar Tourism
  - Lakshadweep Tourism
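
To illustrate how one entry gathered from these sites might be stored, here is a minimal sketch. The `Attraction` class, its field names, and the `to_json_line` helper are hypothetical, chosen only to mirror the details mentioned above (history, visiting hours); they are not Roamify's actual schema, and the values are placeholders rather than collected data.

```python
from dataclasses import dataclass, asdict
import json


@dataclass
class Attraction:
    """One tourist attraction record (hypothetical schema, for illustration only)."""
    name: str
    state: str
    source: str               # which tourism website the record came from
    history: str = ""         # short historical description, if available
    visiting_hours: str = ""  # free-text hours as published by the source


def to_json_line(record: Attraction) -> str:
    """Serialize a record as one JSON line, keeping non-ASCII text readable."""
    return json.dumps(asdict(record), ensure_ascii=False)


# Placeholder values, not real scraped data.
example = Attraction(
    name="Example Fort",
    state="Rajasthan",
    source="Rajasthan Tourism",
    visiting_hours="09:00-17:00",
)
print(to_json_line(example))
```

Storing one JSON object per line keeps records from different sites in a single, uniform file that is easy to append to and to load later.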
To enrich our travel information database, we scraped the Travel Triangle website. This let us gather details on the best places to visit across various cities, states, and countries, helping keep our recommendations up-to-date and relevant.
- Travel Triangle Website: Detailed travel information for specific cities, states, and countries.
  - India
  - Countries and Regions
  - Europe
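
The scraping step described above can be sketched as follows, using only the Python standard library. This is an illustration under stated assumptions, not the scraper used for Roamify: the `HeadingExtractor` class and `extract_places` function are hypothetical names, the idea that each place name sits in an `<h3>` heading is an assumption about page layout, and `SAMPLE_HTML` is a made-up snippet rather than real Travel Triangle markup.

```python
from html.parser import HTMLParser


class HeadingExtractor(HTMLParser):
    """Collect the text of every <h3> element (assumed to hold a place name)."""

    def __init__(self):
        super().__init__()
        self._in_heading = False
        self._parts = []
        self.places = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self._in_heading = True
            self._parts = []

    def handle_data(self, data):
        if self._in_heading:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if tag == "h3" and self._in_heading:
            # Join the text fragments and collapse runs of whitespace.
            text = " ".join("".join(self._parts).split())
            if text:
                self.places.append(text)
            self._in_heading = False


def extract_places(html: str) -> list[str]:
    """Return the heading texts found in an HTML string, in document order."""
    parser = HeadingExtractor()
    parser.feed(html)
    parser.close()
    return parser.places


# Made-up markup standing in for a downloaded page (assumption).
SAMPLE_HTML = """
<article>
  <h3>1. Example Beach</h3><p>Placeholder description.</p>
  <h3>2. Example   Fort</h3><p>Placeholder description.</p>
</article>
"""

print(extract_places(SAMPLE_HTML))
```

Downloading the live pages is deliberately left out of this sketch. A real scraper would also need to match the site's actual markup and respect its terms of use and `robots.txt`.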