ScraperService

DBFeeder - Scraper

Both the service and its documentation are still under development.

This service extracts information from a URL received from the Crawler and passes it to the DAC via the Event Bus. It listens on the Event Bus queue populated by the Crawler and, for each URL:

  • Fetches the destination URL
  • Navigates through the page
  • Finds the target containers
  • Extracts the structured information
  • Generates the Entity object from the extracted information
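
The steps above can be pictured with a minimal, standard-library-only sketch. It is not the service's actual code: the Entity fields, the ContainerParser class, the data-field attribute convention, and the fetch and scrape helpers are all assumptions made for illustration. Page navigation and publishing to the Event Bus are left out.

```python
from dataclasses import dataclass, field
from html.parser import HTMLParser
from urllib.request import urlopen

# Elements that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


@dataclass
class Entity:
    # Hypothetical shape; the real Entity schema is not described in this README.
    source_url: str
    fields: dict = field(default_factory=dict)


class ContainerParser(HTMLParser):
    """Collects text from elements marked with a data-field attribute
    (an assumed convention) inside containers with a given CSS class."""

    def __init__(self, container_class):
        super().__init__()
        self.container_class = container_class
        self.depth = 0             # nesting depth inside the current container
        self.current_field = None  # name of the field being read, if any
        self.records = []          # one dict per container found

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self.depth:
            self.depth += 1
            self.current_field = attrs.get("data-field")
        elif self.container_class in (attrs.get("class") or "").split():
            self.depth = 1
            self.records.append({})

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth:
            self.depth -= 1
        self.current_field = None

    def handle_data(self, data):
        text = data.strip()
        if self.depth and self.current_field and text:
            self.records[-1][self.current_field] = text


def fetch(url):
    """Fetch the destination URL and return its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape(url, html, container_class):
    """Find the target containers in html and build one Entity per container."""
    parser = ContainerParser(container_class)
    parser.feed(html)
    return [Entity(source_url=url, fields=record) for record in parser.records]
```

In this sketch a caller would run something like scrape(url, fetch(url), "item"). Keeping the fetch separate from the parsing makes the extraction step easy to exercise against saved HTML.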

Design

This service is multi-process: it runs one process for each configuration file present in the configs folder.
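
As a rough illustration of this layout, the sketch below starts one worker process per file found in a configs directory, using Python's multiprocessing module. The find_configs, run_scraper, and start_workers names are hypothetical, and the worker body is a placeholder, since the configuration format is not described here.

```python
import multiprocessing
from pathlib import Path


def find_configs(config_dir="configs"):
    """Return the configuration files in config_dir, sorted by path."""
    directory = Path(config_dir)
    if not directory.is_dir():
        return []
    return sorted(p for p in directory.iterdir() if p.is_file())


def run_scraper(config_path):
    # Placeholder worker: a real one would load the configuration and
    # start consuming URLs from the Event Bus queue.
    print(f"scraper process started for {config_path}")


def start_workers(config_dir="configs"):
    """Start one process per configuration file and return the processes."""
    processes = []
    for path in find_configs(config_dir):
        proc = multiprocessing.Process(
            target=run_scraper, args=(str(path),), name=path.stem
        )
        proc.start()
        processes.append(proc)
    return processes


if __name__ == "__main__":
    for proc in start_workers():
        proc.join()
```

One process per configuration keeps each scraping target isolated, so a crash or a slow site in one worker does not stall the others.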