This project extracts the key takeaways from one or more reports.
Use case: An organization holds annual conferences to deliberate on different issues and issues a report after each one. Reading all of these reports is time-consuming, so this project analyses them to identify the key discussion points of each conference.
The ipynb file contains the code for this project, which relies heavily on the NLTK library for its analysis. The pbix file visualizes the keywords in a WordCloud. The .py files are an attempt to productionize the project: while the ipynb file handles one file at a time, the .py files are intended to provide a UI that accepts multiple files and runs the same analysis on them.
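As a rough illustration of the idea, the sketch below counts frequent terms across several report texts. It is not the code from the ipynb or .py files: it uses only the Python standard library, with a regular expression and a small hard-coded stop-word list standing in for NLTK's tokenizer and stop-word corpus. The names `STOP_WORDS`, `count_terms`, and `top_keywords` are hypothetical and chosen only for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list. The project itself relies
# on NLTK, whose stop-word corpus is far more complete.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "by", "for", "from", "in",
    "is", "it", "of", "on", "or", "that", "the", "this", "to", "was",
    "were", "with",
}


def count_terms(text):
    """Count lowercase alphabetic tokens in one report, skipping stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in STOP_WORDS and len(t) > 2)


def top_keywords(reports, top_n=5):
    """Pool term counts over several report texts and return the top_n terms."""
    total = Counter()
    for text in reports:
        total.update(count_terms(text))
    return total.most_common(top_n)


if __name__ == "__main__":
    sample_reports = [
        "Education for the girl child was the focus of the conference.",
        "The conference discussed education policy and education funding.",
    ]
    for word, count in top_keywords(sample_reports, top_n=3):
        print(word, count)
```

Pooling the counts in a single `Counter` is one simple way to go from one file to many. Keeping a separate `Counter` per report instead would make it possible to compare conferences with each other.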
Further work: The impact of these conferences, if any, could be assessed by comparing the key points discussed against related metrics over a two- to three-year period. For instance, if a 2015 conference centered on education for the girl child, it would be interesting to find out whether girl-child enrollment has increased or new policies have been formulated since. Admittedly, a conference on its own may not be a sufficient basis for such a comparison, but it can form part of a wider study.