Amsterdam University College -- Text Mining and Collective Intelligence -- Fall 2019.
- Hello World: a first notebook to check that everything is working.
- Lab 1: Fundamentals: variables, built-in data types and structures, syntax, flow control.
- Lab 2: Fundamentals: functions, exceptions, classes, I/O.
- Lab 3: More fundamentals: modules, packages, standard library.
- Lab 4_1: Scientific programming: NumPy, matplotlib.
- Lab 4_2: Regular expressions (only for reference).
- Lab 5: NLP pipelines: sentence splitting, tokenizing, stemming and lemmatizing, part-of-speech tagging.
- Lab 6: Web scraping and APIs.
- Lab 7_1: Distributions in texts.
- Lab 7_2: WordNet (only for reference).
- Lab 8: Vector Semantics.
- Lab 9: Intro to ML: linear regression, logistic regression, SGD, Sklearn.
- Lab 10: Word Embeddings: Word2Vec using Gensim.
- Lab 11: Sentiment Analysis.
- Lab 12: Recommender Systems.
- Lab 13: Clustering and Topic Modelling.
See the projects folder for info.
2019/20 project outcomes:
- Jaël Kortekaas, António Mendes, Ludovica Schaerf, Topic Modelling of Song Lyrics.
- Joyce den Hertog, Barbara van Eeghen, Eva Schoonings, Fake News Detection.
- Vera Neplenbroek, Kamiel Fokkink, Baran Işcanli, Machine Translation with Recurrent Neural Networks.
- Rajiv Manichand, Hanabi Ono, Eva Gmelich Meijling, Topic Modeling on Bills of the US Congress.
- Andrew Nelson, Tomas Kehus, Lela Roos, Common Threads among Last Statements from Death Row Offenders.
- Jan Koetsier, Lily Voge, Caoimhe Martin, A Textual Analysis of UK Parliament Debates -- Gaining a Better Understanding of Brexit Processes.
- Jasmijn Bleijlevens, Lanie Preston, Floor Kouwenberg, Fantastic Food Finder : Rating Prediction & Sentiment Analysis on Amazon Food Reviews.
- Clone the repository locally:
git clone https://github.com/Giovanni1085/AUC_TMCI_2019.git
- Get updates (from time to time):
git pull
- Create a conda environment:
conda create -n myenv python=3.7 anaconda
(where myenv is the environment name)
- Activate it:
conda activate myenv
- Install packages (see the requirements.txt file), e.g.:
conda install pandas
- Launch a Jupyter notebook:
jupyter notebook
- More on conda environments
- Conda cheatsheet
- Getting started with Jupyter notebooks
- On using git and GitHub for version control
Alternatively, use Binder (link above).
A more detailed guide to setting up your environment, with multiple options.
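Once the environment is active, a quick check along the following lines can confirm that the main libraries are importable before opening the labs. This is a minimal sketch and not part of the repository: the function name `check_environment` is hypothetical, and the package list is an assumption based on the libraries named in the lab titles plus the pandas example above. The authoritative list remains the requirements.txt file.

```python
import importlib.util
import sys

# Import names assumed from the lab titles (NumPy, matplotlib, Sklearn, Gensim)
# and the pandas install example; requirements.txt is the authoritative list.
PACKAGES = ["numpy", "matplotlib", "pandas", "sklearn", "gensim"]


def check_environment(packages=PACKAGES):
    """Return a dict mapping each import name to True if Python can locate it."""
    return {
        name: importlib.util.find_spec(name) is not None
        for name in packages
    }


if __name__ == "__main__":
    # Print the interpreter version, then one status line per package.
    print("Python", sys.version.split()[0])
    for name, found in check_environment().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

Run it from a terminal or a notebook cell inside the activated environment; any package reported as MISSING can then be installed with conda install as described above.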
- Michael Repplinger, who ran the 2018/19 edition, and Gianluca Lebani, who ran the 2017/18 edition.
- Giovanni Colavizza and Matteo Romanello, Applied Data Analysis course for the Oxford Digital Humanities Summer School
- James Hetherington and Giovanni Colavizza, Research Software Engineering with Python