Merge pull request #13 from pkeilbach/winter-term-kickoff
pkeilbach authored Nov 7, 2023
2 parents f28026c + fdc9faf commit 33b2f51
Showing 18 changed files with 655 additions and 98 deletions.
2 changes: 1 addition & 1 deletion Makefile
@@ -18,7 +18,7 @@ pip: .venv
jupyter: project
.venv/bin/jupyter notebook --no-browser

mkdocs: project
lecture_notes: project
.venv/bin/mkdocs serve

pytest:
18 changes: 0 additions & 18 deletions docs/about/course.md

This file was deleted.

22 changes: 20 additions & 2 deletions docs/about/profile.md → docs/about/course_profile.md
@@ -3,7 +3,7 @@
## Objectives

This course aims to provide students with a foundational understanding of Natural Language Processing (NLP) concepts and techniques.
By the end of this course, students will be equipped to preprocess text data, engineer relevant features, and apply basic machine learning models to text.
By the end of this course, students will be equipped to preprocess text data, engineer relevant features, and apply basic machine learning models to text data.
Additionally, students will gain insights into advanced NLP concepts like large language models and generative AI, along with their practical applications.

## Contents
@@ -20,4 +20,22 @@ Additionally, students will gain insights into advanced NLP concepts like large

## Assessment

Assessment of this course will be based on a written exam.
The course is graded based on a **written 90-minute exam** at the end of the semester.
To be admitted to the exam, you must

- complete all assignments and
- give a presentation on an NLP topic.

The presentation and assignments are ungraded.

## Course Language

All course materials are provided in English.
Lectures will be held in German unless we have international students.
In this case, the lectures will also be in English.

## Course Format

The course will be held in a **hybrid format**.
Approximately five lectures will be held in person.
The remaining lectures will be held online.
33 changes: 33 additions & 0 deletions docs/about/prerequisites.md
@@ -0,0 +1,33 @@
# Prerequisites

The following skills are **recommended** to participate in the course.

1. **Basic programming skills**

    To complete the course, you will need basic programming skills.
    If you have taken an introductory programming course, you should be good to go.
    We don't want to get bogged down in advanced programming concepts, but rather get excited about NLP!
    So don't worry if you have just started programming.

2. **Basic Python skills**

    The code for this lecture is written in Python, so it is definitely an advantage if you have worked with Python before.
    However, if you are coming from a different language, you should be able to follow along.
    I tried to keep the language-specific parts to a minimum and will provide explanations where necessary.

    Microsoft provides a nice [beginner Python course](https://learn.microsoft.com/en-us/training/paths/beginner-python/) that you can take to get up to speed.

3. **Basic knowledge of the Linux command line**

    Since the course is designed for a Linux development environment, some basic familiarity with the Linux command line is recommended.
    However, all required commands are provided in the instructions, so prior command line experience is not strictly necessary.
    On Linux and Mac, the [setup](../getting_started.md) should work out of the box.

    If you are on Windows, it is recommended to use the [Windows Subsystem for Linux (WSL)](https://learn.microsoft.com/en-us/windows/wsl/).
    Native Windows is not supported, but you should still be able to get everything running in an Anaconda environment.

4. **Basic knowledge of Git**

    The course material is hosted on GitHub Pages, and you can access it through the browser.
    For the assignments, you will need to clone the repository and set up the development environment.
    Also, since the course is still in development, you will need to pull the latest changes from time to time.
61 changes: 61 additions & 0 deletions docs/assignments.md
@@ -0,0 +1,61 @@
# Assignments

During the semester, each student has to complete a couple of assignments.
The assignments are ungraded, but they are mandatory to pass the course.

## Assignment Structure

Throughout the course, we will work on a Python package called `htwgnlp`.
The package is located in the `src` directory and is a fully functional and installable Python package.
The core structure will be provided, and the assignments will be about implementing the missing functionality.

To work on an assignment, you will need to locate the `TODO ASSIGNMENT-*` items in the code.
For example, to work on the first assignment, use the search functionality of your IDE to find all relevant items:

```txt
TODO ASSIGNMENT-1
```
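
If you prefer a script over the IDE search, the same lookup can be sketched in a few lines of Python.
This is only an illustration: the helper name `find_todo_items` is hypothetical and not part of `htwgnlp`, and it assumes the `TODO` items live in `.py` files below `src`.

```python
from pathlib import Path


def find_todo_items(root: str, marker: str = "TODO ASSIGNMENT-1") -> list[tuple[str, int, str]]:
    """Return (file, line number, stripped line) for every line under `root` that contains `marker`."""
    hits = []
    # Assumption: the TODO items are located in Python source files.
    for path in sorted(Path(root).rglob("*.py")):
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if marker in line:
                hits.append((str(path), lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # `src` is the package directory named above; run this from the repository root.
    for file, lineno, text in find_todo_items("src"):
        print(f"{file}:{lineno}: {text}")
```

Note that this is a plain substring match, so a marker such as `TODO ASSIGNMENT-1` would also match `TODO ASSIGNMENT-10` if such an item existed.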

!!! tip

    You should check the unit tests located in the `tests` directory to see the exact requirements that need to be implemented.

## Tests

Once you have implemented everything, you can run the tests to check if everything works as expected.

You can run the tests using the `make` commands, for example:

```sh
make assignment_01
```

If all your tests pass, you have successfully completed the assignment! 🚀

!!! tip

    If your IDE provides the functionality, you can also run the tests directly from the IDE.

!!! note

    You can also use the native `pytest` commands, but then you need to know the exact path to the tests:

    ```sh
    # make sure to have the virtual environment activated
    pytest tests/htwgnlp/test_preprocessing.py
    ```

!!! info

    Pytest is a very powerful testing framework and the de-facto standard for testing in Python.
    You will not need to know all the details, but if you want to learn more, check out the [official documentation](https://docs.pytest.org/en/latest/contents.html).
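
For orientation, a pytest test is simply a function whose name starts with `test_` and that uses plain `assert` statements.
The following sketch shows the general shape only; `lowercase_tokens` is a hypothetical stand-in and not a function from the `htwgnlp` package or its test suite.

```python
def lowercase_tokens(text: str) -> list[str]:
    """Hypothetical function under test: split on whitespace and lowercase each token."""
    return [token.lower() for token in text.split()]


def test_lowercase_tokens_splits_and_lowercases():
    # pytest collects functions named `test_*` and reports a failure if an assert is false.
    assert lowercase_tokens("Hello NLP World") == ["hello", "nlp", "world"]


def test_lowercase_tokens_handles_empty_string():
    # Each test states one requirement; here: empty input yields an empty list.
    assert lowercase_tokens("") == []
```

Read this way, each test in the `tests` directory describes one requirement that your implementation has to satisfy.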

## Submitting Assignments

To submit an assignment, you will need to demonstrate a successful test run.
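
How the successful run is to be demonstrated is not specified here, but one way to check it yourself is the exit status: pytest exits with status 0 only when all collected tests pass, and `make` reports a failure when one of its commands fails.
The sketch below is a hedged illustration; `run_succeeded` is a hypothetical helper, and it assumes the `assignment_01` target runs pytest without suppressing its exit status.

```python
import subprocess
import sys


def run_succeeded(command: list[str]) -> bool:
    """Run `command` and report whether it exited with status 0."""
    completed = subprocess.run(command, check=False)
    return completed.returncode == 0


if __name__ == "__main__":
    # Stand-in command so the sketch runs anywhere; in the repository root
    # you would pass ["make", "assignment_01"] instead (assumption, see above).
    ok = run_succeeded([sys.executable, "-c", "raise SystemExit(0)"])
    print("test run successful" if ok else "test run failed")
```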

## Jupyter Notebooks

Some of the assignments are accompanied by Jupyter notebooks.

See the [Getting Started](./getting_started.md) guide for instructions on how to start the Jupyter server.
22 changes: 18 additions & 4 deletions docs/faq.md
@@ -2,9 +2,23 @@

## How is this course graded?

## Is there a script for this course?
The course is graded based on a written 90-minute exam at the end of the semester.
For more details, see the [course profile](./about/course_profile.md#assessment).

## When is an assignment considered completed?

An assignment is considered completed when all of its tests pass.

## How do I get access to the murals?

The link to the murals will be shared through the Moodle course or during the lecture.

In order to access the murals, you will need to [create an account](https://www.mural.co/).

## Is there a PDF version of the script?

The content for this course was created using the [MkDocs](https://www.mkdocs.org/) framework.
There is no separate script provided, but you can convert any page to a PDF file by using the printing function of your browser.
MkDocs will render a nice-looking, script-like PDF file without any menu or other web items.
But please think twice before you print: it may not be necessary 🌍🌿🌈
As the course content is subject to continuous change, there is no PDF version of the script provided.
But if you wish, you can convert any page to a PDF file by using the printing function of your browser.
MkDocs will render a nice-looking PDF file without any menu or other web items.
But please think twice before you print. 🌍🌿🌈
109 changes: 108 additions & 1 deletion docs/getting_started.md
@@ -1,9 +1,116 @@
# Getting Started

This section describes how to set up your environment for this course.

## Accounts

To get the most out of this course, you should have a [GitHub](https://github.com/) and [Mural](https://www.mural.co/) account.
Both services are free to use.

!!! tip

    You can use your HTWG email address to register for GitHub and Mural.
    This will make it easier to identify you as a member of this course.
    Also, you may benefit from several student discounts.

## Install Python

The recommended setup for this course is Python 3.10 in a virtual environment.

=== ":fontawesome-brands-linux: Linux"

    ```sh
    sudo apt update
    sudo apt install python3.10
    sudo apt install python3.10-venv
    ```

    In case this doesn't work, try to add the [deadsnakes PPA](https://launchpad.net/~deadsnakes/+archive/ubuntu/ppa) to your system, and try again.

    ```sh
    sudo add-apt-repository ppa:deadsnakes/ppa
    ```

=== ":fontawesome-brands-apple: Mac"

    On Mac, you can use [Homebrew](https://brew.sh/) to install Python.

    ```sh
    brew install python@3.10
    ```

=== ":fontawesome-brands-windows: Windows"

    On Windows, it is recommended to use the [Windows Subsystem for Linux (WSL)](https://learn.microsoft.com/en-us/windows/wsl/).
    Then you can follow the instructions for Linux.

    There is currently no setup guide for native Windows, but I'm happy to accept a pull request for [this issue](https://github.com/pkeilbach/htwg-practical-nlp/issues/12). 😉

!!! warning

    You are free to use another Python version if you wish, but be aware that this may cause problems with the provided code.
    Also, if you are using Python outside a virtual environment or with a distribution like Anaconda, the described setup may not work.
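
If you are unsure which interpreter you are running, a short check like the following can help.
It is a minimal sketch with hypothetical helper names; it only detects `venv`-style environments (where `sys.prefix` differs from `sys.base_prefix`) and makes no claim about other distributions.

```python
import sys


def is_recommended_python(version_info=sys.version_info) -> bool:
    """True if the interpreter is Python 3.10, the version recommended for this course."""
    return tuple(version_info[:2]) == (3, 10)


def in_virtual_environment() -> bool:
    """True if running inside a venv-style virtual environment."""
    # In a venv, sys.prefix points at the environment, while sys.base_prefix
    # still points at the base installation, so the two differ.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    print("Recommended version (3.10):", is_recommended_python())
    print("Inside a virtual environment:", in_virtual_environment())
```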

## Clone the repository

Make sure you have [Git](https://git-scm.com/) installed on your system.

```sh
git clone https://github.com/pkeilbach/htwg-practical-nlp.git
```

## Execute the Setup Script

The setup script is provided as a `Makefile`.
Change into the repository directory and execute the setup script.
This should create a virtual environment and install all required dependencies.

```sh
cd htwg-practical-nlp
make
```

This may take a few minutes. ☕

If everything went well, you should be good to go.

## Test your Installation

You can test your installation by running the tests for the first assignment.

```sh
make assignment_01
```

In your terminal, you should see 56 failed tests. 😨

But this is exactly what we want to see, since we haven't implemented anything yet! 🤓

## Start the Jupyter Server

Some of the assignments are accompanied by Jupyter notebooks.
You can start the Jupyter server with the following command.

```sh
make jupyter
```

Jupyter is now accessible at <http://localhost:8888/>.

!!! info

    Of course, you can also use JupyterLab if you wish, but this is not included in the setup script.

## Serve the Lecture Notes

If you want, you can bring up the lecture notes on your local machine.

```sh
make lecture_notes
```

The lecture notes are now accessible at <http://localhost:8000/>.
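
If a page does not load, it can help to check whether anything is listening on the expected port at all.
The following is a minimal sketch with a hypothetical helper, `is_port_open`; the ports are taken from the URLs in this guide, and a successful connection only shows that some process is listening there, not that it is the right one.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 8888: Jupyter server, 8000: lecture notes (as stated in this guide).
    for name, port in [("Jupyter", 8888), ("lecture notes", 8000)]:
        state = "reachable" if is_port_open("localhost", port) else "not reachable"
        print(f"{name} on port {port}: {state}")
```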

---

If you came this far, your initial setup was successful and you are ready to go! 🚀
229 changes: 229 additions & 0 deletions docs/img/nlp-research-vs-nlp-engineering.drawio.svg
18 changes: 16 additions & 2 deletions docs/index.md
@@ -1,3 +1,17 @@
# Welcome to Practical NLP
# Welcome! 👋

A practical course on natural language processing @ HTWG Konstanz.
Welcome to the course "Practical Natural Language Processing" at HTWG Konstanz.

I'm excited to have you here and I hope you will enjoy the course.

## Quickstart

If you have [Python](https://docs.python.org/3/) and [Git](https://git-scm.com/) installed on your system, you can get started right away.

```sh
git clone https://github.com/pkeilbach/htwg-practical-nlp.git
cd htwg-practical-nlp
make
```

For more details, check out the [Getting Started](./getting_started.md) guide.