I completed this guided programming exercise as part of the Udacity course Value-Based Methods, a component of the Udacity Deep Reinforcement Learning nanodegree.
The code in this repository trains an agent to navigate an environment and collect bananas. The environment has a 37-dimensional state space and an action space with four discrete actions. The state includes the agent's velocity and its ray-based perception of objects around its forward direction. The four actions are moving forward, moving backward, turning left, and turning right. Collecting a yellow banana yields a reward of +1; collecting a blue banana yields a reward of -1.
The code trains the agent to select the best action in each state with the objective of collecting as many yellow bananas as possible while avoiding blue bananas. The learning task is episodic, with a user-defined maximum number of transitions (time steps) per episode; rewards accumulate over an episode to give an episodic score. Solving the environment means the trained agent can achieve an average score of at least 13 over 100 consecutive episodes.
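For concreteness, here is a minimal sketch of one episode of interaction with the environment, assuming the `unityagents` package used in the course; the environment file name and the random action choice are purely illustrative, not the notebook's exact code.

```python
# Minimal sketch of one episode of interaction, assuming the `unityagents`
# package used in the course; the file name below is illustrative and depends
# on the build you download for your operating system.
import numpy as np
from unityagents import UnityEnvironment

env = UnityEnvironment(file_name="Banana.app")
brain_name = env.brain_names[0]                    # the environment exposes a single brain

env_info = env.reset(train_mode=False)[brain_name]
state = env_info.vector_observations[0]            # 37-dimensional state vector
score = 0                                          # episodic score
done = False
while not done:
    action = np.random.randint(4)                  # a trained agent would pick argmax_a Q(state, a)
    env_info = env.step(action)[brain_name]
    state = env_info.vector_observations[0]
    reward = env_info.rewards[0]                   # +1 for a yellow banana, -1 for a blue banana
    done = env_info.local_done[0]
    score += reward

print(f"Episode score: {score}")
env.close()
```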
I completed the project in the Project Workspace provided by Udacity. To set up the environment on your own machine, follow the instructions here; they are reproduced below.
- Clone the course GitHub repository and set up your Python environment. PyTorch, the ML-Agents toolkit, and a few more Python packages are required to complete the project. These steps can be completed by following the instructions here.
- The environment is similar to the Banana Collector environment on the Unity ML-Agents GitHub page, but you don't need to download or install Unity because the environment can be downloaded from the links below. However, you need to select the link that matches your operating system: Linux, Mac OSX, Windows (32-bit), or Windows (64-bit). If you are a Windows user, note that the ML-Agents toolkit supports Windows 10.
- Place the environment file in the `p1_navigation/` directory in the DRLND GitHub repository.
- Unzip (or decompress) the environment file in that directory.
- First, replace the `Navigation.ipynb` file in the `p1_navigation/` directory with the file of the same name in this GitHub repository. Second, download the `dqn_agent_variant.py` and `model.py` files from this GitHub repository and add them to the `p1_navigation/` directory.
- Open the `Navigation.ipynb` file. It's a Jupyter notebook.
- Update the PATH environment variable by modifying the first code cell in the notebook, reproduced below.
```python
# Step 1. Update the PATH env var.
import os
os.environ['PATH'] = f"{os.environ['PATH']}:/home/student/.local/bin"
os.environ['PATH'] = f"{os.environ['PATH']}:/opt/conda/lib/python3.10/site-packages"
os.environ['PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION'] = 'python'
```
- The first few sections of the Jupyter notebook are exploratory.
- In Section 4, you train the agent. You can change the hyperparameters by passing different arguments to the `dqn` function in the call reproduced below. `n_episodes` is the maximum number of episodes before training ends, `max_t` is the maximum number of time steps (transitions) in one episode, `eps_start` is the starting probability of exploration (epsilon-greedy policy), `eps_end` is the minimum probability of exploration, and `eps_decay` is the multiplicative decay rate used to reduce that probability at the end of each episode. A sketch of how these parameters are typically used follows the call below.
```python
scores, scores_window_mean = dqn(n_episodes=2000, max_t=300, eps_start=0.10, eps_end=0.01, eps_decay=0.987)
```
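The following is a minimal sketch (not the notebook's exact code) of how an epsilon-greedy DQN training loop typically applies these parameters; the `env` and `agent` objects and their methods are hypothetical stand-ins for the environment wrapper and the agent defined in `dqn_agent_variant.py`.

```python
# Sketch of an epsilon-greedy DQN training loop; illustrative only, the actual
# dqn() implementation lives in Navigation.ipynb.
from collections import deque
import numpy as np

def dqn_sketch(env, agent, n_episodes=2000, max_t=300,
               eps_start=0.10, eps_end=0.01, eps_decay=0.987):
    scores = []                               # episodic scores
    scores_window = deque(maxlen=100)         # rolling window for the solve criterion
    eps = eps_start                           # exploration probability
    for i_episode in range(1, n_episodes + 1):
        state = env.reset()                   # hypothetical gym-style wrapper around the Unity env
        score = 0
        for t in range(max_t):
            action = agent.act(state, eps)    # epsilon-greedy action selection
            next_state, reward, done = env.step(action)
            agent.step(state, action, reward, next_state, done)  # store experience and learn
            state, score = next_state, score + reward
            if done:
                break
        scores.append(score)
        scores_window.append(score)
        eps = max(eps_end, eps_decay * eps)   # multiplicative decay at the end of each episode
        if np.mean(scores_window) >= 13.0:    # average score of 13 over 100 episodes
            print(f"Solved in {i_episode} episodes")
            break
    return scores, np.mean(scores_window)
```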
- In the `dqn_agent_variant.py` file, there are some additional hyperparameters. `LR` and `GAMMA` are the two most important: `LR` is the neural network's learning rate, and `GAMMA` is the discount factor in the formula used to update a state-action pair's Q value (see the sketch below).
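For reference, here is a minimal sketch of the standard DQN update in which `GAMMA` (written `gamma` below) discounts the bootstrapped next-state value; the function and tensor names are illustrative, not the exact code in `dqn_agent_variant.py`.

```python
# Sketch of the standard DQN loss that GAMMA discounts; illustrative, not the
# exact learn step in dqn_agent_variant.py.
import torch
import torch.nn.functional as F

def compute_dqn_loss(qnetwork_local, qnetwork_target, batch, gamma):
    states, actions, rewards, next_states, dones = batch
    # Max predicted Q value for the next states, from the target network
    q_targets_next = qnetwork_target(next_states).detach().max(1)[0].unsqueeze(1)
    # TD target: reward plus discounted future value (zeroed for terminal states)
    q_targets = rewards + gamma * q_targets_next * (1 - dones)
    # Q value the local network currently assigns to the actions that were taken
    q_expected = qnetwork_local(states).gather(1, actions)
    return F.mse_loss(q_expected, q_targets)
```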
I consulted this GitHub repository (link) when I tuned the hyperparameters.