Adding a notebook for Unity executables on SageMaker RL (aws#1489)

* minor changes on launcher script
* Adding a sample notebook for Unity executables with SageMaker RL
* update README; update notebook intro; update Dockerfile and train preset

Co-authored-by: yoheigon <[email protected]>
Co-authored-by: henryyuanheng-wang <[email protected]>
gerilya · Sep 16, 2020 · 93ce04f
1 parent 137d40e
Showing 9 changed files with 875 additions and 1 deletion.
diff --git a/README.md b/README.md
@@ -84,6 +84,7 @@ The following provide examples demonstrating different capabilities of Amazon Sa
 - [Stable Baselines](reinforcement_learning/rl_roboschool_stable_baselines) In this notebook example, we will make the HalfCheetah agent learn to walk using the stable-baselines, which are a set of improved implementations of Reinforcement Learning (RL) algorithms based on OpenAI Baselines.
 - [Travelling Salesman](reinforcement_learning/rl_traveling_salesman_vehicle_routing_coach) is a classic NP hard problem, which this notebook solves with AWS SageMaker RL.
 - [Tic-tac-toe](reinforcement_learning/rl_tic_tac_toe_coach_customEnv) is a simple implementation of a custom Gym environment to train and deploy an RL agent in Coach that then plays tic-tac-toe interactively in a Jupyter Notebook.
+- [Unity Game Agent](reinforcement_learning/rl_unity_ray) shows how to use RL algorithms to train an agent to play a Unity3D game.
 
 ### Scientific Details of Algorithms
 

diff --git a/reinforcement_learning/README.md b/reinforcement_learning/README.md
@@ -24,6 +24,7 @@ These examples demonstrate how to train reinforcement learning models on SageMak
 -  [Tic-tac-toe](rl_tic_tac_toe_coach_customEnv) uses RL to train a policy and then plays locally and interactively within the notebook.
 -  [Traveling Salesman and Vehicle Routing](rl_traveling_salesman_vehicle_routing_coach) is an example of using RL to address operations research problems.
 -  [Game Server Auto-pilot](rl_game_server_autopilot) Reduce player wait time by autoscaling game-servers deployed in EKS cluster using RL to add and remove EC2 instances as per dynamic player usage.
+-  [Unity Game Agent](rl_unity_ray) shows how to use RL algorithms to train an agent to play a Unity3D game.
 
 ### FAQ
 https://github.com/awslabs/amazon-sagemaker-examples#faq 
diff --git a/reinforcement_learning/common/sagemaker_rl/ray_launcher.py b/reinforcement_learning/common/sagemaker_rl/ray_launcher.py
@@ -123,7 +123,7 @@ def get_all_host_names(self):
 
     def ray_init_config(self):
         num_workers = max(self.num_cpus, 3)
-        config = {"num_cpus": num_workers, "num_gpus": self.num_gpus}
+        config = {"num_cpus": num_workers, "num_gpus": self.num_gpus, "webui_host": '127.0.0.1'}
 
         if self.is_master_node:
             all_wokers_host_names = self.get_all_host_names()[1:]
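
The hunk above adds a `webui_host` entry to the dictionary returned by `ray_init_config`. As a minimal standalone sketch of the resulting dictionary (the function name `build_ray_init_config` is hypothetical and not part of the launcher; that `webui_host` sets the address the Ray web UI binds to in the pinned Ray 0.8.2 is an assumption to verify against that release):

```python
def build_ray_init_config(num_cpus, num_gpus, webui_host="127.0.0.1"):
    """Hypothetical helper mirroring the dictionary built in ray_init_config."""
    # As in the launcher, never report fewer than 3 CPUs.
    num_workers = max(num_cpus, 3)
    return {
        "num_cpus": num_workers,
        "num_gpus": num_gpus,
        # Assumption: this keeps the Ray web UI on the loopback interface.
        "webui_host": webui_host,
    }
```

On a single-CPU instance the `max(..., 3)` floor still applies, so the sketch reports three CPUs regardless of the hardware.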

diff --git a/reinforcement_learning/rl_unity_ray/Dockerfile b/reinforcement_learning/rl_unity_ray/Dockerfile
@@ -0,0 +1,25 @@
+ARG CPU_OR_GPU
+ARG AWS_REGION
+FROM 462105765813.dkr.ecr.${AWS_REGION}.amazonaws.com/sagemaker-rl-ray-container:ray-0.8.2-tf-${CPU_OR_GPU}-py36
+
+WORKDIR /opt/ml
+
+# Unity dependencies
+
+RUN pip install --upgrade \
+    pip \
+    gym-unity \
+    mlagents-envs
+
+RUN pip install sagemaker-containers --upgrade
+
+ENV PYTHONUNBUFFERED 1
+
+############################################
+# Test Installation
+############################################
+# Test to verify if all required dependencies installed successfully or not.
+RUN python -c "import gym;import sagemaker_containers.cli.train;import ray; from sagemaker_containers.cli.train import main; from mlagents_envs.environment import UnityEnvironment; from mlagents_envs.registry import default_registry; from gym_unity.envs import UnityToGymWrapper"
+
+# Make things a bit easier to debug
+WORKDIR /opt/ml/code
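
The "Test Installation" step in the Dockerfile above fails the build if any import fails, but it does not say which one. A rough sketch of a more verbose check follows; the helper `missing_modules` is hypothetical and not part of this commit, and the module names are copied from the Dockerfile's import line. Note that `importlib.util.find_spec` only locates a module (importing parent packages along the way), so this is a weaker check than the real imports the Dockerfile performs.

```python
import importlib.util

# Module names taken from the Dockerfile's import test.
REQUIRED_MODULES = [
    "gym",
    "ray",
    "sagemaker_containers.cli.train",
    "mlagents_envs.environment",
    "mlagents_envs.registry",
    "gym_unity.envs",
]


def missing_modules(names):
    """Return the entries of `names` that cannot be located, in order."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised when a parent package is absent or the name is malformed.
            found = False
        if not found:
            missing.append(name)
    return missing
```

Inside the container, calling `missing_modules(REQUIRED_MODULES)` would be expected to return an empty list when the `pip install` steps succeeded.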
diff --git a/reinforcement_learning/rl_unity_ray/README.md b/reinforcement_learning/rl_unity_ray/README.md
@@ -0,0 +1,13 @@
+#  Unity3D Game with Amazon SageMaker RL
+
+This folder contains examples of how to train an agent to play a Unity3D game using Amazon SageMaker Reinforcement Learning. Customers can either use an [example environment](https://github.com/Unity-Technologies/ml-agents/blob/742c2fbf01188fbf27e82d5a7d9b5fd42f0de67a/docs/Learning-Environment-Examples.md) provided by the Unity ML-Agents Toolkit or bring their own customized Unity executables.
+
+
+## Contents
+
+* `rl_unity_ray.ipynb`: notebook for training an RL agent.
+
+
+* `src/`
+  * `train-unity.py`: Entry point for starting a training job
+  * `evaluate-unity.py`: Entry point for starting an evaluation job
diff --git a/reinforcement_learning/rl_unity_ray/common b/reinforcement_learning/rl_unity_ray/common
@@ -0,0 +1 @@
+../common