Major environment refactoring (base version) #169

Merged 10 commits on May 1, 2024

@cbhua cbhua commented Apr 30, 2024

Description

Please refer to the full environment refactor PR: #166.

Motivation and Context

Please refer to the full environment refactor PR: #166.

Types of changes

Compared with the full environment refactor, this base refactor only touches:

  1. Environment structure;
  2. Documentation updates;
  3. Supporting generator;

As a base refactor, this PR leaves the following unchanged:

  1. All data generation processes (size, default range, default distribution);
  2. All step and reward calculation logic.

Overall, this base version PR only moves code and does not change the resulting behavior. In future versions, we will refactor the environments step by step, guided by the full refactor version.
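
As a rough illustration of the structure described above, here is a minimal sketch in which an environment delegates data generation to a separate generator object. The class names `UniformLocGenerator` and `RoutingEnvSketch`, their parameters, and the returned dictionary layout are all hypothetical; they are not taken from the RL4CO code and use only the standard library.

```python
import random


class UniformLocGenerator:
    """Hypothetical generator: owns every data-sampling parameter."""

    def __init__(self, num_loc=5, min_loc=0.0, max_loc=1.0, seed=None):
        self.num_loc = num_loc
        self.min_loc = min_loc
        self.max_loc = max_loc
        self._rng = random.Random(seed)

    def __call__(self, batch_size):
        # Each instance is a list of (x, y) coordinates drawn uniformly.
        return [
            [
                (
                    self._rng.uniform(self.min_loc, self.max_loc),
                    self._rng.uniform(self.min_loc, self.max_loc),
                )
                for _ in range(self.num_loc)
            ]
            for _ in range(batch_size)
        ]


class RoutingEnvSketch:
    """Hypothetical environment: delegates data generation to the generator."""

    def __init__(self, generator):
        self.generator = generator

    def reset(self, batch_size):
        # Step and reward logic would live here, untouched by the generator.
        return {"locs": self.generator(batch_size)}


env = RoutingEnvSketch(UniformLocGenerator(num_loc=5, seed=0))
state = env.reset(batch_size=2)
```

With this kind of split, the sampling ranges and distributions can be changed by swapping the generator, while the environment code stays the same.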

For more details, please refer to the full environment refactor PR: #166.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds core functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)
  • Example (update in the folder of examples)

Checklist

  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.

@cbhua cbhua marked this pull request as ready for review April 30, 2024 14:42
@fedebotu fedebotu self-requested a review May 1, 2024 00:21

@fedebotu fedebotu left a comment

🚀

    demand_distribution: Union[
        int, float, type, Callable
    ] = Uniform,
    vehicle_capacity: float = 1.0,

I think vehicle_capacity and capacity should be the same parameter, is it right?

EDIT: my bad - capacity is actually the "maximum capacity" to normalize

@cbhua cbhua May 1, 2024

Yes, we may want to rename capacity later. As it stands, I think it may be confusing to some users.

Ok, let's leave it as simple as possible for now
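
To illustrate the distinction discussed in this thread, here is a minimal sketch. It assumes that `capacity` is the maximum raw capacity used only to scale demands, and that `vehicle_capacity` is the already-normalized budget of a vehicle (1.0 by default, as in the diff). The function name `normalize_demands`, the demand range, and the value 30 are hypothetical and not taken from the RL4CO code.

```python
import random


def normalize_demands(raw_demands, capacity):
    """Scale raw demands by the maximum capacity (assumed normalization rule)."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return [d / capacity for d in raw_demands]


# Hypothetical values: integer demands in [1, 9], maximum capacity of 30.
rng = random.Random(0)
raw_demands = [rng.randint(1, 9) for _ in range(5)]
capacity = 30.0          # "maximum capacity", used only for normalization
vehicle_capacity = 1.0   # normalized budget, matching the default in the diff

demands = normalize_demands(raw_demands, capacity)

# After normalization, every demand is a fraction of one vehicle load.
assert all(0.0 < d <= vehicle_capacity for d in demands)
```

Under these assumptions the two parameters play different roles, which is why they are not merged: one sets the scale of the data, the other is the budget the policy consumes.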

rl4co/envs/routing/cvrp/generator.py (outdated, resolved)

 log = get_pylogger(__name__)


 class MPDPEnv(RL4COEnvBase):
-    """Multi-agent Pickup and Delivery Problem environment.
+    """Multi-agent Pickup and Delivery Problem (mPDP) environment.

Note that actually this should be called "mPDTSP", but I guess we can leave it like that for now

rl4co/envs/routing/pctsp/generator.py (outdated, resolved)

self.depot_sampler = get_sampler("depot", depot_distribution, min_loc, max_loc, **kwargs)

# Prize distribution
self.deterministic_prize_sampler = get_sampler("deterministric_prize", "uniform", 0.0, 4.0/self.num_loc, **kwargs)

This one cannot be changed via kwargs, right?

Yes. There is a specific rule for the value range of deterministic_prize and of the stochastic_prize that follows it, so I hardcoded the distribution to Uniform here.
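
A minimal sketch of the point made in this reply, assuming only what the diff shows: the deterministic prize is drawn uniformly from [0, 4 / num_loc], and that range is fixed inside the generator rather than exposed through keyword arguments. The class `PrizeGeneratorSketch` and its method are hypothetical, and the sketch uses the standard library instead of the `get_sampler` helper from the PR.

```python
import random


class PrizeGeneratorSketch:
    """Hypothetical generator with a hardcoded uniform prize range."""

    def __init__(self, num_loc, seed=None, **kwargs):
        if num_loc <= 0:
            raise ValueError("num_loc must be positive")
        self.num_loc = num_loc
        self._rng = random.Random(seed)
        # Extra kwargs are accepted but deliberately ignored for the prize:
        # the range depends on num_loc and is not user-configurable.
        self.prize_low = 0.0
        self.prize_high = 4.0 / num_loc

    def sample_deterministic_prize(self):
        # One uniform draw per location, within the fixed range.
        return [
            self._rng.uniform(self.prize_low, self.prize_high)
            for _ in range(self.num_loc)
        ]


# Passing prize_high through kwargs has no effect on the hardcoded range.
gen = PrizeGeneratorSketch(num_loc=20, seed=0, prize_high=100.0)
prizes = gen.sample_deterministic_prize()
```

Tying the upper bound to `num_loc` keeps the expected total prize independent of the instance size in this sketch, which is one plausible reason for not making the range configurable.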

@fedebotu fedebotu merged commit f7c984c into main May 1, 2024
24 checks passed
@cbhua cbhua deleted the refactor-env-base branch May 10, 2024 04:51