Attn #138

Merged
merged 18 commits into from
Sep 28, 2024
@AnFreTh AnFreTh (Collaborator) commented Sep 28, 2024

New Model Architectures:

  • MambAttn Class: Introduced a new model class MambAttn that alternates between Mamba blocks and attention layers, providing a flexible architecture for various deep learning tasks. (mambular/arch_utils/mambattn_arch.py)
  • ConvRNN Class: Added the ConvRNN class that combines convolutional layers with RNN layers, supporting various RNN types (RNN, LSTM, GRU) and optional residual connections. (mambular/arch_utils/rnn_utils.py)
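
To make the alternating layout of MambAttn concrete, here is a minimal, dependency-free sketch of how such a layer schedule could be built. The helper name `build_layer_schedule` and the parameter `n_mamba_per_attention` are hypothetical and chosen for illustration only; the actual constructor arguments in mambular/arch_utils/mambattn_arch.py may differ.

```python
from typing import List


def build_layer_schedule(n_layers: int, n_mamba_per_attention: int) -> List[str]:
    """Return a layer-type schedule that interleaves Mamba and attention.

    Hypothetical helper (not part of mambular): after every
    `n_mamba_per_attention` Mamba blocks, one attention layer is placed,
    until `n_layers` layers have been scheduled in total.
    """
    if n_layers < 0 or n_mamba_per_attention < 1:
        raise ValueError("n_layers must be >= 0 and n_mamba_per_attention >= 1")

    schedule: List[str] = []
    mamba_run = 0  # Mamba blocks placed since the last attention layer
    while len(schedule) < n_layers:
        if mamba_run == n_mamba_per_attention:
            schedule.append("attention")
            mamba_run = 0
        else:
            schedule.append("mamba")
            mamba_run += 1
    return schedule


if __name__ == "__main__":
    print(build_layer_schedule(n_layers=5, n_mamba_per_attention=2))
```

In a real model, each entry of the schedule would be mapped to the corresponding module (a Mamba block or an attention layer) and the modules applied in order.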

Integration and Configuration:

  • MambAttention Model: Implemented the MambAttention model that leverages the MambAttn architecture, with support for various normalization techniques and pooling methods. (mambular/base_models/mambattn.py)
  • Model Registration: Registered the MambAttn model in the __init__.py of base_models to ensure it's accessible within the module. (mambular/base_models/__init__.py) [1] [2]
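
As an illustration of what a pooling method does in this setting, the sketch below reduces a sequence of token vectors to a single vector before a prediction head. The function `pool_sequence` and the method names `"avg"`, `"max"`, `"sum"`, and `"last"` are assumptions made for this example; the options actually supported by mambular/base_models/mambattn.py are not listed in this PR.

```python
from typing import List


def pool_sequence(tokens: List[List[float]], method: str = "avg") -> List[float]:
    """Reduce a (seq_len x d_model) list of token vectors to one vector.

    Hypothetical helper for illustration; it works on plain Python lists
    so that it runs without any deep learning framework.
    """
    if not tokens:
        raise ValueError("tokens must contain at least one vector")

    # zip(*tokens) iterates over feature columns across the sequence axis.
    if method == "avg":
        return [sum(col) / len(tokens) for col in zip(*tokens)]
    if method == "max":
        return [max(col) for col in zip(*tokens)]
    if method == "sum":
        return [sum(col) for col in zip(*tokens)]
    if method == "last":
        return list(tokens[-1])
    raise ValueError(f"unknown pooling method: {method!r}")


if __name__ == "__main__":
    sequence = [[1.0, 4.0], [3.0, 2.0]]
    for name in ("avg", "max", "sum", "last"):
        print(name, pool_sequence(sequence, name))
```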

Optimization Enhancements:

  • Early Pruning and Optimizer Configuration: Extended lightning_wrapper.py with early pruning based on validation loss and with dynamic optimizer configuration, allowing for more flexible and efficient training.
    Also adds automatic Bayesian hyperparameter optimization (HPO) for all models, using a config mapper that detects hyperparameter ranges automatically.
    (mambular/base_models/lightning_wrapper.py) [1] [2]
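
The idea behind early pruning on validation loss can be sketched as follows. This is a simplified, framework-free illustration and not the code in lightning_wrapper.py: the names `should_prune`, `run_trial`, `pruning_epoch`, and `threshold` are hypothetical, and the rule shown (stop once the validation loss is still above a threshold after a warm-up number of epochs) is one plausible pruning criterion among several.

```python
from typing import Iterable, Optional


def should_prune(
    epoch: int, val_loss: float, pruning_epoch: int, threshold: Optional[float]
) -> bool:
    """Decide whether a training run should be stopped early.

    Pruning is disabled when no threshold is given. Otherwise the run is
    pruned once `epoch` has reached `pruning_epoch` and the validation
    loss is still above `threshold`.
    """
    if threshold is None:
        return False
    return epoch >= pruning_epoch and val_loss > threshold


def run_trial(
    val_losses: Iterable[float],
    pruning_epoch: int = 2,
    threshold: Optional[float] = None,
) -> int:
    """Simulate a trial over precomputed validation losses.

    Returns the number of epochs that were actually run before the trial
    finished or was pruned.
    """
    epochs_run = 0
    for epoch, val_loss in enumerate(val_losses):
        epochs_run += 1
        if should_prune(epoch, val_loss, pruning_epoch, threshold):
            break  # give up on this configuration early
    return epochs_run


if __name__ == "__main__":
    # A poor configuration is stopped early; a good one runs to completion.
    print(run_trial([1.0, 0.9, 0.8, 0.7], pruning_epoch=2, threshold=0.5))
    print(run_trial([1.0, 0.9, 0.4, 0.3], pruning_epoch=2, threshold=0.5))
```

In an HPO loop, stopping unpromising trials like this frees compute for configurations that the optimizer considers more promising.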

@AnFreTh AnFreTh merged commit ed5a0f3 into develop Sep 28, 2024
@AnFreTh AnFreTh deleted the attn branch November 5, 2024 14:58