
Exploring Extreme Adaptive Sparse Training (EAST) on a 25.3B parameter language model


Prophecy-Jimpsons/EAST


Extreme Adaptive Sparse Training (EAST) for Large Language Models

Background

EAST is a sparse learning technique designed to train deep neural networks at extreme sparsity levels without sacrificing accuracy. This repository aims to push the boundaries of EAST by testing its effectiveness on one of the largest language models to date.

Model Details

Model architecture: varies by experiment

Parameter count: 25.3 billion

Dataset: TBA

EAST Implementation

This repository implements the EAST method as described in the paper by Mrare Jimmy. The implementation includes:

Dynamic ReLU phasing (DyReLU)

Weight sharing

Cyclic sparsity

Goals and Contributions

The primary goal of this repository is to investigate the effectiveness of EAST on large language models. By contributing to this repository, you can help:

Advance the state of the art in sparse learning for large language models

Improve the computational efficiency of large language models

Explore new applications of EAST in natural language processing
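As a rough illustration of the ideas behind cyclic sparsity, the sketch below oscillates the pruned fraction of weights over training steps and applies a magnitude-based mask. This is a minimal, hypothetical example, not the repository's actual implementation: the schedule shape, the bounds `s_min`/`s_max`, and the period are assumed values chosen for clarity.

```python
import math

def cyclic_sparsity(step, period=1000, s_min=0.90, s_max=0.99):
    """Hypothetical cyclic sparsity schedule: the pruned fraction
    oscillates between s_min and s_max once per period, following
    a cosine wave (starts at s_max, dips to s_min mid-period)."""
    phase = (step % period) / period  # position in [0, 1)
    return s_min + 0.5 * (s_max - s_min) * (1 + math.cos(2 * math.pi * phase))

def magnitude_mask(weights, sparsity):
    """Zero out the smallest-magnitude weights, keeping the rest."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]

weights = [0.5, -0.02, 1.3, 0.004, -0.9, 0.1, -0.05, 2.0]
s = cyclic_sparsity(step=500)          # mid-period, so sparsity is near s_min
sparse_w = magnitude_mask(weights, s)  # only the largest weights survive
```

In a real training loop the mask would be recomputed every few hundred steps so that weights pruned at high sparsity can regrow during the low-sparsity part of the cycle.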

Acknowledgments

https://arxiv.org/abs/2411.13545

License

TBA. A license will be defined later.
