
Exploring Extreme Adaptive Sparse Training (EAST) on a 25.3B parameter language model


Prophecy-Jimpsons/EAST


Extreme Adaptive Sparse Training (EAST) for Large Language Models

Background

EAST is a sparse learning technique designed to train deep neural networks at extreme sparsity levels without sacrificing accuracy. This repository aims to push the boundaries of EAST by testing its effectiveness on one of the largest language models to date.

Model Details

Model architecture: varies by experiment

Parameter count: 25.3 billion

Dataset: TBA

EAST Implementation

This repository implements the EAST method as described in the paper by Mrare Jimmy. The implementation includes:

Dynamic ReLU phasing (DyReLU)

Weight sharing

Cyclic sparsity

Goals and Contributions

The primary goal of this repository is to investigate the effectiveness of EAST on large language models. By contributing to this repository, you can help:

Advance the state of the art in sparse learning for large language models

Improve the computational efficiency of large language models

Explore new applications of EAST in natural language processing
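As a rough illustration of the ideas behind cyclic sparsity, the sketch below oscillates the pruned fraction of weights over training steps and applies a magnitude-based mask. This is a minimal, hypothetical example, not the repository's actual implementation: the schedule shape, the bounds `s_min`/`s_max`, and the period are assumed values chosen for clarity.

```python
import math

def cyclic_sparsity(step, period=1000, s_min=0.90, s_max=0.99):
    """Hypothetical cyclic sparsity schedule: the pruned fraction
    oscillates between s_min and s_max once per period, following
    a cosine wave (starts at s_max, dips to s_min mid-period)."""
    phase = (step % period) / period  # position in [0, 1)
    return s_min + 0.5 * (s_max - s_min) * (1 + math.cos(2 * math.pi * phase))

def magnitude_mask(weights, sparsity):
    """Zero out the smallest-magnitude weights, keeping the rest."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]

weights = [0.5, -0.02, 1.3, 0.004, -0.9, 0.1, -0.05, 2.0]
s = cyclic_sparsity(step=500)          # mid-period, so sparsity is near s_min
sparse_w = magnitude_mask(weights, s)  # only the largest weights survive
```

In a real training loop the mask would be recomputed every few hundred steps so that weights pruned at high sparsity can regrow during the low-sparsity part of the cycle.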

Acknowledgments

https://arxiv.org/abs/2411.13545

License

TBA. A license will be defined later.
