Add LLaMA 3 Python support #725

Merged: 36 commits, Aug 8, 2024
Commits
41bf8e0
Equivalent with nano llama 3
gordicaleksa Aug 2, 2024
838cd13
Refactor
gordicaleksa Aug 2, 2024
c414d02
Minor refactor
gordicaleksa Aug 2, 2024
465aac4
Equivalent to nano llama 3 reference code
gordicaleksa Aug 3, 2024
f50f2de
Refactor attn, change numerics but equivalent
gordicaleksa Aug 3, 2024
c0c08ba
Have prompts in a file instead of inline, prompt 4 is different
gordicaleksa Aug 3, 2024
de879d1
Refactor checkpoint state dict map func
gordicaleksa Aug 3, 2024
0199e51
Refactor MLP
gordicaleksa Aug 3, 2024
fdd5345
Refactor attn mechanism
gordicaleksa Aug 3, 2024
fa7bcc3
One more minor attn fix
gordicaleksa Aug 3, 2024
180215f
Unify generate and generate_llama
gordicaleksa Aug 3, 2024
8919b66
Fix generate for gpt-2
gordicaleksa Aug 3, 2024
ccdbdfd
Going towards pure llama 3 file - fixed attn
gordicaleksa Aug 3, 2024
8a48df7
MLP GPT2->LLaMA3
gordicaleksa Aug 3, 2024
c1d2b7f
Removed from pretrained for GPT-2
gordicaleksa Aug 3, 2024
d855c96
Refactoring - got to main
gordicaleksa Aug 3, 2024
b1acb59
Got to llama 3 inference (end)
gordicaleksa Aug 3, 2024
bad7857
Done - need to test train loop and saving model
gordicaleksa Aug 3, 2024
879cc5f
Remove init weights as it's gpt-2 specific
gordicaleksa Aug 4, 2024
7768a36
Add prompts file
gordicaleksa Aug 4, 2024
cd90273
Fix saving model / state logic
gordicaleksa Aug 4, 2024
4b386a2
Test training loop works
gordicaleksa Aug 4, 2024
0749a4a
Minor refactor - remove wpe pos array from fwd
gordicaleksa Aug 4, 2024
8e55d16
Support HF & Meta models
gordicaleksa Aug 4, 2024
72dcfeb
Remove float(-inf)
gordicaleksa Aug 4, 2024
d4ef9c5
Remove llmc_py, single file
gordicaleksa Aug 8, 2024
b25e325
Add explicit external mask
gordicaleksa Aug 8, 2024
b7c98c9
Add llama config error check
gordicaleksa Aug 8, 2024
624ed3c
Rename the new file to train llama3
gordicaleksa Aug 8, 2024
dfd459b
Remove prompts.json
gordicaleksa Aug 8, 2024
ac01536
Remove the whole llmc_py
gordicaleksa Aug 8, 2024
89addd3
Remove pycache
gordicaleksa Aug 8, 2024
f1c91f8
Address Andrej's PR comments
gordicaleksa Aug 8, 2024
8b672ff
Add data loader not implemented exception
gordicaleksa Aug 8, 2024
c5c87fc
Add comments, fix stop tokens
gordicaleksa Aug 8, 2024
d773c88
Remove unnecessary comment
gordicaleksa Aug 8, 2024