fix: remove lm_head for granite with llama arch models #258
Conversation
build/accelerate_launch.py
Outdated
== params_dict["model.embed_tokens.weight"].untyped_storage().data_ptr()
):
    logging.info("Removing lm_head from checkpoint")
    del model.lm_head
Maybe just deleting the lm_head.weight should suffice?
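For context, here is a minimal, self-contained sketch of the tied-weight check and of the two deletion variants being discussed. It assumes a llama-architecture CausalLM (so the embedding lives at model.model.embed_tokens) and a placeholder checkpoint path; it is not the exact code in this diff.

import logging

from transformers import AutoModelForCausalLM

checkpoint_path = "path/to/final-checkpoint"  # placeholder path
model = AutoModelForCausalLM.from_pretrained(checkpoint_path)

# If lm_head.weight and embed_tokens.weight share the same underlying storage,
# lm_head is just a tied duplicate of the embedding matrix and can be dropped
# before re-saving so the on-disk checkpoint carries no redundant tensor.
if (
    hasattr(model, "lm_head")
    and model.lm_head.weight.untyped_storage().data_ptr()
    == model.model.embed_tokens.weight.untyped_storage().data_ptr()
):
    logging.info("Removing lm_head from checkpoint")
    del model.lm_head  # what the diff above does: drop the whole Linear module
    # Alternative raised in this comment: delete only the weight tensor instead,
    # e.g. `del model.lm_head.weight`, and keep the (now weightless) module.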
model.save_pretrained(original_output_dir)
# save tokenizer with model
tokenizer.save_pretrained(original_output_dir)
I saved both the model and tokenizer because I think the tokenizer is needed for vLLM to load the model and to match the previous checkpoint output. The LoRA tuned output looks the same, but the fine tuned output looks a little different:
new output:
$ ls /data/anhuong/tuning_output/granite-3b-code-ft-test-remove-lm-head
config.json model-00001-of-00002.safetensors special_tokens_map.json training_logs.jsonl
generation_config.json model-00002-of-00002.safetensors tokenizer.json vocab.json
merges.txt model.safetensors.index.json tokenizer_config.json
previous output:
$ ls /data/anhuong/tuning_output/granite-3b-code-test-accelerate-orig_params-transformers_v4.42
config.json model.safetensors.index.json scheduler.pt training_args.bin
generation_config.json optimizer.bin special_tokens_map.json training_logs.jsonl
merges.txt pytorch_model_fsdp.bin tokenizer.json vocab.json
model-00001-of-00002.safetensors rng_state_0.pth tokenizer_config.json
model-00002-of-00002.safetensors rng_state_1.pth trainer_state.json
The model was still able to be loaded by vLLM, but I wanted to make note of this.
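As a quick smoke test of that, the saved output directory can be pointed at vLLM directly. This is only a sketch of the check described above, using the path from the listing; the prompt is arbitrary.

from vllm import LLM, SamplingParams

llm = LLM(model="/data/anhuong/tuning_output/granite-3b-code-ft-test-remove-lm-head")
outputs = llm.generate(["def hello_world():"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)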
In addition, for fine tuned models I was able to confirm that lm_head.weight was deleted by checking the model.safetensors.index.json file, but is there an easy way to check this for LoRA tuned adapters?
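One possible way to check this for adapters (an assumption on my part, not something this PR does): a LoRA checkpoint is usually a single adapter_model.safetensors file, so its tensor names can be listed directly and scanned for an lm_head entry. The path below is a placeholder.

from safetensors import safe_open

with safe_open("path/to/checkpoint/adapter_model.safetensors", framework="pt") as f:
    keys = list(f.keys())

print([k for k in keys if "lm_head" in k])       # expect an empty list if lm_head was dropped
print([k for k in keys if "embed_tokens" in k])  # embedding deltas, if any were trained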
Finally, when I load the fine tuned model in Python, you still see lm_head.weight even though it is deleted from model.safetensors.index.json.
>>> use_flash_attn = os.getenv("use_flash_attn", True)
>>> ft_model = "/data/anhuong/tuning_output/granite-3b-code-ft-test-remove-lm-head"
>>> ft_model = AutoModelForCausalLM.from_pretrained(ft_model, attn_implementation="flash_attention_2" if use_flash_attn else None, torch_dtype=bfloat16 if use_flash_attn else None)
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████████| 2/2 [00:33<00:00, 16.94s/it]
>>> ft_model.lm_head
Linear(in_features=2560, out_features=49152, bias=False)
>>> ft_model.lm_head.weight
Parameter containing:
tensor([[-0.0022, 0.0032, 0.0302, ..., -0.0132, -0.0620, 0.0070],
[ 0.0152, 0.0287, -0.0002, ..., -0.0028, 0.0100, 0.0159],
[ 0.0132, -0.0197, 0.0447, ..., 0.0247, 0.0732, -0.0236],
...,
[ 0.0067, -0.0435, -0.0234, ..., -0.0294, 0.0168, 0.0226],
[ 0.0129, -0.0044, 0.0471, ..., 0.0024, 0.0066, -0.0032],
[ 0.0057, -0.0280, -0.0093, ..., 0.0039, 0.0347, 0.0199]],
dtype=torch.bfloat16, requires_grad=True)
>>> ft_model.lm_head.weight.untyped_storage().data_ptr() == ft_model.model.embed_tokens.weight.untyped_storage().data_ptr()
True
>>> ft_model.lm_head.weight.untyped_storage().data_ptr() == base_model.lm_head.weight.untyped_storage().data_ptr()
False
>>> ft_model.lm_head.weight.untyped_storage().data_ptr() == base_model.model.embed_tokens.weight.untyped_storage().data_ptr()
False
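One plausible reading (not confirmed in this thread): the saved config still has tie_word_embeddings=True, so from_pretrained recreates lm_head and re-ties it to embed_tokens even though the tensor was dropped from the safetensors index. A quick check, reusing the same path as above:

from transformers import AutoConfig

config = AutoConfig.from_pretrained(
    "/data/anhuong/tuning_output/granite-3b-code-ft-test-remove-lm-head"
)
# If this prints True, transformers rebuilds lm_head on load and points it at the
# embedding weights, which matches the data_ptr() comparison above.
print(config.tie_word_embeddings)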
# a fine tuned model will have params_dict.get("model.embed_tokens.weight")
# a prompt adapter has params_dict.get("base_model.model.embed_tokens.weight")
# a lora adapter has params_dict.get("base_model.model.model.embed_tokens.weight")
I had assumed that model.model.model.embed_tokens.weight.untyped_storage().data_ptr() was the same as referencing:
params_dict = dict(model.named_parameters())
params_dict.get("base_model.model.model.embed_tokens.weight")
Let me know if this assumption is incorrect. I moved away from params_dict so that I don't have to differentiate between base_model and model.
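That assumption can be checked directly, since named_parameters() yields the very same Parameter objects that attribute access returns. A small sketch, assuming a LoRA adapter directory at a placeholder path:

from peft import AutoPeftModelForCausalLM

model = AutoPeftModelForCausalLM.from_pretrained("path/to/lora-checkpoint")  # placeholder

params_dict = dict(model.named_parameters())
via_dict = params_dict.get("base_model.model.model.embed_tokens.weight")
via_attr = model.model.model.embed_tokens.weight  # attribute access, as in the comment above

# Same Parameter object either way, so the storage pointers necessarily match.
assert via_dict is via_attr
assert via_dict.untyped_storage().data_ptr() == via_attr.untyped_storage().data_ptr()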
ERROR_LOG = "/dev/termination-log"

def get_base_model_from_adapter_config(adapter_config):
    with open(adapter_config, "r", encoding="utf-8") as config_file:
Can you add a docstring to this function, just explaining that the adapter config is for peft models since this might be confusing to people that have only used transformers models?
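Something along these lines might do; the return value below is a guess based on the function name, since the rest of the body is not shown in this diff.

import json

def get_base_model_from_adapter_config(adapter_config):
    """Read the base model name/path out of a PEFT adapter_config.json.

    An adapter config only exists for PEFT-tuned models (e.g. LoRA or prompt
    tuning); plain transformers fine-tuned checkpoints have no such file.

    Args:
        adapter_config: Path to the adapter_config.json produced by PEFT.
    Returns:
        The base_model_name_or_path recorded in the adapter config.
    """
    with open(adapter_config, "r", encoding="utf-8") as config_file:
        config = json.load(config_file)
    return config.get("base_model_name_or_path")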
build/accelerate_launch.py
Outdated
last_checkpoint_path = os.path.join(tempdir, last_checkpoint_dir)

use_flash_attn = job_config.get("use_flash_attn", True)
adapter_config_path = f"{last_checkpoint_path}/adapter_config.json"
It would be better to use os.path.join here since we use it in most places and it's easier to port to different platforms :)
Suggested change:
- adapter_config_path = f"{last_checkpoint_path}/adapter_config.json"
+ adapter_config_path = os.path.join(last_checkpoint_path, "adapter_config.json")
# where the model's layers are modified, in our case the embedding layer
# is modified, so we resize the backbone model's embedding layer with our own
# utility before passing it along to load the PEFT model.
tokenizer_data_utils.tokenizer_and_embedding_resize(
is there a reason that this needs to happen again? I think this is called by the launched sft trainer command, so shouldn't the resizing already be handled?
Unit tests for LoRA and PT were failing when trying to load the checkpoint with error:
RuntimeError: Error(s) in loading state_dict for LlamaForCausalLM:
size mismatch for model.embed_tokens.weight: copying a param with shape torch.Size([32008, 64]) from checkpoint, the shape in current model is torch.Size([32000, 64]).
so I had to resize the token embeddings to be able to load the checkpoint.
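For reference, a hedged sketch of the kind of resize that makes the adapter load again. It uses generic transformers/peft calls rather than the repo's tokenizer_and_embedding_resize utility, and the paths and added token are placeholders.

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_path = "path/to/base-model"          # placeholder
checkpoint_path = "path/to/adapter-checkpoint"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_model_path)
tokenizer.add_special_tokens({"pad_token": "<pad>"})  # same kind of addition sft_trainer makes

base_model = AutoModelForCausalLM.from_pretrained(base_model_path)
# Grow the embedding to cover the added tokens (padded to a multiple of 8) so its
# shape matches what the checkpoint was trained and saved with.
base_model.resize_token_embeddings(len(tokenizer), pad_to_multiple_of=8)

# Only now does the adapter's resized embed_tokens load without a size mismatch.
model = PeftModel.from_pretrained(base_model, checkpoint_path)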
Iiiiinteresting. I guess the resize information must not be preserved in the config and reloading with the base class resets it to the base size or something 🤔
I also verified that I get this same error when manually running tuning on a tiny llama model with the same params as the unit test and then trying to load the checkpoint.
huh? is this happening on current main due to some other changes in dependencies or something?
why is this change related to the lm_head deletion here?
If something changed, we should not just patch it by doing this for the final checkpoint only. The product uses only the final checkpoint for now, but the rest of research can use any checkpoint, so our load and inference scripts should work with any checkpoint, and we have to understand what changed.
I didn't think of this as a patch; I thought it was always needed to load a model we tuned, like what we run in run_inference. Since we update the tokenizer in our sft_trainer, when we load the tuned checkpoint again we need to update the tokenizer as well.
I didn't realize this was a new change added a few weeks ago - https://github.com/foundation-model-stack/fms-hf-tuning/pull/227/files - which resizes the embeddings to a multiple of 8 when adding new tokens, instead of to the size of the tokenizer; that is what required this addition...hmmm
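To make the size difference concrete, here is a tiny illustration of that multiple-of-8 padding; the helper is hypothetical and just mirrors the described pad_to_multiple_of behaviour.

import math

def padded_vocab_size(base_vocab, num_new_tokens, multiple=8):
    # Round the grown vocabulary up to the next multiple (hypothetical helper).
    return math.ceil((base_vocab + num_new_tokens) / multiple) * multiple

print(padded_vocab_size(32000, 8))  # 32008 -> matches the checkpoint shape in the error above
print(padded_vocab_size(32000, 1))  # also 32008, even though only one token was added
# A plain resize to the tokenizer size would instead give 32008 and 32001 respectively.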
fix: remove lm_head for granite with llama arch models (foundation-model-stack#258)
* initial code for deleting lm_head
* fix logic for copying checkpoint
* fix check that embed_tokens and lm_head weights are the same
* fix warning assertion
* fix lm_head check, remove test
* small fixes from code review
* fmt
Signed-off-by: Anh-Uong <[email protected]>
Co-authored-by: Anh-Uong <[email protected]>
Signed-off-by: Abhishek <[email protected]>
Description of the change
Loads the final checkpoint with AutoModelForCausalLM.from_pretrained() if it is a fine tuned model, otherwise loads the base model and adapter with PeftModel.from_pretrained(). If the lm_head weight is a tied duplicate of the embedding weight, lm_head is removed and the checkpoint is re-saved with model.save_pretrained().
Related issue number
We found that granite models with llama architecture (such as granite-3b-code) have lm_head weights tied to the embedding weights, and when tuning with accelerate and FSDP for multi-GPU, the lm_head weight was created as a duplicate of the embedding weight. Because of this unexpected extra weight, vLLM is unable to load the model. To fix this, we delete the lm_head weight.
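For reference, one quick way to confirm whether a saved checkpoint still carries the duplicated weight is to look for lm_head.weight in the safetensors index. A small sketch, with the checkpoint directory as a placeholder:

import json
import os

checkpoint_dir = "path/to/tuned-checkpoint"  # placeholder
index_path = os.path.join(checkpoint_dir, "model.safetensors.index.json")

with open(index_path, "r", encoding="utf-8") as f:
    weight_map = json.load(f)["weight_map"]

# After this fix, lm_head.weight should no longer appear in the index.
print("lm_head.weight" in weight_map)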
How to verify the PR
Ran the updated accelerate_launch.py script in a cluster with the configs below:
granite-3b-code-base, fine tuning, multi-GPU
Successfully fine tuned the model and removed lm_head. The tuned model was able to be loaded in vLLM (where the previous lm_head error had been verified). Verified that lm_head is removed from the checkpoint's model.safetensors.index.json.
tail of tuning logs:
granite-3b-code, LoRA tuning, multi-GPU
Successfully LoRA tuned the model and removed lm_head. The tuned model was able to be loaded in vLLM (where the previous lm_head error had been verified).
tail of tuning logs:
granite-3b-code-base, prompt tuning, single-GPU
Successfully prompt tuned; as expected, single-GPU tuning does not hit the lm_head issue, so lm_head is not deleted in these tuning runs.
llama-13b-base, fine tuned, multi-GPU
Successfully fine tuned; as expected, this does not result in lm_head being deleted. The tuning logs show no message that lm_head was deleted.
Was the PR tested