
Save the preprocessor_config.json and chat_template.json for mllama model after conversion #741

Merged
merged 4 commits into from
Oct 21, 2024

Conversation

wukaixingxp
Contributor

This PR fixes the previous bug by instantiating the processor and calling save_pretrained(), so that preprocessor_config.json and chat_template.json are saved for the mllama model after conversion.

Fixes # (issue)
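A minimal sketch of the approach described above: after the FSDP shards are consolidated into a HuggingFace checkpoint, the processor is instantiated from the original model id and saved next to the converted weights, which writes out preprocessor_config.json and chat_template.json. The function name and the paths in the usage note are illustrative assumptions, not the exact code in this PR.

```python
def save_processor_files(model_name: str, consolidated_model_path: str) -> None:
    """Save the processor configs (preprocessor_config.json,
    chat_template.json) alongside the converted model weights."""
    # Lazy import so this sketch can be read/loaded without transformers installed.
    from transformers import AutoProcessor

    # Instantiate the processor from the original model id, then persist it
    # into the converted-checkpoint directory via save_pretrained().
    processor = AutoProcessor.from_pretrained(model_name)
    processor.save_pretrained(consolidated_model_path)
```

Usage would look like `save_processor_files("meta-llama/Llama-3.2-11B-Vision-Instruct", "hf_converted")`, matching the model name and output path from the conversion log below.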

Feature/Issue validation/testing

Please describe the tests you ran to verify your changes and summarize the results. Provide instructions so they can be reproduced.
Please also list any relevant details of your test configuration.

  • conversion works
(llama) [[email protected] ~/work/llama-recipes (fsdp_lmm)]$ python src/llama_recipes/inference/checkpoint_converter_fsdp_hf.py --fsdp_checkpoint_path finetuned_model/fine-tuned-meta-llama/Llama-3.2-11B-Vision-Instruct/ --consolidated_model_path hf_converted
/home/kaiwu/work/llama-recipes/src/llama_recipes/model_checkpointing/checkpoint_handler.py:17: DeprecationWarning: `torch.distributed._shard.checkpoint` will be deprecated, use `torch.distributed.checkpoint` instead
  from torch.distributed._shard.checkpoint import (
Model name: meta-llama/Llama-3.2-11B-Vision-Instruct
model is loaded from config
/home/kaiwu/work/llama-recipes/src/llama_recipes/model_checkpointing/checkpoint_handler.py:259: FutureWarning: `load_state_dict` is deprecated and will be removed in future versions. Please use `load` instead.
  dist_cp.load_state_dict(
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/torch/distributed/checkpoint/filesystem.py:657: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  torch.load(cast(IO[bytes], file_slice), map_location="cpu"),
Sharded state checkpoint loaded from finetuned_model/fine-tuned-meta-llama/Llama-3.2-11B-Vision-Instruct/
model is loaded from FSDP checkpoints
HuggingFace model checkpoints has been saved in hf_converted
  • inference now works
(llama) [[email protected] ~/work/llama-recipes (fsdp_lmm)]$ python recipes/quickstart/inference/local_inference/multi_modal_infer.py --image_path "./dog.jpg" --prompt_text "Describe this image" --temperature 0.5 --top_p 0.8 --model_name ./hf_converted/
Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:08<00:00,  1.10it/s]
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:601: UserWarning: `do_sample` is set to `False`. However, `temperature` is set to `0.5` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `temperature`.
  warnings.warn(
/home/kaiwu/miniconda3/envs/llama/lib/python3.10/site-packages/transformers/generation/configuration_utils.py:606: UserWarning: `do_sample` is set to `False`. However, `top_p` is set to `0.8` -- this flag is only used in sample-based generation modes. You should set `do_sample=True` or unset `top_p`.
  warnings.warn(
Generated Text: end_header_id|>

This image depicts a small dog standing on a skateboard. The dog is a small breed with a white face, brown ears, and a brown body with black and gray patches. It has white paws and a black collar. The dog is standing on a skateboard with red wheels. The background is out of focus, but it appears to be a street with a blue door.<|eot_id|>
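Inference works because the converted directory now contains both the weights and the processor files, so it can be loaded like any hub checkpoint. A hedged sketch of what that loading looks like, assuming the `MllamaForConditionalGeneration` class from transformers (the helper name and default path are illustrative, not the exact code in multi_modal_infer.py):

```python
def load_converted_model(model_path: str = "./hf_converted/"):
    """Load a converted checkpoint plus its processor for multimodal inference.

    This only succeeds if preprocessor_config.json and chat_template.json
    were saved into model_path, which is exactly what this PR fixes.
    """
    # Lazy imports so this sketch can be loaded without transformers installed.
    from transformers import AutoProcessor, MllamaForConditionalGeneration

    model = MllamaForConditionalGeneration.from_pretrained(model_path)
    processor = AutoProcessor.from_pretrained(model_path)
    return model, processor
```

Without the processor files, `AutoProcessor.from_pretrained` on the converted directory fails, which is the failure mode the inference test above confirms is gone.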

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Thanks for contributing 🎉!

@wukaixingxp wukaixingxp merged commit d8b0eba into main Oct 21, 2024
3 checks passed