HIPPO

Source code for paper: HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization

Overview

We propose HIPPO, which represents tables as both text and images, and optimizes MLLMs to learn more comprehensive table information from these two modalities.

Specifically, HIPPO samples model responses from hybrid-modal table representations and designs a modality-consistent sampling strategy to enhance response diversity and mitigate modality bias during DPO training.
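For readers unfamiliar with DPO, the sketch below computes the standard DPO loss for a single chosen/rejected response pair in plain Python. It is not taken from this repository: the function name, the argument names, and the default `beta=0.1` are assumptions for illustration, and it does not cover how HIPPO samples or selects the pairs. The actual training code is under scripts and src.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative sketch).

    Each argument is the summed log-probability of a full response under
    the policy being trained or under the frozen reference model.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


if __name__ == "__main__":
    # The policy prefers the chosen response more than the reference does,
    # so the loss falls below log(2), the value for an indifferent policy.
    print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The loss equals log(2) when the policy and the reference agree, and shrinks as the policy widens the gap between the chosen and the rejected response relative to the reference.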

Environment Setup

Clone the repository

git clone https://github.com/NEUIR/HIPPO.git
cd HIPPO

Install Dependencies

conda create -n hippo python=3.10
conda activate hippo
pip install -r requirements.txt

Data Preparation

Download the MMTab Images

# test
wget https://huggingface.co/datasets/SpursgoZmy/MMTab/resolve/main/MMTab-eval_table_images_23K.zip
mkdir -p hippo
mv MMTab-eval_table_images_23K.zip hippo/
unzip hippo/MMTab-eval_table_images_23K.zip -d hippo/

# train
wget https://huggingface.co/datasets/SpursgoZmy/MMTab/resolve/main/MMTab-instruct_table_images_82K.zip
# destination assumed to be hippo/, matching the test images
mkdir -p hippo
mv MMTab-instruct_table_images_82K.zip hippo/
unzip hippo/MMTab-instruct_table_images_82K.zip -d hippo/
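Because the archives are large, it can be worth checking that a download completed before training or inference. The sketch below counts the image entries in a zip file using only the Python standard library. The helper name `count_images` and the set of image suffixes are assumptions for illustration; this check is not part of the repository.

```python
import zipfile
from pathlib import PurePosixPath

# Assumed image suffixes; adjust if the archives use other formats.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(zip_path):
    """Return the number of image files listed in the zip archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sum(
            1
            for name in archive.namelist()
            if PurePosixPath(name).suffix.lower() in IMAGE_SUFFIXES
        )
```

For example, `count_images("MMTab-eval_table_images_23K.zip")` should return a count in line with the archive name if the download is intact. A truncated download usually raises `zipfile.BadZipFile` instead.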

Reproduce

Train HIPPO

You can download the HIPPO checkpoint directly from here, or train the HIPPO model yourself with the provided scripts.

For training, first download the MiniCPM-V-2.6 model and the data. Then go to the scripts directory and construct the DPO data.

cd scripts
bash construct_dpo_data.bash

Alternatively, you can use the pre-constructed DPO data directly: dpo_data.

Then you can train the model.

cd scripts
bash train.bash

Inference HIPPO

For inference, go to the scripts directory and run inference with the HIPPO model:

cd scripts
bash inference.sh

Evaluation

For evaluation, you can use src/eval/MMTab_evaluation.ipynb to evaluate the performance.
