Expression Domain Translation Network for Cross-domain Head Reenactment
Taewoong Kang*1, Jeongsik Oh*2, Jaeseong Lee1, Sunghyun Park2, Jaegul Choo2
1Korea University, 2KAIST

Abstract: Despite the remarkable advancements in head reenactment, existing methods face challenges in cross-domain head reenactment, which aims to transfer human motions to domains outside the human, including cartoon characters. It is still difficult to extract motion from out-of-domain images due to their distinct appearances, such as large eyes. Recently, previous work introduced a large-scale anime dataset called AnimeCeleb and a cross-domain head reenactment model including an optimization-based mapping function to translate the human domain's expressions to the anime domain. However, we found that the mapping function, which relies on a subset of expressions, imposes limitations on the mapping of various expressions. To address this challenge, we introduce a novel expression domain translation network that transforms human expressions into anime expressions. Specifically, to maintain the geometric consistency of expressions between the input and output of the expression domain translation network, we employ a 3D geometric-aware loss function that reduces the distances between the vertices in the 3D meshes of the input and output. By doing so, it enforces a high-fidelity, one-to-one mapping between the two expression domains. Our method outperforms existing methods in both qualitative and quantitative analysis, marking a significant advancement in the field of cross-domain head reenactment.
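The 3D geometric-aware loss mentioned above boils down to a per-vertex distance between the meshes reconstructed from the source expression and the translated expression. The snippet below is only a minimal PyTorch sketch of this idea; the function name and tensor shapes are illustrative assumptions, not the exact implementation in this repository.

```python
import torch

def geometric_aware_loss(src_vertices: torch.Tensor,
                         tgt_vertices: torch.Tensor) -> torch.Tensor:
    """Minimal sketch of a 3D geometric-aware loss (assumption, for illustration).

    Both tensors are assumed to have shape (B, N, 3): B meshes with N vertices
    each, reconstructed from the input (human) expression and the translated
    (anime) expression. Penalizing per-vertex distances keeps the two meshes
    geometrically consistent.
    """
    # Mean Euclidean distance between corresponding vertices.
    return torch.norm(src_vertices - tgt_vertices, dim=-1).mean()
```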
Since our code includes the DECA encoder, please refer to the DECA repository to complete the environment setup, in particular the Getting Started / Requirements section.
In addition, we need the DECA code along with our code.
```bash
git clone https://github.com/YadiraF/DECA
cd DECA
pip install -r requirements.txt
```
Or set up a virtual environment by running:
```bash
git clone https://github.com/YadiraF/DECA
cd DECA
bash install_conda.sh
```
You also have to download the released DECA model as a pretrained model. The download link can be found in the DECA repository under Training / start training / released model.
We train our model using AnimeCeleb and VoxCeleb.
The AnimeCeleb dataset can be downloaded by submitting the request form. After downloading, specify the root directory of the AnimeCeleb dataset in the configuration file.
The VoxCeleb dataset is preprocessed following the method used in FOMM. You can follow the instructions in their repository to download and pre-process the videos.
To build the DECA data, several steps are required. First, replace `../DECA/decalib/deca.py` with our `deca_edit/deca.py`, and additionally copy our `deca_edit/make_dataset_iter.py` into `../DECA/demos`. Then, edit `make_dataset_iter.py` to point to your VoxCeleb dataset root and run it.
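For orientation, the sketch below shows what such a frame-wise extraction loop might look like; the exact logic lives in `deca_edit/make_dataset_iter.py`, and the paths and the set of saved coefficients here are assumptions. The `DECA`, `TestData`, and `encode` calls follow the public DECA API.

```python
# Hypothetical sketch of per-frame DECA coefficient extraction; the real
# logic is in deca_edit/make_dataset_iter.py. Paths and saved keys are
# assumptions for illustration.
import os
from glob import glob

import scipy.io
import torch
from decalib.deca import DECA
from decalib.datasets import datasets
from decalib.utils.config import cfg as deca_cfg

device = 'cuda'
deca = DECA(config=deca_cfg, device=device)

vox_root = '/path/to/vox/images/train'   # your VoxCeleb frame root (assumption)
out_root = '/path/to/vox/deca/train'     # where the .mat files will be written

for video_dir in sorted(glob(os.path.join(vox_root, '*.mp4'))):
    save_dir = os.path.join(out_root, os.path.basename(video_dir))
    os.makedirs(save_dir, exist_ok=True)
    # DECA's TestData handles face detection, cropping, and normalization.
    frames = datasets.TestData(video_dir, iscrop=True)
    for item in frames:
        image = item['image'].to(device)[None, ...]   # (1, 3, 224, 224)
        with torch.no_grad():
            codedict = deca.encode(image)             # shape / exp / pose / cam / ...
        # Which coefficients to keep is an assumption; adjust to match
        # what the training code expects.
        scipy.io.savemat(
            os.path.join(save_dir, item['imagename'] + '.mat'),
            {k: codedict[k].cpu().numpy() for k in ('shape', 'exp', 'pose', 'cam')})
```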
The final dataset folder has the following structure:
```
${DATASET_ROOT_FOLDER}
└───images
    └───train
        └───xxx.mp4
            └───0000000.png
            └───0000001.png
            ...
        ...
    └───test
        └───xxx.mp4
            └───0000000.png
            └───0000001.png
            ...
        ...
└───deca
    └───train
        └───xxx.mp4
            └───0000000.mat
            └───0000001.mat
            ...
        ...
    └───test
        └───xxx.mp4
            └───0000000.mat
            └───0000001.mat
            ...
        ...
```
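Once the frames and `.mat` files are laid out as above, pairing an image with its DECA coefficients is straightforward. The snippet below is a small sketch; the function name, the example paths, and the coefficient keys are assumptions rather than the repository's actual data-loading code.

```python
# Sketch of loading one frame and its DECA coefficients from the layout
# above. Function name and coefficient keys are assumptions.
import os

import scipy.io
from PIL import Image

def load_pair(dataset_root, split, video, frame_idx):
    img_path = os.path.join(dataset_root, 'images', split, video,
                            f'{frame_idx:07d}.png')
    mat_path = os.path.join(dataset_root, 'deca', split, video,
                            f'{frame_idx:07d}.mat')
    image = Image.open(img_path).convert('RGB')
    coeffs = scipy.io.loadmat(mat_path)   # e.g. 'exp', 'pose', ...
    return image, coeffs

# Example: first frame of one training clip.
image, coeffs = load_pair('/path/to/DATASET_ROOT_FOLDER', 'train', 'xxx.mp4', 0)
```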
We conduct experiments on the AnimeCeleb and VoxCeleb datasets at a resolution of 512 × 512. For convenience, we provide the trained networks used in our experiments via the following links.
| Anime_generator Style | Google Drive |
| --- | --- |
| Default | link |
| Cartoon | link |
You can get the pre-trained network of Animo here.
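If you want to sanity-check a downloaded checkpoint before running inference, plain PyTorch is enough; the file name and the presence of a `state_dict` key below are assumptions about the checkpoint format, not a documented interface of this repository.

```python
# Quick sanity check of a downloaded checkpoint. The file name and the
# 'state_dict' key are assumptions about the checkpoint format.
import torch

ckpt = torch.load('./pretrained/Anime_generator_2d/checkpoint.pth',
                  map_location='cpu')
state = ckpt.get('state_dict', ckpt) if isinstance(ckpt, dict) else ckpt
print(f'{len(state)} entries loaded')
```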
We need to train the Expression domain translation network (EDTN) and the Anime generator separately.
To train EDTN:
```bash
python train_EDTN.py --hyper_id=train_1 gpu=[0]
```
To train the Anime generator:
```bash
python ../Animo/train_AG.py --config-name=Anime_generator.yaml gpu=[0,1]
```
After training the two networks, you can create a GIF using `infer.py`. Before executing, make sure the paths to each of the weights are set correctly.
```bash
python infer.py --path_pose='./example/vox_deca' --anime_basis_path='./example/basis.png' --exp_path='./pretrained/Anime_generator_2d'
```
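If you want to inspect or re-assemble output frames into a GIF yourself, a small imageio snippet like the one below works; the frame directory and file pattern are assumptions, and `infer.py` may already write the GIF for you.

```python
# Sketch: re-assemble saved output frames into a GIF. The frame folder
# ('./results/frames') and file pattern are assumptions.
import glob

import imageio.v2 as imageio

frames = [imageio.imread(p) for p in sorted(glob.glob('./results/frames/*.png'))]
imageio.mimsave('./results/output.gif', frames, duration=0.04)  # ~25 fps
```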
If you want to feed your own video to the model, you can use our demo.
If you find this work useful for your research, please cite our paper:
```bibtex
@misc{kang2023expression,
      title={Expression Domain Translation Network for Cross-domain Head Reenactment},
      author={Taewoong Kang and Jeongsik Oh and Jaeseong Lee and Sunghyun Park and Jaegul Choo},
      year={2023},
      eprint={2310.10073},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
```