Released on August 19, 2016
This is the demo implementation of Shizhan Zhu et al.'s ECCV-16 work Deep Cascaded Bi-Network for Face Hallucination. We apologize for not providing the training code: training requires a lot of manual effort to analyze and manipulate the intermediate results, so it is difficult to provide a single training script. We are nevertheless happy to share any training details. Please email Shizhan Zhu at [email protected] for details if you wish.
The project is open source under the BSD-3 license (see the `LICENSE` file). The code may be used freely for academic purposes only. If you want to apply it to industrial products, please first send an email to Shizhan Zhu at [email protected].
If you use the code as part of your research project, please cite our work as follows:
@inproceedings{zhu2016deep,
title={Deep Cascaded Bi-Network for Face Hallucination},
author={Zhu, Shizhan and Liu, Sifei and Loy, Chen Change and Tang, Xiaoou},
booktitle={European Conference on Computer Vision},
year={2016}
}
The code is based on caffe.
This implementation has been modified toward a purely deep solution, which gives slightly more robust results and also makes the code easier to release. The original implementation used an internal SIFT API when aligning faces. We therefore no longer provide its VLFeat-retrained demo version for aligning faces, and the code depends only on caffe.
- Install caffe. Note that the Matlab binary must also be compiled.
- Copy all the folders and files from this repo into the installed caffe, e.g. put the folder `codes` in the root directory, put the folder `examples/sr1` in the `examples` folder, and put everything else in the root directory.
- Run the script `initial.sh` to obtain the models and the provided test data. You can of course use your own test data by putting it into the folder `examples/sr1/demo/image_source`.
- Go to the directory `examples/sr1/demo` and, in Matlab, run `demoCBN.m` to view the results.
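Before running the demo, it can help to confirm that the copy step put the folders where the scripts expect them. The following is only a minimal sketch, assuming the layout described in the steps above and that Matlab's current directory is `examples/sr1/demo`. The variable names `required` and `missing` are illustrative and not part of the released code.

```matlab
% Run from examples/sr1/demo inside the caffe root directory.
% Folder names are taken from the steps above; the check itself is a sketch.
required = {'../../../matlab', '../../../codes', 'image_source'};
missing = {};
for k = 1:numel(required)
    % exist(name, 'dir') returns 7 when the folder is present.
    if exist(required{k}, 'dir') ~= 7
        missing{end+1} = required{k};
    end
end
if isempty(missing)
    disp('All expected folders were found.');
else
    fprintf('Missing folder: %s\n', missing{:});
end
```

If a folder is reported missing, revisit the copy step (or the `initial.sh` step) before running the demo.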
Note that the main algorithm presented in our paper is run entirely by CBN.m; demoCBN.m only provides a way to generate the input LR samples. In demoCBN.m we start from an HR image only in order to obtain the Viola-Jones (VJ) face detection box. If you want to test LR faces that were not downsampled from HR images, you need to find your own way to obtain the VJ face detection box.
In practice, you can feed any LR face to CBN.m and view the output. Take care that all facial parts, including the face contour, are present in the input LR image. In addition, the face must not be too small (smaller than 5pxIOD, i.e. 5 pixels of inter-ocular distance). This is why we recommend preparing the input in the same way as in demoCBN.m.
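If you have eye-center coordinates for your own LR face, the size requirement can be checked before calling CBN.m. This is a minimal sketch, assuming IOD is measured as the Euclidean distance between the two eye centers. The coordinates and the variable names `leftEye`, `rightEye` and `iod` are hypothetical and not part of the released code.

```matlab
% Hypothetical eye-center coordinates [x y], in pixels of the LR image.
leftEye  = [5.0 6.0];
rightEye = [10.5 6.2];

% Inter-ocular distance (IOD): Euclidean distance between the eye centers.
iod = norm(rightEye - leftEye);

% 5pxIOD is the lower bound stated in the text above.
if iod < 5
    warning('Face is %.2f px IOD, below the 5 px IOD minimum.', iod);
else
    fprintf('Face is %.2f px IOD; the size requirement is met.\n', iod);
end
```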
Perhaps one of the most interesting findings of this paper concerns the so-called ghosting effect. To see an illustration, run the following code (after finishing Steps 1-3 of the Installation part).
>> clear; addpath(genpath('../../../matlab')); addpath(genpath('../../../codes'));
>> h4 = CBN_ghosting(ones([16 16 1 2],'single')/3);
>> subplot(121); imshow(ones([16 16])/3); title('Input to CBN');
>> subplot(122); imshow(h4(:,:,:,1)); title('Output (Ghosting Effect)');
You should see the uniform gray input in the left panel and the CBN output, which exhibits the ghosting effect, in the right panel.
Enjoy! :P
The code runs on Unix Matlab, version R2014b or earlier. We apologize for the inconvenience. The problem lies in the image rigid transformation functions, which have not yet been adapted to the newer functions such as `imwarp` that apply from R2015a onward. We will refactor them soon.
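A guard along the following lines could be placed at the top of your own script to flag an unsupported setup early. It is a sketch only, relying on the fact that R2015a corresponds to Matlab version 8.5. The variable name `isSupported` is illustrative and not part of the released code.

```matlab
% R2014b is Matlab version 8.4; R2015a is version 8.5.
% verLessThan returns true when the running release is older than 8.5.
isSupported = isunix && verLessThan('matlab', '8.5');

if ~isSupported
    warning(['This demo targets Unix Matlab R2014b or earlier; ', ...
             'the image transformation calls may fail on this setup.']);
end
```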
The current input face size in the demo is 5pxIOD. In our paper we state that the input should preferably be no smaller than 5pxIOD. If the input face size is between 5pxIOD and 10pxIOD, we still go through the whole hallucination process (the slightly higher resolution compared with 5pxIOD helps face alignment, but hallucination still has to start from 5pxIOD). If the input face size is larger than 10pxIOD, we suggest first performing face alignment and then using the fixed alignment result for hallucination (we observed that some alignment approaches, such as CFSS with its model trained for the VJ detector, suffice at that resolution). In that case, all the face alignment networks as well as the first hallucination cascade are skipped.
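The size rules above can be summarized as a small decision sketch. It is illustrative only: the variable names `iod` and `route` are hypothetical, the released code does not expose such a switch, and treating exactly 10pxIOD as part of the lower range is an assumption, since the text does not specify the boundary.

```matlab
iod = 7.5;  % hypothetical input face size, in px IOD

if iod < 5
    % Below the minimum recommended in the paper.
    route = 'too small: below the recommended 5pxIOD minimum';
elseif iod <= 10
    % Assumption: the boundary value 10 is handled by the full pipeline.
    route = 'full pipeline: all alignment networks and hallucination cascades';
else
    route = ['external alignment first, then hallucination with the ', ...
             'alignment networks and first cascade skipped'];
end
fprintf('%.1f px IOD -> %s\n', iod, route);
```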
For training, we used the modified version of caffe by Dr. Yuanjun Xiong et al. We would like to thank them for their wonderful work!
Suggestions and opinions on this work (both positive and negative) are very welcome. Please contact the author by email at [email protected].
BSD-3, see the `LICENSE` file for details.