# TIVA-KG: A Multimodal Knowledge Graph with Text, Image, Video and Audio

## Knowledge Graph

TIVA-KG is a new multimodal knowledge graph equipped with **multiple modalities** and **triplet grounding**, giving it superior expressive power.

![kg structure](pics/tiva.png)

If you are interested in TIVA-KG itself, please check out our website at http://mn.cs.tsinghua.edu.cn/tivakg. This repo instead focuses on the Quadruple Embedding Baseline (QEB) model, which exploits the knowledge in TIVA-KG.
## Method

QEB is, as its name indicates, a simple translational model, yet it can exploit the novel features of TIVA-KG.

As a translational model, QEB follows the basic rule $h + r \approx t$. Implementing this rule with various combinations of structural and multimodal embeddings yields a family of energy functions.
 | ||
|
||
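For intuition, the two basic terms plausibly take the usual TransE form (a hedged sketch; the exact definitions of all ten terms, including the cross terms that mix the two embedding spaces, are the ones listed in the figure):

$$
\begin{align}
E_s(h, r, t) = \lVert h_s + r - t_s \rVert, \qquad E_m(h, r, t) = \lVert h_m + r - t_m \rVert, \nonumber
\end{align}
$$

where $h_s, t_s$ are structural embeddings and $h_m, t_m$ are multimodal embeddings of the head and tail.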
Putting them together, we get the final training objective.

$$
\begin{align}
E(h, r, t) &= E_s + E_m + E_{CS} + E_{CM} + E_{MSM} + E_{SMS} \nonumber \\
&+ E_{MSS} + E_{SSM} + E_{SMM} + E_{MMS}.
\end{align}
$$
$$
\begin{align}
L_{\textsf{head}} = \sum\nolimits_{(h,r,t)\in T} \sum\nolimits_{(h,r,t')\in T_{\textsf{tail}}'} \max(\gamma + E(h,r,t) - E(h,r,t'), 0),
\end{align}
$$

$$
\begin{align}
L_{\textsf{tail}} = \sum\nolimits_{(h,r,t)\in T} \sum\nolimits_{(h',r,t)\in T_{\textsf{head}}'} \max(\gamma + E(t,-r,h) - E(t,-r,h'), 0),
\end{align}
$$
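In TensorFlow 1.x this margin ranking loss is straightforward to express; as a hedged sketch (illustrative variable names, not the repo's actual train.py):

```python
import tensorflow as tf

# Hedged sketch of the head-side margin ranking loss above.
# pos_energy holds E(h, r, t) for true triples, neg_energy holds
# E(h, r, t') for triples with corrupted tails; margin is gamma.
pos_energy = tf.placeholder(tf.float32, [None])
neg_energy = tf.placeholder(tf.float32, [None])
margin = 10.0

# Sum over the batch of max(gamma + E(h,r,t) - E(h,r,t'), 0)
loss_head = tf.reduce_sum(tf.maximum(margin + pos_energy - neg_energy, 0.0))
```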
These are the results from our experiments:

![results](pics/results.png)
## Usage

Prepare the environment with:

```bash
conda create -n tiva-kg python=2.7 tensorflow-gpu=1.14
conda activate tiva-kg
```
Prepare the data by downloading it from https://mailtsinghuaeducn-my.sharepoint.com/:f:/g/personal/autogl_mail_tsinghua_edu_cn/EudDw-AAwVlFnndC6swJKtQBLiFIFpWIB9kmbt_Gnh6DQw?e=bvsiDC. The archive experiment.tar.gz contains the experiment-related files; running the code also requires mmkg_data.tar.gz.

Then edit parameters.py and test_parameters.py to point to wherever you put these files and to set your desired configuration; use model_id to distinguish different settings (see the sketch below).
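For example, the path and identifier entries in parameters.py look like this (placeholder paths to replace with your own; the full file is shown below):

```python
# In parameters.py: point these placeholders at your extracted data
all_triples_file = "/path/to/all.txt"
structure_embedding_file = "/path/to/structure.hdf5"
multimodal_embedding_file = "/path/to/all.hdf5"

# model_id names the run; checkpoints and results land under
# weights/best_<model_id>/, weights/current_<model_id>/, results/results_<model_id>/
model_id = "tiva"
```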
To run the experiment, simply:

```bash
python train.py
python test.py
```
## Citation

```text
@article{wang2023tiva,
  title={TIVA-KG: A Multimodal Knowledge Graph with Text, Image, Video and Audio},
  author={Wang, Xin and Meng, Benyuan and Chen, Hong and Meng, Yuan and Lv, Ke and Zhu, Wenwu},
  year={2023}
}
```
parameters.py:
```python
import tensorflow as tf
import os

# Embedding dimensions
mapping_size = 100
relation_structural_embeddings_size = 40
entity_structural_embeddings_size = 40
#entity_multimodal_embeddings_size = 300
use_text = True
entity_text_embeddings_size = 300
use_image = True
entity_image_embeddings_size = 2048
#relation_image_embeddings_size = 2048
use_video = True
entity_video_embeddings_size = 8 * 2048
#relation_audio_embeddings_size = 8 * 16 * 2048# + 8 * 2048
use_audio = True
audio_duration = 50          # number of audio frames
audio_per_frame_size = 128   # features per audio frame
entity_audio_embeddings_size = audio_per_frame_size * audio_duration
#relation_video_embeddings_size = 128 * audio_duration

use_rel_mm = True  # also use multimodal embeddings for relations (triplet grounding)

# Network and training hyperparameters
#nr_neuron_dense_layer_sum = 100
nr_neuron_dense_layer_1 = 2048
nr_neuron_dense_layer_2 = 1024
dropout_ratio = 0.0
margin = 10                  # gamma in the margin ranking loss
training_epochs = 1000
batch_size = 100
display_step = 1
activation_function = tf.nn.tanh
initial_learning_rate = 0.001
head_mult = [1 for _ in range(5)] + [1, 1, 1, 1, 1]  # apparent per-term weights over the ten energies (head side)
tail_mult = [1 for _ in range(5)] + [1, 1, 1, 0, 0]  # tail side zeroes the last two terms

# Loading the data: replace the placeholder paths with your own
all_triples_file = "/path/to/all.txt"
train_triples_file = "/path/to/train.txt"
test_triples_file = "/path/to/test.txt"
valid_triples_file = "/path/to/valid.txt"

entity_full_info = '/path/to/entities.json'
relation_full_info = '/path/to/relations.json'

structure_embedding_file = '/path/to/structure.hdf5'
entity2id = '/path/to/entity2id.json'
relation2id = '/path/to/relation2id.json'
text_embedding_file = '/path/to/my_embedding.hdf5'
multimodal_embedding_file = '/path/to/all.hdf5'

#model_id = "FBIMG_HMS_MM128_dropout0_m10_tanh_mapped_1_layer_02" #_mm_loss_10m" #"HMS_standard_vgg128_noreg" #"HMS_standard_full_mapping_elu_300_100"
model_id = 'tiva'  # identifies this configuration in checkpoint and result paths

checkpoint_best_valid_dir = "weights/best_" + model_id + "/"
checkpoint_current_dir = "weights/current_" + model_id + "/"
results_dir = "results/results_" + model_id + "/"

# Create the output directories if they do not exist yet
if not os.path.exists(checkpoint_best_valid_dir):
    os.makedirs(checkpoint_best_valid_dir)

if not os.path.exists(checkpoint_current_dir):
    os.makedirs(checkpoint_current_dir)

if not os.path.exists(results_dir):
    os.makedirs(results_dir)

model_current_weights_file = checkpoint_current_dir + model_id + "_current"
current_model_meta_file = checkpoint_current_dir + model_id + "_current.meta"

model_weights_best_valid_file = checkpoint_best_valid_dir + model_id + "_best_hits"
best_valid_model_meta_file = checkpoint_best_valid_dir + model_id + "_best_hits.meta"

result_file = results_dir + model_id + "_results.txt"
log_file = results_dir + model_id + "_log.txt"
loss_file = results_dir + model_id + "_loss.json"
```
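Since this is a plain module-level config, train.py and test.py can presumably consume it with a bare import; a minimal sketch (not the repo's actual code):

```python
# Hypothetical usage sketch: read the config by importing the module
import parameters as p

print("Training %s for %d epochs (batch size %d)"
      % (p.model_id, p.training_epochs, p.batch_size))
```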