
Commit 8792651

Add the HIST and IGMTF model on Alpha360 (#1040)
* Commit the code of HIST and IGMTF on Alpha360
* add stock index
* Update README.md
* delete useless code
* fix the bug of code format with black
* fix pylint bugs
* fix the bugs of pylint
* fix pylint bugs
* fix flake8
1 parent 7bfc7e1 commit 8792651

11 files changed, +1149 -0 lines changed

README.md

+3
@@ -11,6 +11,7 @@
 Recent released features
 | Feature | Status |
 | -- | ------ |
+| HIST and IGMTF models | :chart_with_upwards_trend: [Released](https://github.com/microsoft/qlib/pull/1040) on Apr 10, 2022 |
 | Qlib notebook tutorial | 📖 [Released](https://github.com/microsoft/qlib/pull/1037) on Apr 7, 2022 |
 | Ibovespa index data | :rice: [Released](https://github.com/microsoft/qlib/pull/990) on Apr 6, 2022 |
 | Point-in-Time database | :hammer: [Released](https://github.com/microsoft/qlib/pull/343) on Mar 10, 2022 |
@@ -339,6 +340,8 @@ Here is a list of models built on `Qlib`.
 - [TCN based on pytorch (Shaojie Bai, et al. 2018)](examples/benchmarks/TCN/)
 - [ADARNN based on pytorch (YunTao Du, et al. 2021)](examples/benchmarks/ADARNN/)
 - [ADD based on pytorch (Hongshun Tang, et al. 2020)](examples/benchmarks/ADD/)
+- [IGMTF based on pytorch (Wentao Xu, et al. 2021)](examples/benchmarks/IGMTF/)
+- [HIST based on pytorch (Wentao Xu, et al. 2021)](examples/benchmarks/HIST/)
 
 Your PR of new Quant models is highly welcomed.

examples/benchmarks/HIST/README.md

+3
@@ -0,0 +1,3 @@
+# HIST
+* Code: [https://github.com/Wentao-Xu/HIST](https://github.com/Wentao-Xu/HIST)
+* Paper: [HIST: A Graph-based Framework for Stock Trend Forecasting via Mining Concept-Oriented Shared Information](https://arxiv.org/abs/2110.13716).
Binary file not shown.
@@ -0,0 +1,4 @@
+pandas==1.1.2
+numpy==1.21.0
+scikit_learn==0.23.2
+torch==1.7.0
@@ -0,0 +1,92 @@
+qlib_init:
+    provider_uri: "~/.qlib/qlib_data/cn_data"
+    region: cn
+market: &market csi300
+benchmark: &benchmark SH000300
+data_handler_config: &data_handler_config
+    start_time: 2008-01-01
+    end_time: 2020-08-01
+    fit_start_time: 2008-01-01
+    fit_end_time: 2014-12-31
+    instruments: *market
+    infer_processors:
+        - class: RobustZScoreNorm
+          kwargs:
+              fields_group: feature
+              clip_outlier: true
+        - class: Fillna
+          kwargs:
+              fields_group: feature
+    learn_processors:
+        - class: DropnaLabel
+        - class: CSRankNorm
+          kwargs:
+              fields_group: label
+    label: ["Ref($close, -2) / Ref($close, -1) - 1"]
+port_analysis_config: &port_analysis_config
+    strategy:
+        class: TopkDropoutStrategy
+        module_path: qlib.contrib.strategy
+        kwargs:
+            signal:
+                - <MODEL>
+                - <DATASET>
+            topk: 50
+            n_drop: 5
+    backtest:
+        start_time: 2017-01-01
+        end_time: 2020-08-01
+        account: 100000000
+        benchmark: *benchmark
+        exchange_kwargs:
+            limit_threshold: 0.095
+            deal_price: close
+            open_cost: 0.0005
+            close_cost: 0.0015
+            min_cost: 5
+task:
+    model:
+        class: HIST
+        module_path: qlib.contrib.model.pytorch_hist
+        kwargs:
+            d_feat: 6
+            hidden_size: 64
+            num_layers: 2
+            dropout: 0
+            n_epochs: 200
+            lr: 1e-4
+            early_stop: 20
+            metric: ic
+            loss: mse
+            base_model: LSTM
+            model_path: "benchmarks/LSTM/model_lstm_csi300.pkl"
+            stock2concept: "benchmarks/HIST/qlib_csi300_stock2concept.npy"
+            stock_index: "benchmarks/HIST/qlib_csi300_stock_index.npy"
+            GPU: 0
+    dataset:
+        class: DatasetH
+        module_path: qlib.data.dataset
+        kwargs:
+            handler:
+                class: Alpha360
+                module_path: qlib.contrib.data.handler
+                kwargs: *data_handler_config
+            segments:
+                train: [2008-01-01, 2014-12-31]
+                valid: [2015-01-01, 2016-12-31]
+                test: [2017-01-01, 2020-08-01]
+    record:
+        - class: SignalRecord
+          module_path: qlib.workflow.record_temp
+          kwargs:
+            model: <MODEL>
+            dataset: <DATASET>
+        - class: SigAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs:
+            ana_long_short: False
+            ann_scaler: 252
+        - class: PortAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs:
+            config: *port_analysis_config
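
Note on the `label` expression in this config: in Qlib's expression language, `Ref` with a negative offset looks forward in time, so `Ref($close, -2) / Ref($close, -1) - 1` is the return from the close of day t+1 to the close of day t+2, attached to day t. The sketch below reproduces that arithmetic on a plain list of closing prices. The function name `forward_return_label` is hypothetical and not part of Qlib; this is an illustration of the formula, not the library's implementation.

```python
from typing import List, Optional


def forward_return_label(close: List[float]) -> List[Optional[float]]:
    """Label for day t: close[t+2] / close[t+1] - 1.

    Mirrors "Ref($close, -2) / Ref($close, -1) - 1" on a single
    instrument. The last two days have no label because the future
    prices they need are not available.
    """
    n = len(close)
    labels: List[Optional[float]] = []
    for t in range(n):
        if t + 2 < n:
            labels.append(close[t + 2] / close[t + 1] - 1.0)
        else:
            labels.append(None)  # would be dropped by DropnaLabel
    return labels


if __name__ == "__main__":
    prices = [10.0, 11.0, 12.1, 12.1]
    print(forward_return_label(prices))
```

Rows whose label is missing correspond to what the `DropnaLabel` processor in `learn_processors` removes before training.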

examples/benchmarks/IGMTF/README.md

+4
@@ -0,0 +1,4 @@
+# IGMTF
+* Code: [https://github.com/Wentao-Xu/IGMTF](https://github.com/Wentao-Xu/IGMTF)
+* Paper: [IGMTF: An Instance-wise Graph-based Framework for
+Multivariate Time Series Forecasting](https://arxiv.org/abs/2109.06489).
@@ -0,0 +1,4 @@
+pandas==1.1.2
+numpy==1.21.0
+scikit_learn==0.23.2
+torch==1.7.0
@@ -0,0 +1,89 @@
+qlib_init:
+    provider_uri: "~/.qlib/qlib_data/cn_data"
+    region: cn
+market: &market csi300
+benchmark: &benchmark SH000300
+data_handler_config: &data_handler_config
+    start_time: 2008-01-01
+    end_time: 2020-08-01
+    fit_start_time: 2008-01-01
+    fit_end_time: 2014-12-31
+    instruments: *market
+    infer_processors:
+        - class: RobustZScoreNorm
+          kwargs:
+              fields_group: feature
+              clip_outlier: true
+        - class: Fillna
+          kwargs:
+              fields_group: feature
+    learn_processors:
+        - class: DropnaLabel
+        - class: CSRankNorm
+          kwargs:
+              fields_group: label
+    label: ["Ref($close, -2) / Ref($close, -1) - 1"]
+port_analysis_config: &port_analysis_config
+    strategy:
+        class: TopkDropoutStrategy
+        module_path: qlib.contrib.strategy
+        kwargs:
+            model: <MODEL>
+            dataset: <DATASET>
+            topk: 50
+            n_drop: 5
+    backtest:
+        start_time: 2017-01-01
+        end_time: 2020-08-01
+        account: 100000000
+        benchmark: *benchmark
+        exchange_kwargs:
+            limit_threshold: 0.095
+            deal_price: close
+            open_cost: 0.0005
+            close_cost: 0.0015
+            min_cost: 5
+task:
+    model:
+        class: IGMTF
+        module_path: qlib.contrib.model.pytorch_igmtf
+        kwargs:
+            d_feat: 6
+            hidden_size: 64
+            num_layers: 2
+            dropout: 0
+            n_epochs: 200
+            lr: 1e-4
+            early_stop: 20
+            metric: ic
+            loss: mse
+            base_model: LSTM
+            model_path: "benchmarks/LSTM/model_lstm_csi300.pkl"
+            GPU: 0
+    dataset:
+        class: DatasetH
+        module_path: qlib.data.dataset
+        kwargs:
+            handler:
+                class: Alpha360
+                module_path: qlib.contrib.data.handler
+                kwargs: *data_handler_config
+            segments:
+                train: [2008-01-01, 2014-12-31]
+                valid: [2015-01-01, 2016-12-31]
+                test: [2017-01-01, 2020-08-01]
+    record:
+        - class: SignalRecord
+          module_path: qlib.workflow.record_temp
+          kwargs:
+            model: <MODEL>
+            dataset: <DATASET>
+        - class: SigAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs:
+            ana_long_short: False
+            ann_scaler: 252
+        - class: PortAnaRecord
+          module_path: qlib.workflow.record_temp
+          kwargs:
+            config: *port_analysis_config
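
Note on `topk: 50` and `n_drop: 5` in the strategy section: the idea behind a top-k dropout rule is to hold roughly `topk` names and, on each rebalance, swap out at most `n_drop` of the weakest holdings for the strongest names not yet held, which caps turnover. The sketch below is a deliberately simplified, single-step version of that idea. It is not the code of Qlib's `TopkDropoutStrategy`; the function name `topk_dropout_step` and its inputs are hypothetical, and it assumes every held name has a score.

```python
from typing import Dict, List, Set, Tuple


def topk_dropout_step(
    scores: Dict[str, float],
    holdings: Set[str],
    topk: int,
    n_drop: int,
) -> Tuple[List[str], List[str]]:
    """Return (sell, buy) lists for one rebalance step.

    Simplified rule: consider the current holdings plus the best
    unheld names; sell any holding that ranks in the bottom n_drop
    of that combined set, then buy the best unheld names until the
    portfolio is back to topk positions.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    held = [s for s in ranked if s in holdings]
    candidates = [s for s in ranked if s not in holdings]

    # Enough new names to fill empty slots plus n_drop replacements.
    n_new = max(n_drop + topk - len(held), 0)
    new = candidates[:n_new]

    combined = sorted(held + new, key=scores.get, reverse=True)
    bottom = combined[-n_drop:] if n_drop > 0 else []
    sell = [s for s in bottom if s in holdings]

    n_buy = max(topk - (len(held) - len(sell)), 0)
    buy = new[:n_buy]
    return sell, buy


if __name__ == "__main__":
    demo_scores = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7, "E": 0.05}
    print(topk_dropout_step(demo_scores, {"A", "C", "E"}, topk=3, n_drop=1))
```

With `n_drop` much smaller than `topk`, as in the config above, only a small fraction of the portfolio can change on any one day under this rule.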

examples/benchmarks/README.md

+3
@@ -65,6 +65,9 @@ The numbers shown below demonstrate the performance of the entire `workflow` of
 | GATs (Petar Velickovic, et al.) | Alpha360 | 0.0476±0.00 | 0.3508±0.02 | 0.0598±0.00 | 0.4604±0.01 | 0.0824±0.02 | 1.1079±0.26 | -0.0894±0.03 |
 | TCTS(Xueqing Wu, et al.) | Alpha360 | 0.0508±0.00 | 0.3931±0.04 | 0.0599±0.00 | 0.4756±0.03 | 0.0893±0.03 | 1.2256±0.36 | -0.0857±0.02 |
 | TRA(Hengxu Lin, et al.) | Alpha360 | 0.0485±0.00 | 0.3787±0.03 | 0.0587±0.00 | 0.4756±0.03 | 0.0920±0.03 | 1.2789±0.42 | -0.0834±0.02 |
+| IGMTF(Wentao Xu, et al.) | Alpha360 | 0.0480±0.00 | 0.3589±0.02 | 0.0606±0.00 | 0.4773±0.01 | 0.0946±0.02 | 1.3509±0.25 | -0.0716±0.02 |
+| HIST(Wentao Xu, et al.) | Alpha360 | 0.0522±0.00 | 0.3530±0.01 | 0.0667±0.00 | 0.4576±0.01 | 0.0987±0.02 | 1.3726±0.27 | -0.0681±0.01 |
+
 
 - The selected 20 features are based on the feature importance of a lightgbm-based model.
 - The base model of DoubleEnsemble is LGBM.
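
Note on the metrics: both model configs in this commit select `metric: ic`. The header row of the table above lies outside this hunk, so treating its first numeric columns as IC, ICIR, Rank IC and Rank ICIR is an assumption here. Under the usual definitions, IC is the per-day cross-sectional Pearson correlation between predictions and labels, Rank IC is the same correlation on ranks, and ICIR is the mean of the daily IC divided by its standard deviation. The helper names below are hypothetical, and the ranking ignores ties for brevity; this is a sketch of the definitions, not Qlib's evaluation code.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Dict, List, Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def ranks(values: Sequence[float]) -> List[float]:
    """Ordinal ranks (0 = smallest); ties are not averaged."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    for r, i in enumerate(order):
        out[i] = float(r)
    return out


def ic_summary(
    preds_by_day: List[List[float]],
    labels_by_day: List[List[float]],
) -> Dict[str, float]:
    """Mean daily IC / Rank IC and their information ratios."""
    ic = [pearson(p, y) for p, y in zip(preds_by_day, labels_by_day)]
    ric = [pearson(ranks(p), ranks(y)) for p, y in zip(preds_by_day, labels_by_day)]
    return {
        "IC": mean(ic),
        "ICIR": mean(ic) / stdev(ic),  # sample standard deviation
        "Rank IC": mean(ric),
        "Rank ICIR": mean(ric) / stdev(ric),
    }


if __name__ == "__main__":
    preds = [[0.1, 0.2, 0.3]] * 3
    labels = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [1.0, 3.0, 2.0]]
    print(ic_summary(preds, labels))
```

The `±` values in the table are a separate spread reported next to each metric and are not computed by this sketch.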
