fix: fix some small bugs in report-factor loop #152

Merged
85 commits merged, Aug 2, 2024
Showing changes from 1 commit
Commits (85)
49541c6
Init todo
you-n-g Jul 17, 2024
f3b097e
update all code
WinstonLiyt Jul 18, 2024
61dc8ce
update
WinstonLiyt Jul 18, 2024
6481278
Extract factors from financial reports loop finished
WinstonLiyt Jul 19, 2024
16e3b3a
Merge branch 'main' of https://github.com/microsoft/RD-Agent into fix…
WinstonLiyt Jul 19, 2024
f2d031e
Merge branch 'main' of https://github.com/microsoft/RD-Agent into fix…
WinstonLiyt Jul 19, 2024
ce30c04
Fix two small bugs.
WinstonLiyt Jul 19, 2024
aa2ffac
Delete rdagent/app/qlib_rd_loop/run_script.sh
WinstonLiyt Jul 19, 2024
cecb4c5
Minor mod
you-n-g Jul 19, 2024
61d352f
Delete rdagent/app/qlib_rd_loop/nohup.out
you-n-g Jul 19, 2024
367a1ce
Fix a small bug in file reading.
WinstonLiyt Jul 22, 2024
7887905
some updates
WinstonLiyt Jul 22, 2024
d5f36d9
Update the detailed process and prompt of factor loop.
WinstonLiyt Jul 22, 2024
b4594ef
Merge branch 'main' into fix_some_errors_when_debug_factor
WinstonLiyt Jul 22, 2024
aa4c7e5
Evaluation & dataset
taozhiwang Jul 23, 2024
6d022b8
Optimize the prompt for generating hypotheses and feedback in the fac…
WinstonLiyt Jul 23, 2024
c51a6f0
Generate new data
taozhiwang Jul 23, 2024
90bd7e3
dataset generation
taozhiwang Jul 24, 2024
4fd9733
Performed further optimizations on the factor loop and report extract…
WinstonLiyt Jul 24, 2024
1da2635
Merge branch 'main' into fix_some_errors_when_debug_factor
WinstonLiyt Jul 24, 2024
1d66f16
Update rdagent/components/coder/factor_coder/CoSTEER/evaluators.py
you-n-g Jul 24, 2024
b1bdfdd
Update package.txt for fitz.
WinstonLiyt Jul 24, 2024
50a8ff0
Merge branch 'fix_some_errors_when_debug_factor' of https://github.co…
WinstonLiyt Jul 24, 2024
864f5a0
add the result
taozhiwang Jul 24, 2024
048c6fe
Performed further optimizations on the factor loop and report extract…
WinstonLiyt Jul 24, 2024
f9b57b9
Analysis
taozhiwang Jul 24, 2024
b9d9194
Optimized log output.
WinstonLiyt Jul 24, 2024
9218e5f
Merge branch 'fix_some_errors_when_debug_factor' of https://github.co…
WinstonLiyt Jul 24, 2024
ec5cc64
Merge branch 'fix_some_errors_when_debug_factor' into main
WinstonLiyt Jul 24, 2024
db82b67
Factor update
taozhiwang Jul 24, 2024
dcb7e07
Optimized log output.
WinstonLiyt Jul 24, 2024
265b6b3
A draft of the "Quick Start" section for README
WinstonLiyt Jul 24, 2024
39282eb
Merge branch 'main' of https://github.com/microsoft/RD-Agent into doc…
WinstonLiyt Jul 24, 2024
68f0a75
Add scenario descriptions.
WinstonLiyt Jul 24, 2024
52dc938
Updates
taozhiwang Jul 25, 2024
11980dc
Adjust content
you-n-g Jul 25, 2024
12c0eba
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 25, 2024
c9809f2
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
98906af
Enable logging of backtesting in Qlib and store rich-text description…
WinstonLiyt Jul 25, 2024
b97f24f
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 25, 2024
b7a04c2
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
702c830
Reformat analysis.py
taozhiwang Jul 25, 2024
ac80c93
CI fix
taozhiwang Jul 25, 2024
eb1c04e
Refactor
you-n-g Jul 25, 2024
f9295e0
remove useless code
you-n-g Jul 25, 2024
cab4f46
Merge branch 'benchmark'
taozhiwang Jul 25, 2024
d2770c6
fix bugs (#111)
SH-Src Jul 25, 2024
f4d553a
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
22b176b
Fix two small bugs.
WinstonLiyt Jul 25, 2024
26f2f74
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 25, 2024
f44e4ae
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 25, 2024
fb1478e
Fix a merge bug.
WinstonLiyt Jul 25, 2024
09e2d88
Fix two small bugs.
WinstonLiyt Jul 26, 2024
33b70e2
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 26, 2024
cf568a5
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 26, 2024
05869ce
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 26, 2024
9c64f14
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 26, 2024
787450c
fix some bugs.
WinstonLiyt Jul 29, 2024
b36e1cf
Fix some format bugs.
WinstonLiyt Jul 29, 2024
3e42a7b
Restore a file.
WinstonLiyt Jul 29, 2024
87dba2d
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 29, 2024
fb28226
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 29, 2024
ad7d18d
Fix a format bug.
WinstonLiyt Jul 29, 2024
9384937
draft renew of evaluators
WinstonLiyt Jul 30, 2024
557f3a7
fix a small bug.
WinstonLiyt Jul 30, 2024
a06d7f4
fix a small bug
WinstonLiyt Jul 30, 2024
13df05e
Support Factor Report Loop
you-n-g Jul 30, 2024
0e7a90f
Update framework for extracting factors from research reports.
WinstonLiyt Jul 30, 2024
5860055
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Jul 30, 2024
5f5675a
Merge branch 'main' into docs_and_demo
WinstonLiyt Jul 30, 2024
2a07947
Refactor report-based factor extraction and fix minor bugs.
WinstonLiyt Aug 1, 2024
f591636
fix a small bug of log.
WinstonLiyt Aug 1, 2024
4bdb1de
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Aug 1, 2024
4f743b2
Merge branch 'main' into docs_and_demo
WinstonLiyt Aug 1, 2024
34f335a
change some prompts
WinstonLiyt Aug 1, 2024
6dc4369
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Aug 1, 2024
a8cb022
Merge branch 'main' into docs_and_demo
WinstonLiyt Aug 1, 2024
f7046b0
improve factor_runner
WinstonLiyt Aug 2, 2024
ea5e114
fix a small bug
WinstonLiyt Aug 2, 2024
2ef60e9
change some prompts
WinstonLiyt Aug 2, 2024
6fd15a7
cancel some comments
WinstonLiyt Aug 2, 2024
a883a78
cancel some comments and fix some bugs
WinstonLiyt Aug 2, 2024
daf21b0
Merge branch 'main' of https://github.com/microsoft/RD-Agent into main
WinstonLiyt Aug 2, 2024
101bac2
Merge branch 'main' into docs_and_demo
WinstonLiyt Aug 2, 2024
fd6f44b
fix some bugs in factor from reports loop
WinstonLiyt Aug 2, 2024
improve factor_runner
WinstonLiyt committed Aug 2, 2024
commit f7046b0f2666ec3dcfba4fe62ae488a486dfdf92
30 changes: 30 additions & 0 deletions rdagent/scenarios/qlib/developer/factor_runner.py
@@ -37,6 +37,32 @@ class QlibFactorRunner(CachedRunner[QlibFactorExperiment]):
    - results in `mlflow`
    """

    def calculate_information_coefficient(
        self, concat_feature: pd.DataFrame, SOTA_feature_column_size: int, new_feature_columns_size: int
    ) -> pd.DataFrame:
        res = pd.Series(index=range(SOTA_feature_column_size * new_feature_columns_size))
        for col1 in range(SOTA_feature_column_size):
            for col2 in range(SOTA_feature_column_size, SOTA_feature_column_size + new_feature_columns_size):
                res.loc[col1 * new_feature_columns_size + col2 - SOTA_feature_column_size] = concat_feature.iloc[
                    :, col1
                ].corr(concat_feature.iloc[:, col2])
        return res

    def deduplicate_new_factors(self, SOTA_feature: pd.DataFrame, new_feature: pd.DataFrame) -> pd.DataFrame:
        # calculate the IC between each column of SOTA_feature and new_feature
        # if the IC is larger than a threshold, remove the new_feature column
        # return the new_feature

        concat_feature = pd.concat([SOTA_feature, new_feature], axis=1)
        IC_max = (
            concat_feature.groupby("datetime")
            .apply(lambda x: self.calculate_information_coefficient(x, SOTA_feature.shape[1], new_feature.shape[1]))
            .mean()
        )
        IC_max.index = pd.MultiIndex.from_product([range(SOTA_feature.shape[1]), range(new_feature.shape[1])])
        IC_max = IC_max.unstack().max(axis=0)
        return new_feature.iloc[:, IC_max[IC_max < 0.99].index]

    def develop(self, exp: QlibFactorExperiment) -> QlibFactorExperiment:
        """
        Generate the experiment by processing and combining factor data,
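
Note on the hunk above: the deduplication step computes, for each date, the correlation between every SOTA column and every new column, averages those values over dates, and keeps only the new columns whose largest averaged correlation stays below 0.99. The following is a minimal standalone sketch of that idea, not the code from this commit. It swaps the explicit double loop for `DataFrame.corr` on each date group. The function names `mean_cross_sectional_corr` and `drop_near_duplicates`, the toy data, and the assumption that all column names are unique are hypothetical choices made for illustration.

```python
import numpy as np
import pandas as pd


def mean_cross_sectional_corr(sota: pd.DataFrame, new: pd.DataFrame) -> pd.DataFrame:
    """Average the per-date Pearson correlation between SOTA and new columns.

    Assumes both frames share a MultiIndex with a level named "datetime"
    and that all column names are unique (an assumption of this sketch).
    """
    combined = pd.concat([sota, new], axis=1)
    blocks = [
        # Full correlation matrix for one date, sliced to the SOTA-vs-new block.
        group.corr().loc[sota.columns, new.columns].to_numpy()
        for _, group in combined.groupby(level="datetime")
    ]
    # Element-wise mean over dates, ignoring dates where a correlation is NaN.
    mean_block = np.nanmean(np.stack(blocks), axis=0)
    return pd.DataFrame(mean_block, index=sota.columns, columns=new.columns)


def drop_near_duplicates(sota: pd.DataFrame, new: pd.DataFrame, threshold: float = 0.99) -> pd.DataFrame:
    """Keep new columns whose largest mean correlation with any SOTA column is below threshold."""
    max_corr = mean_cross_sectional_corr(sota, new).max(axis=0)
    return new.loc[:, max_corr[max_corr < threshold].index]


# Toy data: 2 dates x 4 instruments (hypothetical values).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["2024-01-02", "2024-01-03"]), ["A", "B", "C", "D"]],
    names=["datetime", "instrument"],
)
sota = pd.DataFrame({"sota_a": [1.0, 2.0, 3.0, 4.0, 4.0, 1.0, 3.0, 2.0]}, index=index)
new = pd.DataFrame(
    {
        "copy_of_a": sota["sota_a"] * 2.0 + 1.0,  # linear transform of sota_a
        "fresh": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0],
    },
    index=index,
)

kept = drop_near_duplicates(sota, new)
print(list(kept.columns))
```

In this toy case `copy_of_a` is a linear transform of `sota_a`, so it is filtered out and only `fresh` survives. As in the diff, the threshold is applied to the signed correlation, so a perfectly anti-correlated column would be kept. Also worth noting when reading the hunk: `calculate_information_coefficient` is annotated as returning `pd.DataFrame` but builds and returns a `pd.Series`.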
@@ -66,11 +92,15 @@ def develop(self, exp: QlibFactorExperiment) -> QlibFactorExperiment:

        # Combine the SOTA factor and new factors if SOTA factor exists
        if SOTA_factor is not None and not SOTA_factor.empty:
            new_factors = self.deduplicate_new_factors(SOTA_factor, new_factors)
            if new_factors.empty:
                raise FactorEmptyError("No valid factor data found to merge.")
            combined_factors = pd.concat([SOTA_factor, new_factors], axis=1).dropna()
        else:
            combined_factors = new_factors

        # Sort and nest the combined factors under 'feature'
        # print(combined_factors)
        combined_factors = combined_factors.sort_index()
        combined_factors = combined_factors.loc[:, ~combined_factors.columns.duplicated(keep='last')]
        new_columns = pd.MultiIndex.from_product([["feature"], combined_factors.columns])
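
Note on the hunk above: after deduplication, the SOTA and new factors are concatenated column-wise, rows with missing values are dropped, the index is sorted, same-named columns are resolved in favor of the last occurrence, and the remaining columns are nested under a top-level `feature` label. The sketch below walks through those steps on toy data. The column names and values are hypothetical, and the final assignment of the new `MultiIndex` to `combined.columns` is an assumption, since the diff is cut off right after `new_columns` is built.

```python
import pandas as pd

# Toy frames sharing a (datetime, instrument) index; values are hypothetical.
index = pd.MultiIndex.from_tuples(
    [("2024-01-02", "B"), ("2024-01-02", "A"), ("2024-01-03", "A")],
    names=["datetime", "instrument"],
)
sota = pd.DataFrame({"momentum": [0.1, 0.2, 0.3], "value": [1.0, 2.0, 3.0]}, index=index)
new = pd.DataFrame({"momentum": [9.1, 9.2, 9.3], "size": [5.0, None, 7.0]}, index=index)

# Column-wise concat, then drop any row that has a missing value in any factor.
combined = pd.concat([sota, new], axis=1).dropna()
combined = combined.sort_index()

# When a name appears twice, keep only the last occurrence (the new factor here).
combined = combined.loc[:, ~combined.columns.duplicated(keep="last")]

# Nest every factor under a top-level "feature" label.
# Assumption: the real code assigns new_columns back in a line the diff does not show.
combined.columns = pd.MultiIndex.from_product([["feature"], combined.columns])

print(combined)
```

With `keep="last"`, the `momentum` column from `new` replaces the SOTA column of the same name, and the row whose `size` is missing disappears because `dropna()` runs on the concatenated frame.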