This repository has been archived by the owner on Jun 9, 2024. It is now read-only.
v0.0.1
What's Changed
- First commit for AutoGPT Benchmarks by @dschonholtz in #1
- Typo in README.md by @ambujpawar in #2
- Remove the submodule, reference OpenAI directly rather than running it on the command line, fix logging by @dschonholtz in #16
- Update README.md by @dschonholtz in #17
- Graphs for evals by @rihp in #20
- windows docs make workspace if not there by @dschonholtz in #25
- EvalNames with dates for the eval run filename and compatibility with 0.3.0 by @dschonholtz in #26
- init first challenge template by @ScarletPan in #34
- start fixtures, types, challenge creation, mock run (stable) by @SilenNaihin in #37
- Add automatic regression markers by @SilenNaihin in #38
- MockManager, mock_func in data.json by @SilenNaihin in #39
- addition of basic challenges, easier challenge creation, --mock flag, adding mini-agi by @SilenNaihin in #40
- Update README.md by @SilenNaihin in #41
- adding hook to integrate agnostically by @SilenNaihin in #42
- Integrate one challenge to auto gpt by @merwanehamadi in #44
- Add static linters ci by @merwanehamadi in #45
- Run regression tests on push to master and stable by @merwanehamadi in #46
- Integrate with gpt engineer by @merwanehamadi in #47
- Integrate smol developer with agbenchmark by @merwanehamadi in #48
- Explain how to benchmark new agents by @merwanehamadi in #49
- local runs, home_path config, submodule miniagi by @SilenNaihin in #50
- Add retrieval challenge test + run tests on CI pipeline by @merwanehamadi in #51
- Add pr template by @merwanehamadi in #52
- Add information retrieval 3 by @merwanehamadi in #54
- Change test dependencies by @merwanehamadi in #55
- dynamic workspace path by @SilenNaihin in #56
- Add basic memory challenge by @merwanehamadi in #57
- Rename '--reg' flag to '--maintain' by @merwanehamadi in #58
- Add 'Remember multiple ids' memory challenge by @merwanehamadi in #59
- added caching based on file key by @SilenNaihin in #62
- Add 'remember ids with noise' challenge by @merwanehamadi in #61
- Add 'remember phrases with noise' challenge by @merwanehamadi in #63
- fix home_path, local mini-agi run works by @SilenNaihin in #64
- Add 'Debug simple typo with guidance' challenge by @merwanehamadi in #65
- Add "Debug code without guidance" challenge by @merwanehamadi in #66
- Get rid of get file path by using the data.json convention to store the challenge information by @merwanehamadi in #67
- Print out all of stdout on each process poll. by @erik-megarad in #69
- Add .txt to memory challenges by @merwanehamadi in #70
- Fix memory challenge 2 by @merwanehamadi in #71
- Use artifacts out instead of python code by @merwanehamadi in #72
- i/o workspace, adding superagi by @SilenNaihin in #60
- fixing the incorrect addition of superagi by @SilenNaihin in #73
- quality of life improvements & fixes by @SilenNaihin in #75
- Fix debug code challenge by @merwanehamadi in #76
- Add gpt engineer to ci by @merwanehamadi in #78
- just json, no test files by @SilenNaihin in #77
- Combine all agents into one ci.yml by @merwanehamadi in #79
- adding search interface challenge and cleaning repo by @SilenNaihin in #80
- Add Helicone by @merwanehamadi in #81
- Add "Simple web server" challenge by @merwanehamadi in #74
- added --test, consolidate files, reports working by @SilenNaihin in #83
- Fix tests ci by @merwanehamadi in #82
- All Agents log to helicone automatically by @merwanehamadi in #85
- Fix Auto-GPT integration by adding python module as entrypoint by @merwanehamadi in #86
- Fix Auto-GPT looping forever by @merwanehamadi in #87
- Add custom properties to Helicone by @merwanehamadi in #91
- Enable cache again by @merwanehamadi in #92
- fixing backslashes, adding basic metrics by @SilenNaihin in #89
- Fix Smol developer and gpt engineer by @merwanehamadi in #93
- Remove dependencies cache by @merwanehamadi in #94
- Remove dependencies if a specific test is asked by the user by @merwanehamadi in #95
- Update submodules and upload artifacts by @merwanehamadi in #97
- Add basic code generation challenge by @merwanehamadi in #98
- Replace hidden files with custom python by @merwanehamadi in #99
- Start showing benchmark results by @merwanehamadi in #100
- Show Auto-GPT results by @merwanehamadi in #102
- Display smol-developer-results by @merwanehamadi in #103
- Display results per category by @merwanehamadi in #104
- Update auto gpt to current version of master by @merwanehamadi in #105
- Update Auto-GPT score by @merwanehamadi in #106
- Clean up workspace between each test by @erik-megarad in #109
- Add three sum challenge by @merwanehamadi in #108
- Fix ci by @merwanehamadi in #110
- Remove cache true on pr by @merwanehamadi in #111
- Dynamic cutoff and other quality of life by @SilenNaihin in #101
- Allow change location of reports by @merwanehamadi in #115
- Fix cutoff errors by @merwanehamadi in #116
- Fix pipes issue by @merwanehamadi in #117
- Update reports when pushing to master by @merwanehamadi in #162
- dynamic home path for runs by @SilenNaihin in #119
- internal_info.json dynamic changes by @SilenNaihin in #163
- file naming when --test by @SilenNaihin in #164
- Use report location by @merwanehamadi in #165
- fixing memory challenges, naming, testing mini-agi, smooth retrieval scaling by @SilenNaihin in #166
- Push reports to google drive by @merwanehamadi in #167
- Integrate Beebot by @merwanehamadi in #169
- Change beebot submodule by @merwanehamadi in #170
- Disable cache by @merwanehamadi in #174
- Kill subprocesses when test ends by @erik-megarad in #172
- Update beebot submodule by @merwanehamadi in #175
- Update submodules by @merwanehamadi in #176
- integrate baby-agi by @SilenNaihin in #168
- Publish pypi package by @merwanehamadi in #179
- Update publish_package.yml by @merwanehamadi in #180
- Make spreadsheet dynamic by @merwanehamadi in #181
- Update Helicone mitm to pin to a specific version by @merwanehamadi in #182
- Update permission package by @merwanehamadi in #183
- Change package version by @merwanehamadi in #184
New Contributors
- @dschonholtz made their first contribution in #1
- @ambujpawar made their first contribution in #2
- @rihp made their first contribution in #20
- @ScarletPan made their first contribution in #34
- @SilenNaihin made their first contribution in #37
- @erik-megarad made their first contribution in #69
Full Changelog: https://github.com/Significant-Gravitas/Auto-GPT-Benchmarks/commits/v0.0.1