expts: assess performance of structured outputs (#291)
* expts: assess structured outputs

* expts: assess structured outputs

* expts: set temp 0

* chore: clean up code
shreyashankar authored Jan 29, 2025
1 parent 3ecb385 commit b0ded0e
Showing 3 changed files with 393 additions and 1 deletion.
5 changes: 4 additions & 1 deletion .gitignore
@@ -53,4 +53,7 @@ website/*.tsbuildinfo
website/next-env.d.ts

# Docker
-.docker/
+.docker/
+
+# experiments
+experiments/*.json
14 changes: 14 additions & 0 deletions experiments/outputs.txt
@@ -0,0 +1,14 @@
Results Table:
Experiment Results
╭────────────────────────────────────────────────┬───────┬────────────┬───────────┬────────┬───────┬─────────────┬──────────────╮
│ Model                                          │ Doc % │ Approach   │ Precision │ Recall │ F1    │ Avg Runtime │ Avg Cost ($) │
├────────────────────────────────────────────────┼───────┼────────────┼───────────┼────────┼───────┼─────────────┼──────────────┤
│ azure/gpt-4o-mini                              │ 10%   │ structured │ 0.869     │ 0.872  │ 0.853 │ 1.100s      │ $0.0004      │
│ azure/gpt-4o-mini                              │ 10%   │ tool       │ 0.914     │ 0.906  │ 0.891 │ 0.722s      │ $0.0004      │
├────────────────────────────────────────────────┼───────┼────────────┼───────────┼────────┼───────┼─────────────┼──────────────┤
│ deepseek/deepseek-chat                         │ 10%   │ structured │ 0.878     │ 0.889  │ 0.877 │ 2.094s      │ $0.0003      │
│ deepseek/deepseek-chat                         │ 10%   │ tool       │ 0.867     │ 0.856  │ 0.860 │ 2.212s      │ $0.0003      │
├────────────────────────────────────────────────┼───────┼────────────┼───────────┼────────┼───────┼─────────────┼──────────────┤
│ lm_studio/hugging-quants/llama-3.2-3b-instruct │ 10%   │ structured │ 0.033     │ 0.022  │ 0.027 │ 33.635s     │ $0.0000      │
│ lm_studio/hugging-quants/llama-3.2-3b-instruct │ 10%   │ tool       │ 0.000     │ 0.000  │ 0.000 │ 70.858s     │ $0.0000      │
╰────────────────────────────────────────────────┴───────┴────────────┴───────────┴────────┴───────┴─────────────┴──────────────╯
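
Note on reading the table: in several rows the F1 value is lower than both Precision and Recall (for example 0.853 against 0.869 and 0.872). A harmonic mean always lies between its two inputs, so F1 here cannot have been computed from the averaged Precision and Recall; one plausible explanation is that each metric is computed per document and then averaged. The experiment script is not shown in this view, so the sketch below is only an illustration under that assumption. The names `prf`, `macro_average`, and the toy `docs` data are hypothetical and not taken from the commit.

```python
# Hypothetical sketch: per-document precision/recall/F1, then averaged.
# Assumption: each document yields a set of predicted items and a set of
# expected items. This is NOT the commit's actual evaluation code.

def prf(predicted: set, expected: set) -> tuple[float, float, float]:
    """Precision, recall, and F1 for a single document."""
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def macro_average(pairs):
    """Average each metric independently across documents."""
    scores = [prf(predicted, expected) for predicted, expected in pairs]
    n = len(scores)
    return tuple(sum(score[i] for score in scores) / n for i in range(3))


# Toy data: one document over-predicts, the other under-predicts.
docs = [
    ({"a", "b"}, {"a"}),
    ({"c"}, {"c", "d"}),
]
avg_p, avg_r, avg_f1 = macro_average(docs)
print(f"precision={avg_p:.3f} recall={avg_r:.3f} f1={avg_f1:.3f}")
```

With this toy input the averaged F1 falls below both the averaged precision and the averaged recall, which is the same pattern seen in the table.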