Added code for human-eval benchmark
Showing 7 changed files with 256 additions and 0 deletions
- benchmarks/human-eval/README.md: 28 additions, 0 deletions
- benchmarks/human-eval/core/__init__.py: 2 additions, 0 deletions
- benchmarks/human-eval/core/evaluation.py: 67 additions, 0 deletions
- benchmarks/human-eval/core/prompts.py: 14 additions, 0 deletions
- benchmarks/human-eval/eval_llama.py: 66 additions, 0 deletions
- benchmarks/human-eval/eval_phi.py: 72 additions, 0 deletions
- benchmarks/human-eval/requirements.txt: 7 additions, 0 deletions