A framework for few-shot evaluation of language models.
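As a generic illustration of what few-shot evaluation involves (not this harness's actual API), the sketch below builds a k-shot prompt from labeled examples and scores exact-match accuracy; `complete` is a hypothetical stand-in for whatever model call you use.

```python
# Minimal, generic few-shot evaluation sketch: build a k-shot prompt from
# in-context examples, ask the model to complete the final item, and score
# exact-match accuracy. `complete` is a hypothetical model call, not the
# harness's API.
from typing import Callable, List, Tuple

def build_prompt(shots: List[Tuple[str, str]], query: str) -> str:
    lines = [f"Q: {q}\nA: {a}" for q, a in shots]
    lines.append(f"Q: {query}\nA:")
    return "\n\n".join(lines)

def few_shot_accuracy(
    complete: Callable[[str], str],      # model call: prompt -> completion
    shots: List[Tuple[str, str]],        # k in-context (question, answer) pairs
    eval_set: List[Tuple[str, str]],     # held-out (query, answer) pairs
) -> float:
    correct = 0
    for query, answer in eval_set:
        prediction = complete(build_prompt(shots, query)).strip()
        correct += prediction == answer
    return correct / len(eval_set)
```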
Test your prompts, models, and RAGs. Catch regressions and improve prompt quality. LLM evals for OpenAI, Azure, Anthropic, Gemini, Mistral, Llama, Bedrock, Ollama, and other local & private models with CI/CD integration.
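The sketch below shows the shape of a prompt regression test that such tools automate: run each case through the model and apply simple assertions, failing the CI build on any regression. It is a generic illustration under assumed names (`call_model` is a hypothetical placeholder, not any specific provider's API).

```python
# Generic prompt regression test sketch: each case supplies template
# variables and a substring the output must contain. `call_model` is a
# hypothetical placeholder for the actual provider call.
from typing import Callable, Dict, List

TEST_CASES = [
    {"vars": {"text": "Hello"}, "must_contain": "Bonjour"},
    {"vars": {"text": "Goodbye"}, "must_contain": "Au revoir"},
]

def run_suite(call_model: Callable[[str], str], prompt_template: str,
              cases: List[Dict]) -> List[str]:
    failures = []
    for case in cases:
        output = call_model(prompt_template.format(**case["vars"]))
        if case["must_contain"].lower() not in output.lower():
            failures.append(f"{case['vars']}: expected {case['must_contain']!r}")
    return failures

# In CI, a non-empty failure list would fail the build, e.g.:
#   failures = run_suite(call_model, "Translate to French: {text}", TEST_CASES)
#   assert not failures, "\n".join(failures)
```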
The LLM Evaluation Framework
This repository accompanies our RecSys 2019 article "Are We Really Making Much Progress? A Worrying Analysis of Recent Neural Recommendation Approaches" and several follow-up studies.
LightEval is a lightweight LLM evaluation suite that Hugging Face uses internally, alongside its recently released LLM data processing library datatrove and LLM training library nanotron.
Open-Source Evaluation for GenAI Application Pipelines
A research library for automating experiments on Deep Graph Networks
AI Data Management & Evaluation Platform
Metrics to evaluate the quality of responses of your Retrieval Augmented Generation (RAG) applications.
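To illustrate one such RAG quality signal, here is a toy "faithfulness" proxy that checks how much of a generated answer is supported by the retrieved contexts via token overlap. Real frameworks compute this with LLM judges or NLI models; this sketch only shows the shape of the computation and is not the library's metric.

```python
# Toy faithfulness proxy for RAG answers: the fraction of answer tokens
# that appear somewhere in the retrieved contexts. Illustrative only.
import re

def _tokens(text: str) -> set:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def faithfulness_proxy(answer: str, contexts: list[str]) -> float:
    """Fraction of answer tokens that appear in the contexts."""
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0
    context_tokens = set().union(*(_tokens(c) for c in contexts))
    return len(answer_tokens & context_tokens) / len(answer_tokens)

# A grounded answer scores higher than a hallucinated one:
print(faithfulness_proxy("Paris is the capital of France",
                         ["The capital of France is Paris."]))
```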
Expressive is a cross-platform expression parsing and evaluation framework. It achieves cross-platform support by compiling for .NET Standard, so it runs on practically any platform.
PySODEvalToolkit: A Python-based Evaluation Toolbox for Salient Object Detection and Camouflaged Object Detection
Python SDK for running evaluations on LLM generated responses
Evaluation suite for large-scale language models.
Test and evaluate LLMs, prompts, and other configurations across all the scenarios that matter for your application.
Optical Flow Dataset and Benchmark for Visual Crowd Analysis
Official repository of RankEval: An Evaluation and Analysis Framework for Learning-to-Rank Solutions.
BIRL: Benchmark on Image Registration methods with Landmark validations
LiDAR SLAM comparison and evaluation framework
The implementation of the paper "Evaluating Coherence in Dialogue Systems using Entailment"
This repository lets you evaluate a trained computer vision model, reporting general model information and evaluation metrics with minimal configuration.
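For a sense of the kind of metrics such a tool reports, the sketch below computes overall accuracy plus per-class precision and recall from parallel lists of true and predicted labels. It is a generic illustration and does not reflect this repository's API.

```python
# Generic classification metrics sketch: accuracy plus per-class
# precision/recall from true and predicted label lists.
def classification_report(y_true: list, y_pred: list) -> dict:
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    report = {"accuracy": accuracy, "per_class": {}}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == p == cls for t, p in zip(y_true, y_pred))
        predicted = sum(p == cls for p in y_pred)   # tp + fp
        actual = sum(t == cls for t in y_true)      # tp + fn
        report["per_class"][cls] = {
            "precision": tp / predicted if predicted else 0.0,
            "recall": tp / actual if actual else 0.0,
        }
    return report

print(classification_report(["cat", "dog", "cat"], ["cat", "cat", "cat"]))
```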