SalesforceAIResearch / CRMArena

Official Repo for CRMArena and CRMArena-Pro

☆104

Alternatives and similar repositories for CRMArena

Users who are interested in CRMArena are comparing it to the libraries listed below.

miralab-ai / autoreason
☆40 Updated 7 months ago
microsoft / llm-steer-instruct
A method for steering LLMs to better follow instructions
☆48 Updated 3 weeks ago
salesforce / summary-of-a-haystack
Codebase accompanying the Summary of a Haystack paper.
☆79 Updated 10 months ago
TIGER-AI-Lab / StructLM
Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)
☆75 Updated 9 months ago
weaviate / structured-rag
Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models
☆111 Updated 3 months ago
Columbia-NLP-Lab / PAPILLON
Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles
☆53 Updated 2 months ago
SalesforceAIResearch / SFR-RAG
☆77 Updated 6 months ago
zetaalphavector / RAGElo
RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker
☆114 Updated 3 weeks ago
microsoft / eureka-ml-insights
A framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings.
☆165 Updated last week
facebookresearch / collaborative-reasoner
Source code for the collaborative reasoner research project at Meta FAIR.
☆99 Updated 3 months ago
HishamAlyahya / semantic_backprop
Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖
☆72 Updated 8 months ago
orionw / promptriever
The first dense retrieval model that can be prompted like an LM
☆81 Updated 2 months ago
oriyor / assistantbench
Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"
☆59 Updated 7 months ago
weaviate-tutorials / Hurricane
Writing Blog Posts with Generative Feedback Loops!
☆50 Updated last year
automix-llm / automix
Mixing Language Models with Self-Verification and Meta-Verification
☆105 Updated 7 months ago
apple / ml-superposition-prompting
☆145 Updated last year
zbambergerNLP / strategic-debate-tot
A DSPy-based implementation of the tree of thoughts method (Yao et al., 2023) for generating persuasive arguments
☆87 Updated 10 months ago
rhyang2021 / SELFGOAL
Source code for our paper: "SelfGoal: Your Language Agents Already Know How to Achieve High-level Goals".
☆68 Updated last year
bespokelabsai / verifiers
Verifiers for LLM Reinforcement Learning
☆68 Updated 3 months ago
aymeric-roucher / agent_reasoning_benchmark
🔧 Compare how Agent systems perform on several benchmarks. 📊🚀
☆99 Updated 9 months ago
deshwalmahesh / PHUDGE
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…
☆49 Updated last year
patronus-ai / Lynx-hallucination-detection
☆41 Updated last year
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91 Updated 6 months ago
yale-nlp / SciArena
Analysis code for paper "SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks"
☆45 Updated last month
wang-research-lab / agentinstruct
Code repo for "Agent Instructs Large Language Models to be General Zero-Shot Reasoners"
☆115 Updated 10 months ago
padas-lab-de / ir-rag-sigir24-persona-rag
☆47 Updated 10 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆81 Updated this week
sony / talkhier
Official Repo for The Paper "Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems"
☆56 Updated 5 months ago
Pleias / Pleias-RAG-Library
Python library to use Pleias-RAG models
☆61 Updated 3 months ago
ali-bahrainian / RAG_best_practices
☆93 Updated 4 months ago