autogenai / easy-problems-that-llms-get-wrongLinks

☆46

Alternatives and similar repositories for easy-problems-that-llms-get-wrong

Users that are interested in easy-problems-that-llms-get-wrong are comparing it to the libraries listed below

Sorting:

ComposioHQ / Composio-Function-Calling-Benchmark
Function Calling Benchmark & Testing
☆87Updated last year
writer / writing-in-the-margins
☆118Updated 10 months ago
agokrani / distillKitPlus
Easy to use, High Performant Knowledge Distillation for LLMs
☆88Updated 2 months ago
argilla-io / argilla-cookbook
Simple examples using Argilla tools to build AI
☆53Updated 8 months ago
QuixiAI / kraken
☆66Updated last year
OpenPipe / deductive-reasoning
Train your own SOTA deductive reasoning model
☆99Updated 4 months ago
Xalp / ECHO
Official homepage for "Self-Harmonized Chain of Thought" (NAACL 2025)
☆91Updated 5 months ago
teknium1 / ShareGPT-Builder
☆115Updated 7 months ago
teknium1 / LLM-Benchmark-Logs
Just a bunch of benchmark logs for different LLMs
☆119Updated 11 months ago
l4b4r4b4b4 / AIDocks
LLM-Training-API: Including Embeddings & ReRankers, mergekit, LaserRMT
☆27Updated last year
casper-hansen / OpenCoconut
OpenCoconut implements a latent reasoning paradigm where we generate thoughts before decoding.
☆173Updated 6 months ago
zhudotexe / redel
ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)
☆82Updated 4 months ago
facebookresearch / matrix
Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera…
☆73Updated 2 weeks ago
letta-ai / sleep-time-compute
accompanying material for sleep-time compute paper
☆97Updated 2 months ago
nyunAI / PruneGPT
☆52Updated last year
SinatrasC / entropix
Entropy Based Sampling and Parallel CoT Decoding
☆17Updated 9 months ago
facebookresearch / collaborative-reasoner
Source code for the collaborative reasoner research project at Meta FAIR.
☆95Updated 3 months ago
thomasgauthier / LoRD
Low-Rank adapter extraction for fine-tuned transformers models
☆173Updated last year
kerekovskik / autologic
autologic is a Python package that implements the SELF-DISCOVER framework proposed in the paper SELF-DISCOVER: Large Language Models Self…
☆60Updated last year
Cerebras / DocChat
GPT-4 Level Conversational QA Trained In a Few Hours
☆63Updated 10 months ago
arcee-ai / DAM
☆52Updated 8 months ago
sujitpal / llm-rag-eval
Large Language Model (LLM) powered evaluator for Retrieval Augmented Generation (RAG) pipelines.
☆29Updated last year
louisbrulenaudet / ragoon
High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡
☆66Updated 8 months ago
lamini-ai / Lamini-Memory-Tuning
Banishing LLM Hallucinations Requires Rethinking Generalization
☆276Updated last year
cognitivecomputations / grokadamw
☆134Updated 10 months ago
cognitivecomputations / spectrum
☆128Updated 3 months ago
facebookresearch / ExploreToM
Code for ExploreTom
☆84Updated 3 weeks ago
arcee-ai / EvolKit
EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M…
☆229Updated 8 months ago
cognitivecomputations / dolphin-logger
☆101Updated last month
kubernetes-bad / reward-composer
Lego for GRPO
☆28Updated last month