lechmazur / generalization
Thematic Generalization Benchmark: measures how effectively various LLMs can infer a narrow or specific "theme" (category/rule) from a small set of examples and anti-examples, then detect which item truly fits that theme among a collection of misleading candidates.
☆41 Updated last week
Alternatives and similar repositories for generalization:
Users who are interested in generalization are comparing it to the libraries listed below.
- Glyphs, acting as collaboratively defined symbols linking related concepts, add a layer of multidimensional semantic richness to user-AI … ☆49 Updated last month
- ☆24 Updated 2 months ago
- Adding a multi-text multi-speaker script (diffe) that is based on a script from asiff00 on issue 61 for Sesame: A Conversational Speech G… ☆21 Updated this week
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆46 Updated 5 months ago
- LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth… ☆32 Updated last week
- An auto-sleeping and -waking framework around llama.cpp ☆11 Updated last month
- Create text chunks which end at natural stopping points without using a tokenizer ☆26 Updated 2 weeks ago
- Conduct in-depth research with AI-driven insights: DeepDive is a command-line tool that leverages web searches and AI models to generate… ☆39 Updated 7 months ago
- Attend - to what matters. ☆14 Updated last month
- Open source tool for transcription and subtitling, alternative to happyscribe. ☆25 Updated last month
- Benchmark evaluating LLMs on their ability to create and resist disinformation. Includes comprehensive testing across major models (Claud… ☆24 Updated last week
- ☆45 Updated 2 weeks ago
- ☆29 Updated 3 months ago
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs ☆71 Updated 6 months ago
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM… ☆42 Updated last week
- Generate a wiki for your research topic, sourcing from the web and your docs. ☆44 Updated 3 weeks ago
- The heart of The Pulsar App, fast, secure and shared inference with modern UI ☆56 Updated 3 months ago
- Groq-powered MAD: The first work to explore Multi-Agent Debate with Large Language Models :D ☆11 Updated 8 months ago
- SPLAA is an AI assistant framework that utilizes voice recognition, text-to-speech, and tool-calling capabilities to provide a conversati… ☆25 Updated 2 months ago
- ☆15 Updated last week
- ☆17 Updated 3 months ago
- One Line To Build Zero-Data Classifiers in Minutes ☆36 Updated 6 months ago
- Distributed Inference for MLX LLMs ☆87 Updated 7 months ago
- Use smol agents to do research and then update CSV columns with its findings. ☆37 Updated last month
- My version of an LLM Websearch Agent using a local SearXNG server because SearXNG is great. ☆28 Updated 3 weeks ago
- GPT-4 Level Conversational QA Trained In a Few Hours ☆59 Updated 7 months ago
- AI conflict resolution framework designed to work alongside existing AI orchestration tools ☆23 Updated 3 months ago
- A bot that checks your grammar and phrasing using the LLM of your choice ☆29 Updated last month
- Serving LLMs in the HF-Transformers format via a PyFlask API ☆71 Updated 6 months ago
- A command-line utility to manage MLX models between your Hugging Face cache and LM Studio. ☆32 Updated last month