paradite / eval-data
Evaluation data for LLMs and prompts on real-world coding and writing tasks.
☆12 Updated this week
Alternatives and similar repositories for eval-data
Users that are interested in eval-data are comparing it to the libraries listed below
- The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models ☆22 Updated 6 months ago
- ☆9 Updated last month
- Seamless Voice Interactions with LLMs ☆12 Updated last year
- ☆17 Updated last year
- ☆26 Updated last year
- The original BabyAGI, updated with LiteLLM and no vector database reliance (csv instead) ☆21 Updated 8 months ago
- Radiantloom Email Assist 7B is an email-assistant large language model fine-tuned from Zephyr-7B-Beta, over a custom-curated dataset of 1… ☆14 Updated last year
- [WIP] AI Try-On plugin for Chrome ☆27 Updated last year
- Project code for training LLMs to write better unit tests + code ☆20 Updated 2 weeks ago
- Apps that run on modal.com ☆12 Updated last year
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 Updated 4 months ago
- Chrome Extension for YouTube. Acts as an assistant for the YouTube video you are watching ☆23 Updated 2 years ago
- ☆41 Updated last year
- A seamless matchmaking application that is programmed with Cohere Command R+, Stanford NLP DSPy framework, Weaviate Vector store and Crew… ☆59 Updated last year
- ☆13 Updated last year
- Writing Blog Posts with Generative Feedback Loops! ☆48 Updated last year
- a version of baby agi using dspy and typed predictors ☆17 Updated last year
- Leveraging DSPy for AI-driven task understanding and solution generation, the Self-Discover Framework automates problem-solving through r… ☆60 Updated 10 months ago
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- Lego for GRPO ☆28 Updated last week
- Testing paligemma2 finetuning on reasoning dataset ☆18 Updated 5 months ago
- Using modal.com to process FineWeb-edu data ☆20 Updated last month
- Comparing retrieval abilities from GPT4-Turbo and a RAG system on a toy example for various context lengths ☆35 Updated last year
- ☆11 Updated 3 months ago
- entropix style sampling + GUI ☆26 Updated 7 months ago
- Example implementation of Iteration of Thought - Give a star if you like the project ☆41 Updated 5 months ago
- A clone of OpenAI's Tokenizer page for HuggingFace Models ☆45 Updated last year
- ☆1 Updated 10 months ago
- BH hackathon ☆13 Updated last year
- ☆31 Updated last year