matthewrenze / jhu-concise-cot
The Benefits of a Concise Chain of Thought on Problem Solving in Large Language Models
☆21 Updated 4 months ago
Alternatives and similar repositories for jhu-concise-cot:
Users who are interested in jhu-concise-cot are comparing it to the repositories listed below
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆39 Updated 2 months ago
- Using multiple LLMs for ensemble Forecasting ☆16 Updated last year
- ☆48 Updated 5 months ago
- High-level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ☆67 Updated 5 months ago
- BH hackathon ☆14 Updated last year
- Modified Beam Search with periodic restart ☆12 Updated 7 months ago
- entropix style sampling + GUI ☆25 Updated 5 months ago
- ☆20 Updated last year
- ☆40 Updated 2 months ago
- Using modal.com to process FineWeb-edu data ☆20 Updated last week
- A public implementation of the ReLoRA pretraining method, built on Lightning-AI's PyTorch Lightning suite. ☆33 Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts ☆24 Updated last year
- ☆45 Updated 6 months ago
- ☆16 Updated last month
- [WIP] Transformer to embed Danbooru labelsets ☆13 Updated last year
- ☆47 Updated 2 months ago
- OpenMindedChatbot is a Proof Of Concept that leverages the power of Open source Large Language Models (LLM) with Function Calling capabil… ☆29 Updated last year
- ☆20 Updated last year
- Lego for GRPO ☆26 Updated last week
- Writing Blog Posts with Generative Feedback Loops! ☆47 Updated last year
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆55 Updated 7 months ago
- Repo hosting code and materials related to speeding up LLM inference using token merging. ☆35 Updated 11 months ago
- ☆33 Updated 9 months ago
- The first dense retrieval model that can be prompted like an LM ☆68 Updated 6 months ago
- ☆16 Updated 10 months ago
- ☆13 Updated 3 months ago
- Simple GRPO scripts and configurations. ☆58 Updated 2 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 9 months ago
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated 11 months ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆21 Updated last week