jdeschena / sdtt
[ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models
☆16Updated 3 weeks ago
Alternatives and similar repositories for sdtt:
Users who are interested in sdtt are comparing it to the libraries listed below
- CycleQD is a framework for parameter space model merging.☆31Updated 3 weeks ago
- Japanese LLaMa experiment☆52Updated 2 months ago
- ☆22Updated last year
- ☆26Updated 9 months ago
- Mamba training library developed by kotoba technologies☆67Updated last year
- ☆25Updated 3 months ago
- Unofficial entropix implementation for Gemma2, Llama, Qwen2, and Mistral☆17Updated last month
- Checkpointable dataset utilities for foundation model training☆32Updated last year
- ☆47Updated 2 months ago
- Code for Discovering Preference Optimization Algorithms with and for Large Language Models☆55Updated 8 months ago
- ☆15Updated 11 months ago
- ☆15Updated 5 months ago
- A script that uses GPT-4 to automatically evaluate language model responses☆16Updated 8 months ago
- ☆50Updated last year
- Preferred Generation Benchmark☆72Updated this week
- ☆14Updated 10 months ago
- 0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" i…☆96Updated 11 months ago
- This project uses llama.cpp as an LLM server to perform inference and generate speech using Synthetic voice library☆22Updated 11 months ago
- ☆22Updated last year
- A Slack Bot for summarizing arXiv papers, powered by OpenAI LLMs.☆69Updated last year
- ☆25Updated 2 years ago
- Software for easily chatting with a local LLM from a web browser.☆27Updated last year
- ☆14Updated 5 months ago
- A command-line tool that uses Gemini API to generate summaries of academic papers.☆42Updated this week
- Mixtral-based Ja-En (En-Ja) Translation model☆18Updated last month
- ☆12Updated 7 months ago
- Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch☆20Updated 11 months ago
- [2024 edition] Text classification with BERT☆29Updated 7 months ago