Pretrain, decay, and SFT a CodeLLM from scratch 🧙‍♂️
☆40May 15, 2024Updated last year
Alternatives and similar repositories for WizardLearner
Users that are interested in WizardLearner are comparing it to the libraries listed below.
- Template for ACM-ICPC☆14Feb 18, 2022Updated 4 years ago
- EVE: Efficient Vision-Language Pre-training with Masked Prediction and Modality-Aware MoE☆10Mar 1, 2024Updated 2 years ago
- Explore what LLMs are really learning over SFT☆28Mar 30, 2024Updated last year
- ☆18Apr 10, 2025Updated 11 months ago
- EMNLP 2025 | RouterLens☆29Sep 15, 2025Updated 6 months ago
- Exploring whether LLMs perform case-based or rule-based reasoning☆30Mar 2, 2024Updated 2 years ago
- [ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)☆186Feb 17, 2025Updated last year
- Fully open reproduction of DeepSeek-R1☆11Mar 24, 2025Updated last year
- Typst template for Shanghai University undergraduate theses☆92Dec 5, 2025Updated 3 months ago
- TOD-Flow: Modeling the Structure of Task-Oriented Dialogues☆13Feb 7, 2024Updated 2 years ago
- Contains the model patches and the eval logs from the passing swe-bench-lite run.☆10Jun 28, 2024Updated last year
- A collection of some awesome public projects about LLM-based Web Agents and Tools.☆12Apr 25, 2024Updated last year
- Official Implementation for the paper "VisCodex: Unified Multimodal Code Generation via Merging Vision and Coding Models"☆22Aug 14, 2025Updated 7 months ago
- Curation of resources for LLM mathematical reasoning, most of which are screened by @tongyx361 to ensure high quality and accompanied wit…☆152Jul 12, 2024Updated last year
- PyTorch implementation of TinyWASE described in our paper "Compressing Speaker Extraction Model with Ultra-low Precision Quantization and…☆11Jun 28, 2021Updated 4 years ago
- Implement your own small language model☆11Jun 15, 2024Updated last year
- ☆41Jun 19, 2024Updated last year
- Travel time prediction from GPS observations using an HMM☆11Jan 4, 2023Updated 3 years ago
- ☆35Sep 14, 2024Updated last year
- ☆51Mar 9, 2026Updated 2 weeks ago
- [EMNLP 2024] FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents☆22Jan 6, 2025Updated last year
- ☆10Jul 11, 2022Updated 3 years ago
- Codebase for Paper Reusing Embeddings: Reproducible Reward Model Research in Large Language Model Alignment without GPUs☆22Apr 24, 2025Updated 11 months ago
- Panning for gold in 💩☆44Updated this week
- Open Source WizardCoder Dataset☆166Jul 12, 2023Updated 2 years ago
- A spoken version of the textual story cloze benchmark☆21Aug 6, 2023Updated 2 years ago
- ☆19Jun 14, 2024Updated last year
- Fine-Tune LLM Synthetic-Data application and "From Data to AGI: Unlocking the Secrets of Large Language Model"☆16Jul 5, 2024Updated last year
- A moment is eternity☆13Feb 26, 2020Updated 6 years ago
- A Benchmark and Evaluation Suite for Zero-shot Singing Voice Synthesis☆24Feb 11, 2026Updated last month
- ☆25Nov 19, 2025Updated 4 months ago
- Synthetic data generation for evaluating LLM symbolic and logic reasoning☆22Mar 6, 2026Updated 2 weeks ago
- ☆13Sep 12, 2024Updated last year
- Reproducing several bandwidth-based traffic signal coordination models (including MaxBand, MultiBand, etc.)☆12Sep 18, 2020Updated 5 years ago
- ☆29Aug 30, 2024Updated last year
- ☆17Jul 10, 2023Updated 2 years ago
- Tensorflow implementation of DeepMind's Tacotron-2 (without wavenet)☆11Jul 12, 2019Updated 6 years ago
- ☆26Feb 13, 2026Updated last month
- Win10 intelligent customer-service Q&A system based on text similarity☆16Mar 12, 2020Updated 6 years ago