microsoft / RedStone
The RedStone repository includes code for preparing extensive datasets used in training large language models.
☆113 · Updated last month
Alternatives and similar repositories for RedStone:
Users interested in RedStone are comparing it to the repositories listed below.
- Official repo for "Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale" ☆226 · Updated 3 weeks ago
- Reformatted Alignment ☆114 · Updated 5 months ago
- ☆262 · Updated 7 months ago
- A visualization tool for deeper understanding and easier debugging of RLHF training ☆164 · Updated 3 weeks ago
- Skywork-MoE: A Deep Dive into Training Techniques for Mixture-of-Experts Language Models ☆128 · Updated 9 months ago
- ☆101 · Updated 3 months ago
- [ICLR 2025] The official implementation of the paper "ToolGen: Unified Tool Retrieval and Calling via Generation" ☆130 · Updated 2 weeks ago
- Code for "Scaling Laws of RoPE-based Extrapolation" ☆70 · Updated last year
- Implementation of the paper "LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens" ☆127 · Updated 7 months ago
- ☆309 · Updated 5 months ago
- ☆180 · Updated 3 weeks ago
- [ACL 2024 Findings] MathBench: A Comprehensive Multi-Level Difficulty Mathematics Evaluation Dataset ☆96 · Updated 8 months ago
- ☆28 · Updated 6 months ago
- Mixture-of-Experts (MoE) Language Model ☆185 · Updated 6 months ago
- ☆141 · Updated 8 months ago
- [ACL 2024 Demo] Official GitHub repo for UltraEval: An open-source framework for evaluating foundation models ☆233 · Updated 4 months ago
- ☆81 · Updated 10 months ago
- ☆91 · Updated 2 months ago
- FuseAI Project ☆83 · Updated last month
- Code for the paper "∞Bench: Extending Long Context Evaluation Beyond 100K Tokens" (https://arxiv.org/abs/2402.13718) ☆312 · Updated 5 months ago
- ☆131 · Updated last month
- Code implementation of synthetic continued pretraining ☆93 · Updated 2 months ago