ajtejankar / mixtral-vis-moe
Visualize expert firing frequencies across sentences in the Mixtral MoE model
☆17 Updated last year
Alternatives and similar repositories for mixtral-vis-moe
Users who are interested in mixtral-vis-moe are comparing it to the repositories listed below
- Data preparation code for CrystalCoder 7B LLM ☆44 Updated last year
- [ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, … ☆40 Updated last year
- ☆48 Updated 6 months ago
- ☆43 Updated 3 months ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 4 months ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆52 Updated 3 months ago
- ☆72 Updated 2 weeks ago
- A fast, local, and secure approach for training LLMs for coding tasks using GRPO with WebAssembly and interpreter feedback. ☆22 Updated last month
- Nexusflow function call, tool use, and agent benchmarks. ☆19 Updated 5 months ago
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆22 Updated 7 months ago
- ☆54 Updated this week
- Implementation of nougat that focuses on processing pdf locally. ☆81 Updated 3 months ago
- ☆27 Updated last week
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 8 months ago
- ☆56 Updated this week
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆116 Updated 5 months ago
- ☆15 Updated last month
- distill chatGPT coding ability into small model (1b) ☆29 Updated last year
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ☆43 Updated last year
- A toolkit for fine-tuning, inferencing, and evaluating GreenBitAI's LLMs. ☆83 Updated 2 months ago
- ☆33 Updated 10 months ago
- Simple repository for training small reasoning models ☆27 Updated 3 months ago
- ☆46 Updated 9 months ago
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 Updated last year
- Pre-training code for CrystalCoder 7B LLM ☆54 Updated last year
- Experiments on speculative sampling with Llama models ☆126 Updated last year
- EvaByte: Efficient Byte-level Language Models at Scale ☆92 Updated 2 weeks ago
- Training hybrid models for dummies. ☆21 Updated 3 months ago
- Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" ☆54 Updated 5 months ago