ajtejankar / mixtral-vis-moe
Visualize expert firing frequencies across sentences in the Mixtral MoE model
☆18 Updated last year
Alternatives and similar repositories for mixtral-vis-moe
Users who are interested in mixtral-vis-moe are comparing it to the libraries listed below.
- Data preparation code for CrystalCoder 7B LLM ☆45 Updated last year
- Storing long contexts in tiny caches with self-study ☆67 Updated last week
- Lightweight toolkit package to train and fine-tune 1.58bit Language models ☆80 Updated last month
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆36 Updated last year
- ☆41 Updated 2 weeks ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ☆57 Updated 9 months ago
- Verifiers for LLM Reinforcement Learning ☆60 Updated 2 months ago
- ☆51 Updated 7 months ago
- Nexusflow function call, tool use, and agent benchmarks. ☆20 Updated 6 months ago
- ☆48 Updated 11 months ago
- ☆47 Updated 4 months ago
- A repository for research on medium sized language models. ☆76 Updated last year
- Self-host LLMs with LMDeploy and BentoML ☆20 Updated 2 weeks ago
- ☆15 Updated 2 months ago
- [ICLR2025] Breaking Throughput-Latency Trade-off for Long Sequences with Speculative Decoding ☆116 Updated 6 months ago
- GoldFinch and other hybrid transformer components ☆45 Updated 11 months ago
- Public reports detailing responses to sets of prompts by Large Language Models. ☆30 Updated 5 months ago
- ☆35 Updated last year
- Implementation of the paper: "Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention" from Google in pyTO… ☆55 Updated this week
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆65 Updated last year
- ☆15 Updated 2 months ago
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 9 months ago
- ☆63 Updated 9 months ago
- ☆126 Updated last year
- Data preparation code for Amber 7B LLM ☆91 Updated last year
- Simple implementation of Speculative Sampling in NumPy for GPT-2. ☆95 Updated last year
- The official repo for "LLoCo: Learning Long Contexts Offline" ☆117 Updated last year
- Repository for CPU Kernel Generation for LLM Inference ☆26 Updated last year
- ☆51 Updated 7 months ago