☆12 · Feb 16, 2024 · Updated 2 years ago
Alternatives and similar repositories for ExpertTokenRouting
Users that are interested in ExpertTokenRouting are comparing it to the libraries listed below.
- ☆22 · Dec 11, 2024 · Updated last year
- The implementation of “Fine-tuning Graph Neural Networks by Preserving Graph Generative Patterns” ☆18 · Jun 18, 2024 · Updated last year
- Implementation of the ICLR'25 paper “Multi-Label Node Classification with Label Influence Propagation”. ☆17 · Feb 28, 2025 · Updated last year
- LLMRouterBench: A Massive Benchmark and Unified Framework for LLM Routing ☆44 · Jan 30, 2026 · Updated last month
- Code of GraphAdapter ☆17 · Mar 21, 2024 · Updated 2 years ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor ☆31 · Apr 8, 2024 · Updated last year
- Website for HKU NLP group (under construction) ☆14 · Updated this week
- The implementation of "Beyond Homophily: Structure-aware Path Aggregation Graph Neural Network". ☆22 · Apr 17, 2023 · Updated 2 years ago
- [ICML 2024] Self-Infilling Code Generation ☆18 · May 5, 2024 · Updated last year
- [EMNLP 2022] RLET: A Reinforcement Learning Based Approach for Explainable QA with Entailment Trees ☆11 · Jul 15, 2023 · Updated 2 years ago
- ☆19 · May 2, 2024 · Updated last year
- AlphaGo Zero Clone ☆17 · Mar 15, 2020 · Updated 6 years ago
- ☆15 · Jul 9, 2025 · Updated 8 months ago
- Official codebase for “In-Context Learning with Many Demonstration Examples” ☆16 · Feb 13, 2023 · Updated 3 years ago
- Official code for the paper CipherDAug: Ciphertext based Data Augmentation for Neural Machine Translation published at ACL 2022 main conf… ☆12 · Apr 6, 2023 · Updated 2 years ago
- ☆21 · Mar 18, 2026 · Updated last week
- ☆21 · Oct 30, 2023 · Updated 2 years ago
- Official implementation for 'Division-of-Thoughts: Harnessing Hybrid Language Model Synergy for Efficient LLM Reasoning' ☆26 · Feb 18, 2025 · Updated last year
- ☆12 · Jan 31, 2024 · Updated 2 years ago
- Code for "SCL-RAI: Span-based Contrastive Learning with Retrieval Augmented Inference for Unlabeled Entity Problem in NER" @COLING-2022 ☆11 · Aug 20, 2022 · Updated 3 years ago
- Dataset and baseline for Coling 2022 long paper (oral): "ConFiguRe: Exploring Discourse-level Chinese Figures of Speech" ☆13 · Jul 27, 2023 · Updated 2 years ago
- Code for ACL22 short Paper "Hierarchical Curriculum Learning for AMR Parsing" ☆13 · Jun 1, 2022 · Updated 3 years ago
- (ICLR25 Oral) Do as We Do, Not as You Think: the Conformity of Large Language Models ☆43 · Feb 6, 2026 · Updated last month
- my commonly-used tools ☆64 · Jan 7, 2025 · Updated last year
- Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation, ICML 2024 ☆22 · Jun 26, 2024 · Updated last year
- ☆18 · Mar 3, 2025 · Updated last year
- Rationale-enhanced language models are better continual relation learners (EMNLP 2023 Main Conference) ☆12 · Oct 11, 2023 · Updated 2 years ago
- ☆145 · May 2, 2024 · Updated last year
- Code for ACL 2025 Main paper "Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning… ☆48 · Aug 4, 2025 · Updated 7 months ago
- ☆71 · Oct 23, 2025 · Updated 5 months ago
- RLA is a tool for managing your RL experiments automatically ☆32 · Jan 11, 2025 · Updated last year
- Code for the arXiv preprint "MoT: Pre-thinking and Recalling Enable ChatGPT to Self-Improve with Memory-of-Thoughts" ☆24 · Nov 29, 2023 · Updated 2 years ago
- The official code for our EMNLP 2022 long paper [Breaking the Representation Bottleneck of Chinese Characters: Neural Machine Translation… ☆26 · Sep 10, 2025 · Updated 6 months ago
- WIKIGENBENCH: Exploring Full-length Wikipedia Generation under Real-World Scenario (COLING 2025) ☆12 · Jan 5, 2025 · Updated last year
- Modified Beam Search with periodical restart ☆12 · Sep 12, 2024 · Updated last year
- Code for Sam-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning ☆16 · Apr 5, 2024 · Updated last year
- simpleR1: A Simple Framework for Training R1-like Models ☆30 · Aug 12, 2025 · Updated 7 months ago
- [CVPR 2024 Accepted] TaskWeave: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection ☆30 · Sep 26, 2024 · Updated last year
- [ICLR 2025] "GraphRouter: A Graph-based Router for LLM Selections", Tao Feng, Yanzhen Shen, Jiaxuan You ☆62 · Dec 30, 2025 · Updated 2 months ago