yuanzhoulvpi2017 / mamba4transformers

☆13

Alternatives and similar repositories for mamba4transformers

Users who are interested in mamba4transformers compare it to the libraries listed below.

hkgc-1 / GHPO
☆38 Updated last month
Wangmerlyn / MCTS-GSM8k-Demo
A repo showcasing the use of MCTS with LLMs to solve GSM8K problems
☆87 Updated 5 months ago
USTC-StarTeam / ZIP
☆24 Updated last year
Dereck0602 / Awesome_Test_Time_LLMs
☆120 Updated 5 months ago
waltonfuture / Diff-eRank
[NeurIPS 2024] A Novel Rank-Based Metric for Evaluating Large Language Models
☆52 Updated 3 months ago
Open-Source-O1 / o1_Reasoning_Patterns_Study
☆103 Updated 8 months ago
kyegomez / Mixture-of-Depths
Implementation of the paper: "Mixture-of-Depths: Dynamically allocating compute in transformer-based language models"
☆104 Updated last week
DIRECT-BIT / SRA-MCTS
☆33 Updated 2 months ago
THUDM / T1
RL Scaling and Test-Time Scaling (ICML'25)
☆112 Updated 7 months ago
zitian-gao / one-shot-em
One-shot Entropy Minimization
☆180 Updated 2 months ago
GraphPKU / Case_or_Rule
exploring whether LLMs perform case-based or rule-based reasoning
☆30 Updated last year
thu-coai / SPaR
☆46 Updated 2 months ago
starrYYxuan / LeCo
This is the implementation of LeCo
☆31 Updated 7 months ago
AwesomeSeq / Comba-triton
☆48 Updated 2 months ago
abdelfattah-lab / SplitReason
☆18 Updated 2 months ago
chuanyang-Zheng / DAPE
This is the official implementation of "DAPE: Data-Adaptive Positional Encoding for Length Extrapolation"
☆39 Updated 10 months ago
SkyworkAI / skywork-o1-prm-inference
☆65 Updated 9 months ago
ZhenweiAn / Dynamic_MoE
Inference Code for Paper "Harder Tasks Need More Experts: Dynamic Routing in MoE Models"
☆62 Updated last year
RyanLiu112 / GenPRM
Official codebase for "GenPRM: Scaling Test-Time Compute of Process Reward Models via Generative Reasoning".
☆81 Updated 3 months ago
liutianlin0121 / decoding-time-realignment
Implementation of "Decoding-time Realignment of Language Models", ICML 2024.
☆19 Updated last year
YeFD / RRAG
The official Github repository for paper "R^2AG: Incorporating Retrieval Information into Retrieval Augmented Generation" (EMNLP 2024 Fin…
☆35 Updated 8 months ago
yayayacc / MUR
☆45 Updated last month
wenyudu / MIGU
[EMNLP 2024 Findings] Unlocking Continual Learning Abilities in Language Models
☆25 Updated 10 months ago
YuchuanTian / RethinkTinyLM
[ICML'24] The official implementation of “Rethinking Optimization and Architecture for Tiny Language Models”
☆123 Updated 7 months ago
RUCAIBox / EASYEP
☆20 Updated 4 months ago
zhaochenyang20 / Prompt2Model-Self-Guide
SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper
☆33 Updated last year
Zanette-Labs / SpeculativeRejection
[NeurIPS 2024] Fast Best-of-N Decoding via Speculative Rejection
☆51 Updated 10 months ago
GCYZSL / MoLA
☆151 Updated last year
codefuse-ai / Collinear-Constrained-Attention
☆63 Updated last year
TsinghuaC3I / Fourier-Position-Embedding
[ICML 2025] Fourier Position Embedding: Enhancing Attention’s Periodic Extension for Length Generalization
☆90 Updated 3 months ago