☆52Feb 17, 2025Updated last year
Alternatives and similar repositories for InfiniRetri
Users that are interested in InfiniRetri are comparing it to the libraries listed below
- A simple and minimal open source implementation of "Introducing LFM2: The Fastest On-Device Foundation Models on the Market" from Liquid …☆23Feb 9, 2026Updated 3 weeks ago
- A few models converted from Caffe to CoreML format.☆15Jun 6, 2017Updated 8 years ago
- GPT-jax based on the official huggingface library☆13Jun 22, 2021Updated 4 years ago
- Code for the paper "Cottention: Linear Transformers With Cosine Attention"☆20Nov 15, 2025Updated 3 months ago
- ROSA+: RWKV's ROSA implementation with fallback statistical predictor☆34Oct 13, 2025Updated 4 months ago
- MEXMA: Token-level objectives improve sentence representations☆43Jan 6, 2025Updated last year
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated 10 months ago
- The source code for running LLMs on the AAAR-1.0 benchmark.☆18Apr 5, 2025Updated 11 months ago
- Modifying Large Language Models Post-training for Diverse Creative Writing☆52May 12, 2025Updated 9 months ago
- Code for the EMNLP24 paper "A simple and effective L2 norm based method for KV Cache compression."☆18Dec 13, 2024Updated last year
- ☆19Oct 2, 2023Updated 2 years ago
- Efficient PScan implementation in PyTorch☆17Jan 2, 2024Updated 2 years ago
- MFAQ: a Multilingual FAQ Dataset☆18Sep 17, 2023Updated 2 years ago
- Nexusflow function call, tool use, and agent benchmarks.☆30Dec 13, 2024Updated last year
- Official Implementation for NorMuon paper☆57Feb 9, 2026Updated 3 weeks ago
- This project is established for real-time training of the RWKV model.☆50May 17, 2024Updated last year
- A hackable, simple, and research-friendly GRPO Training Framework with high-speed weight synchronization in a multinode environment.☆37Aug 27, 2025Updated 6 months ago
- A user-friendly Command & Control (C&C) web platform for remote monitoring, management, and task automation across multiple devices.☆14Dec 15, 2024Updated last year
- Language detection model written in C++☆11Sep 22, 2021Updated 4 years ago
- NanoGPT-speedrunning for the poor T4 enjoyers☆73Apr 22, 2025Updated 10 months ago
- ☆50Sep 8, 2025Updated 5 months ago
- [EMNLP'24] LongHeads: Multi-Head Attention is Secretly a Long Context Processor☆31Apr 8, 2024Updated last year
- ☆31Jan 23, 2026Updated last month
- ☆29Jan 23, 2024Updated 2 years ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations"☆92Oct 30, 2024Updated last year
- ☆41Apr 30, 2025Updated 10 months ago
- ☆10Feb 2, 2021Updated 5 years ago
- The official implementation of the paper "Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models".☆87Mar 25, 2025Updated 11 months ago
- Repository of IPBench☆19Jan 4, 2026Updated 2 months ago
- Python SDK for Permit.io: Plug & Play Application Level Authorization☆16Sep 25, 2025Updated 5 months ago
- ☆91Aug 18, 2024Updated last year
- ☆52Mar 18, 2025Updated 11 months ago
- ☆34Mar 12, 2025Updated 11 months ago
- "Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?"☆38Nov 13, 2024Updated last year
- ☆14Jun 10, 2025Updated 8 months ago
- GBM implementation on Legate☆14Jan 28, 2026Updated last month
- ☆11Jul 17, 2023Updated 2 years ago
- ☆10Jul 13, 2024Updated last year
- ☆10Oct 9, 2025Updated 4 months ago