Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism"
☆14 Apr 30, 2025 Updated 10 months ago
Alternatives and similar repositories for Gather-and-Aggregate
Users who are interested in Gather-and-Aggregate are comparing it to the repositories listed below
- Official implementation of Phi-Mamba. A MOHAWK-distilled model (Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Mode… ☆121 Sep 13, 2024 Updated last year
- ☆22 Sep 16, 2025 Updated 5 months ago
- ☆15 Mar 2, 2025 Updated last year
- The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling ☆42 Dec 29, 2025 Updated 2 months ago
- Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation" ☆14 May 26, 2025 Updated 9 months ago
- Voice agent using LiveKit (orchestration), Cartesia (TTS), OpenAI (LLM), and Deepgram (STT) ☆20 Oct 28, 2025 Updated 4 months ago
- H-Net Dynamic Hierarchical Architecture ☆81 Sep 11, 2025 Updated 5 months ago
- Code for NeurIPS 2024 Paper - Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass ☆21 Aug 22, 2024 Updated last year
- Make reasoning models scalable ☆47 May 31, 2025 Updated 9 months ago
- ☆36 Feb 26, 2024 Updated 2 years ago
- ☆35 Apr 12, 2024 Updated last year
- manipulating cointegrated pairs to achieve a market-neutral strategy that outperforms indices ☆12 Jan 12, 2021 Updated 5 years ago
- Code for NeurIPS 2024 Spotlight: "Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations" ☆92 Oct 30, 2024 Updated last year
- A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks ☆36 Oct 31, 2024 Updated last year
- Building LLMs from scratch following the book from S. Raschka ☆33 Mar 27, 2025 Updated 11 months ago
- Neuro-Symbolic Integration Brings Causal and Reliable Reasoning Proofs ☆41 Feb 15, 2024 Updated 2 years ago
- TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators ☆118 Jun 14, 2025 Updated 8 months ago
- The GraphBench package. ☆27 Updated this week
- Example application for creating an MVC Express + Node + TypeScript app and deploying it to Azure ☆10 Nov 8, 2018 Updated 7 years ago
- 📦 A collection of pastable code gathered from past projects ☆12 Sep 9, 2024 Updated last year
- ☆34 Mar 12, 2025 Updated 11 months ago
- a react scrollable tabs component with many additional features ☆36 Jan 16, 2025 Updated last year
- [NeurIPS 2024] Official implementation of the paper "MambaLRP: Explaining Selective State Space Sequence Models" 🐍 ☆45 Nov 6, 2024 Updated last year
- A neural network layer API and library for sequence modeling, designed for easy creation of sequence models that can be executed layerwis… ☆56 Feb 20, 2026 Updated 2 weeks ago
- [NeurIPS 2025] Think or Not? Selective Reasoning via Reinforcement Learning for Vision-Language Models ☆53 Sep 29, 2025 Updated 5 months ago
- [CVPR 2021] FMO Deblurring Benchmark ☆13 Jan 12, 2022 Updated 4 years ago
- ☆16 Feb 22, 2025 Updated last year
- Project focused on enhancing the quality of low-fidelity endoscopy images using Generative Adversarial Networks (GANs) implemented in PyT… ☆17 Jun 5, 2025 Updated 9 months ago
- Official implementation of the paper "LTrack: Generalizing Multiple Object Tracking to Unseen Domains by Introducing Natural Language Rep… ☆12 Jul 26, 2023 Updated 2 years ago
- Official repo for paper "HiMoE-VLA: Hierarchical Mixture-of-Experts for Generalist Vision-Language-Action Policies" ☆23 Dec 12, 2025 Updated 2 months ago
- Integrating neurosymbolic representations into LLMs for interpretability, steering, and running symbolic algorithms ☆14 Feb 2, 2026 Updated last month
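- Chinese edition of the comprehensive Python resource list, covering web frameworks, web crawlers, web content extraction, template engines, databases, data visualization, image processing, text processing, natural language processing, machine learning, logging, code analysis, and more ☆11 May 24, 2016 Updated 9 years ago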
- Communication Relay by creating a WiFi Mesh Network using ROS, and using that network for Data Telemetry, with Telemetry radios ( Ubiquit… ☆11 Dec 18, 2018 Updated 7 years ago
- [NeurIPS 2025] Official Pytorch Implementation of "The Curse of Depth in Large Language Models" by Wenfang Sun, Xinyuan Song, Pengxiang L… ☆67 Updated this week
- Reasoning-based Evaluation and Ranking of Translations. ☆19 Jul 18, 2025 Updated 7 months ago
- Analyzes target website for anti-scraping protections and performance. Saves screenshots/HTML snapshots. ☆11 Aug 13, 2025 Updated 6 months ago
- A Statistical Arbitrage Strategy to trade Cryptocurrency Pairs ☆13 Nov 6, 2020 Updated 5 years ago
- Computational Neuroscience stuff ☆13 Aug 12, 2019 Updated 6 years ago
- A reducer enhancer for using an xstate chart with redux ☆13 Mar 5, 2018 Updated 8 years ago