The official code of "Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers"
☆20 · Jul 24, 2024 · Updated last year
Alternatives and similar repositories for StructuredFFN
Users that are interested in StructuredFFN are comparing it to the libraries listed below.
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok … ☆30 · Dec 8, 2025 · Updated 6 months ago
- ☆72 · Nov 15, 2024 · Updated last year
- ☆13 · Jul 3, 2024 · Updated last year
- Automatically exported from code.google.com/p/transducersaurus ☆11 · Apr 1, 2015 · Updated 11 years ago
- [ICML-2025] We introduce Lie group Relative position Encodings (LieRE) that goes beyond RoPE in supporting n-dimensional inputs. ☆14 · Aug 8, 2025 · Updated 10 months ago
- ☆17 · Oct 27, 2024 · Updated last year
- Source code accompanying the NeurIPS 2022 paper "Learning Partial Equivariances From Data" ☆10 · Nov 18, 2022 · Updated 3 years ago
- Training, optimization and deployment of Object Detection model with dinov2 backbone for efficient inference on NVIDIA Jetson ☆14 · Jul 26, 2025 · Updated 10 months ago
- ☆24 · May 25, 2024 · Updated 2 years ago
- Model configurations for scaling SE models in the paper "Beyond Performance Plateaus: A Comprehensive Study on Scalability in Speech Enha… ☆41 · Aug 7, 2024 · Updated last year
- This repository provides a collection of LaTeX class templates designed to enhance the clarity and conciseness of the main.tex files. It … ☆13 · Nov 13, 2025 · Updated 6 months ago
- CLI tool for submitting GPU kernels ☆13 · Apr 28, 2026 · Updated last month
- Experiments Notebook of "Understanding the Skill Gap in Recurrent Language Models: The Role of the Gather-and-Aggregate Mechanism" ☆15 · Apr 30, 2025 · Updated last year
- Reasoning-based Evaluation and Ranking of Translations. ☆20 · Updated this week
- Structured Neuron Level Pruning to compress Transformer-based models [ECCV'24] ☆16 · Aug 7, 2024 · Updated last year
- Code for [Re] On the Reproducibility of Post-Hoc Concept Bottleneck Models. ☆13 · Nov 27, 2024 · Updated last year
- manipulating cointegrated pairs to achieve a market-neutral strategy that outperforms indices ☆11 · Jan 12, 2021 · Updated 5 years ago
- 🧞♀️ Discover AI-generated programming project ideas ☆10 · Sep 2, 2022 · Updated 3 years ago
- gRPC server over a FAISS index ☆19 · Aug 19, 2021 · Updated 4 years ago
- coloring terminal text with intensities (used for plotting probability, entropy with tokens) ☆12 · Oct 11, 2024 · Updated last year
- A few models converted from caffe to CoreMLs format. ☆15 · Jun 6, 2017 · Updated 9 years ago
- ☆21 · Oct 1, 2024 · Updated last year
- Scratchpad/Chain-of-Thought Prompts ☆12 · Jun 6, 2022 · Updated 4 years ago
- Mastodon server running for the Doubanius Tertius project ☆10 · Apr 4, 2022 · Updated 4 years ago
- Combining SOAP and MUON ☆22 · Feb 11, 2025 · Updated last year
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models" ☆13 · Jul 18, 2024 · Updated last year
- Code to generate visual metamers via foveated feed-forward style transfer (ICLR 2019) ☆19 · Apr 13, 2021 · Updated 5 years ago
- A Python interface to OpenFst (fix FstDrawer interface issue for 1.6 version) ☆17 · Apr 2, 2018 · Updated 8 years ago
- ☆15 · Mar 2, 2025 · Updated last year
- Fast and differentiable hidden Markov model in C++ ☆19 · Jan 20, 2023 · Updated 3 years ago
- Elucidated Dataset Condensation (NeurIPS 2024) ☆20 · Oct 5, 2024 · Updated last year
- A Statistical Arbitrage Strategy to trade Cryptocurrency Pairs ☆13 · Nov 6, 2020 · Updated 5 years ago
- Official implementation for LaCo (EMNLP 2024 Findings) ☆22 · Oct 3, 2024 · Updated last year
- Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation" ☆14 · May 26, 2025 · Updated last year
- Uses Processing and Perlin Noise to generate a procedural 2D rendering of different landscapes, which are then rendered into 3D ☆16 · Aug 14, 2018 · Updated 7 years ago
- Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification ☆13 · Feb 5, 2022 · Updated 4 years ago
- This is the code which powers the Twitter Bot https://twitter.com/RGB_Colours ☆15 · Apr 14, 2017 · Updated 9 years ago
- ☆63 · Oct 3, 2024 · Updated last year
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023] ☆29 · Sep 22, 2023 · Updated 2 years ago