The official code of "Building on Efficient Foundations: Effectively Training LLMs with Structured Feedforward Layers"
☆20Jul 24, 2024Updated last year
Alternatives and similar repositories for StructuredFFN
Users that are interested in StructuredFFN are comparing it to the libraries listed below.
- Official codebase for "Quantile Reward Policy Optimization: Alignment with Pointwise Regression and Exact Partition Functions" (Matrenok …☆30Dec 8, 2025Updated 5 months ago
- ☆13Jul 3, 2024Updated last year
- Automatically exported from code.google.com/p/transducersaurus☆11Apr 1, 2015Updated 11 years ago
- ☆15Apr 25, 2023Updated 3 years ago
- Training, optimization and deployment of Object Detection model with dinov2 backbone for efficient inference on NVIDIA Jetson☆13Jul 26, 2025Updated 9 months ago
- This repository provides a collection of LaTeX class templates designed to enhance the clarity and conciseness of the main.tex files. It …☆13Nov 13, 2025Updated 6 months ago
- CLI tool for submitting GPU kernels☆13Apr 28, 2026Updated 3 weeks ago
- Reasoning-based Evaluation and Ranking of Translations.☆20Jul 18, 2025Updated 10 months ago
- torchvision-based transforms that provide access to parameterization☆16Dec 4, 2025Updated 5 months ago
- Structured Neuron Level Pruning to compress Transformer-based models [ECCV'24]☆16Aug 7, 2024Updated last year
- manipulating cointegrated pairs to achieve a market-neutral strategy that outperforms indices☆10Jan 12, 2021Updated 5 years ago
- [NeurIPS 2024] "AmoebaLLM: Constructing Any-Shape Large Language Models for Efficient and Instant Deployment" by Yonggan Fu, Zhongzhi Yu,…☆19Dec 13, 2024Updated last year
- Continuous regular group convolutions for Pytorch☆12Jun 9, 2024Updated last year
- Official Implementation of paper "Distilling Long-tailed Datasets" [CVPR 2025]☆21Aug 13, 2025Updated 9 months ago
- coloring terminal text with intensities (used for plotting probability, entropy with tokens)☆12Oct 11, 2024Updated last year
- Azure DevOps Inventory .NET Tool – Inventories and documents an Azure DevOps organization by generating a set of Markdown files for the s…☆13May 13, 2026Updated last week
- ☆21Oct 1, 2024Updated last year
- Some benchmarks☆12Sep 19, 2019Updated 6 years ago
- Efficiently discovering algorithms via LLMs with evolutionary search and reinforcement learning.☆137Nov 18, 2025Updated 6 months ago
- Mastodon server running for the Doubanius Tertius project☆10Apr 4, 2022Updated 4 years ago
- [ICLR 2025] SDTT: a simple and effective distillation method for discrete diffusion models☆51Feb 26, 2026Updated 2 months ago
- Code to reproduce key results accompanying "SAEs (usually) Transfer Between Base and Chat Models"☆13Jul 18, 2024Updated last year
- A Visual Studio debugger extension for viewing SkiaSharp bitmaps and images.☆15Apr 19, 2024Updated 2 years ago
- Official code for the ECCV 2024 paper "Self-Adapting Large Visual-Language Models to Edge Devices across Visual Modalities"☆26Oct 1, 2024Updated last year
- Official implementation for LaCo (EMNLP 2024 Findings)☆21Oct 3, 2024Updated last year
- [CVPR 2025] Official repository for GETA☆42Nov 5, 2025Updated 6 months ago
- Fast and differentiable hidden Markov model in C++☆19Jan 20, 2023Updated 3 years ago
- An example application that uses SkiaSharp with Wpf☆15Apr 1, 2016Updated 10 years ago
- Elucidated Dataset Condensation (NeurIPS 2024)☆20Oct 5, 2024Updated last year
- ☆14Jan 22, 2025Updated last year
- A Statistical Arbitrage Strategy to trade Cryptocurrency Pairs☆14Nov 6, 2020Updated 5 years ago
- Uses Processing and Perlin Noise to generate a procedural 2D rendering of different landscapes, which are then rendered into 3D☆16Aug 14, 2018Updated 7 years ago
- Source code for the paper "Positional Attention: Expressivity and Learnability of Algorithmic Computation"☆14May 26, 2025Updated 11 months ago
- This repo contains the code for studying the interplay between quantization and sparsity methods☆26Feb 26, 2025Updated last year
- Temporal Alignment Prediction for Supervised Representation Learning and Few-Shot Sequence Classification☆13Feb 5, 2022Updated 4 years ago
- This is the code which powers the Twitter Bot https://twitter.com/RGB_Colours☆15Apr 14, 2017Updated 9 years ago
- Minimal implementation of TokenFormer for inference and learning☆13Nov 6, 2024Updated last year
- ☆63Oct 3, 2024Updated last year
- Sharpness-Aware Minimization Leads to Low-Rank Features [NeurIPS 2023]☆29Sep 22, 2023Updated 2 years ago