fmchisel: Efficient Compression and Training Algorithms for Foundation Models
☆87May 4, 2026Updated 2 weeks ago
Alternatives and similar repositories for fmchisel
Users that are interested in fmchisel are comparing it to the libraries listed below.
- LLM training parallelisms (DP, FSDP, TP, PP) in pure C☆28Jan 27, 2026Updated 3 months ago
- nv-one-logger enables tracking of GPU application progress over time and can help to identify overhead from workload and cluster ineffici…☆23Nov 6, 2025Updated 6 months ago
- [ICDCS 2023] Evaluation and Optimization of Gradient Compression for Distributed Deep Learning☆10Apr 28, 2023Updated 3 years ago
- API for coordinating Maintenance in Kubernetes.☆26Jul 18, 2025Updated 10 months ago
- codes and plots for "Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs"☆11Dec 30, 2024Updated last year
- A Telegram bot to attach a banner about Yalda on your avatar.☆13Feb 10, 2023Updated 3 years ago
- Make triton easier☆50Jun 12, 2024Updated last year
- QuantEase, a layer-wise quantization framework, frames the problem as discrete-structured non-convex optimization. Our work leverages Coo…☆19Feb 22, 2024Updated 2 years ago
- A machine learning framework with readable source code☆15Apr 30, 2025Updated last year
- Cataloging released Triton kernels.☆303Sep 9, 2025Updated 8 months ago
- ☆12Sep 18, 2024Updated last year
- ☆24Sep 11, 2025Updated 8 months ago
- 🧜‍♀️ Pi extension that renders Mermaid diagrams as ASCII in the TUI, with width-aware output and safe handling for larger diagrams.☆64Feb 23, 2026Updated 3 months ago
- diffusers with search engine☆12Jan 13, 2026Updated 4 months ago
- [ACL 2025] Official implementation of the "CoT-ICL Lab" framework☆11May 1, 2026Updated 3 weeks ago
- ☆14Mar 8, 2025Updated last year
- walterra's collections of helpers for agentic coding☆34Mar 23, 2026Updated 2 months ago
- Framework for Algorithmic Correctness Testing of Operators☆17Mar 9, 2026Updated 2 months ago
- Transcripts of Democratic Debates as R Package☆10Jun 17, 2020Updated 5 years ago
- Exploring how optimizations for GEMMs work☆33Feb 28, 2026Updated 2 months ago
- A docker image for One Student One Chip's debug exam☆10Sep 22, 2023Updated 2 years ago
- (🔥ICML2026) Reward Auditor: Inference on Reward Modeling Suitability in Real-World Perturbed Scenarios☆35Jan 24, 2026Updated 4 months ago
- DuaLip: Dual Decomposition based Linear Program Solver☆71May 13, 2026Updated last week
- Harbin Institute of Technology (Shenzhen), Fall 2021 semester deep learning architecture lab☆17Oct 1, 2022Updated 3 years ago
- ☆12Mar 7, 2024Updated 2 years ago
- CLI-first runtime for Codex, Claude Code, and AI agents to operate CAE solvers via plugins: COMSOL, Abaqus, Ansys.☆90May 16, 2026Updated last week
- AI Coach Powered by Cole's Content (RAG AI Agent)☆52Oct 26, 2025Updated 6 months ago
- A repo explaining with an example how to extend the kubernetes default scheduler☆17Jul 11, 2019Updated 6 years ago
- A Python Snowpark CLI for loading the TPC-DI dataset into Snowflake. Additional dbt models for building the data warehouse.☆11Sep 4, 2025Updated 8 months ago
- a student training project for HLS and transformer☆11Oct 19, 2022Updated 3 years ago
- Code for paper "Conversational Product Search Based on Negative Feedback"☆12Jun 26, 2020Updated 5 years ago
- a collection of skills for vllm-omni☆67Updated this week
- ☆14Jun 22, 2022Updated 3 years ago
- ☆22May 5, 2025Updated last year
- [DATE'2025, TCAD'2025] Terafly : A Multi-Node FPGA Based Accelerator Design for Efficient Cooperative Inference in LLMs☆36Nov 13, 2025Updated 6 months ago
- Code on IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems (WWW 2020)☆11Apr 18, 2021Updated 5 years ago
- A single-file educational implementation for understanding vLLM's core concepts and running LLM inference.☆43Apr 7, 2026Updated last month
- Causal Analysis of Agent Behavior for AI Safety☆20Jun 27, 2023Updated 2 years ago
- ☆18Nov 11, 2025Updated 6 months ago