☆71 · Aug 6, 2025 · Updated 7 months ago
Alternatives and similar repositories for gpt-oss-reverse-engineering
Users that are interested in gpt-oss-reverse-engineering are comparing it to the libraries listed below
- Aims for memory-efficient training (24GB VRAM) on consumer GPUs. Optimizing language models through guidance tokens in reasoning chains, … ☆29 · Feb 23, 2025 · Updated last year
- ☆11 · Apr 3, 2023 · Updated 2 years ago
- [NeurIPS 2022] Your Transformer May Not be as Powerful as You Expect (official implementation) ☆34 · Aug 6, 2023 · Updated 2 years ago
- Vortex: A Flexible and Efficient Sparse Attention Framework ☆49 · Jan 21, 2026 · Updated 2 months ago
- Adversarially Robust Generalization Just Requires More Unlabeled Data ☆11 · Aug 8, 2019 · Updated 6 years ago
- ☆38 · Aug 7, 2025 · Updated 7 months ago
- Code of ICML paper arxiv.org/abs/2302.08105 ☆14 · May 4, 2023 · Updated 2 years ago
- Code for the paper "The Journey, Not the Destination: How Data Guides Diffusion Models" ☆25 · Dec 12, 2023 · Updated 2 years ago
- The repository contains code for Adaptive Data Optimization ☆32 · Dec 9, 2024 · Updated last year
- Server Usage Documentation of AIR ☆22 · Feb 22, 2023 · Updated 3 years ago
- [NeurIPS-2024] The official Implementation of "Instruction-Guided Visual Masking" ☆42 · Nov 15, 2024 · Updated last year
- Formal Contracts for Multi-Agent Reinforcement Learning ☆19 · Oct 24, 2023 · Updated 2 years ago
- Is In-Context Learning Sufficient for Instruction Following in LLMs? [ICLR 2025] ☆32 · Jan 23, 2025 · Updated last year
- ☆13 · Feb 12, 2023 · Updated 3 years ago
- ☆18 · Mar 18, 2024 · Updated 2 years ago
- Homepage for ProLong (Princeton long-context language models) and paper "How to Train Long-Context Language Models (Effectively)" ☆247 · Sep 12, 2025 · Updated 6 months ago
- ☆10 · Jul 13, 2024 · Updated last year
- [CHIL 2024] Interpretation of Intracardiac Electrograms Through Textual Representations ☆12 · Sep 4, 2024 · Updated last year
- ☆21 · Sep 1, 2025 · Updated 6 months ago
- This repo consists of my implementation of DocFormerV2 ☆11 · Mar 31, 2024 · Updated last year
- [ICML 2021] This is the official github repo for training L_inf dist nets with high certified accuracy. ☆42 · Mar 16, 2022 · Updated 4 years ago
- Model Selection with Large Language Models for Reasoning (EMNLP2023 Findings) ☆30 · Dec 23, 2023 · Updated 2 years ago
- Accelerate LLM preference tuning via prefix sharing with a single line of code ☆51 · Jul 4, 2025 · Updated 8 months ago
- Multi-Level Triton Runner supporting Python, IR, PTX, and cubin. ☆84 · Updated this week
- Official implementation of Panacea: A foundation model for clinical trial design, recruitment, search, and summarization. ☆18 · Dec 24, 2024 · Updated last year
- test images with not appropriate labels in MNIST dataset ☆10 · Mar 3, 2018 · Updated 8 years ago
- Repository for the paper: "TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining" ACL Oral 2025 ☆22 · Mar 6, 2026 · Updated 2 weeks ago
- ☆13 · Sep 2, 2021 · Updated 4 years ago
- ☆15 · Apr 26, 2025 · Updated 10 months ago
- FlexAttention w/ FlashAttention3 Support ☆27 · Oct 5, 2024 · Updated last year
- ☆18 · Oct 30, 2025 · Updated 4 months ago
- Organize the Web: Constructing Domains Enhances Pre-Training Data Curation ☆79 · May 2, 2025 · Updated 10 months ago
- ☆83 · Feb 10, 2026 · Updated last month
- SWE-Swiss: A Multi-Task Fine-Tuning and RL Recipe for High-Performance Issue Resolution ☆103 · Sep 24, 2025 · Updated 5 months ago
- An Open-Source RAG Workload Trace to Optimize RAG Serving Systems ☆35 · Nov 18, 2025 · Updated 4 months ago
- ☆13 · Feb 18, 2025 · Updated last year
- code for the table-based open domain question answering project, with paper title: "Reasoning over Hybrid Chain for Table-and-Text Open D… ☆12 · Sep 16, 2022 · Updated 3 years ago
- codes for paper "AttCAT: Explaining Transformers via Attentive Class Activation Tokens" ☆12 · May 13, 2024 · Updated last year
- ☆109 · Jul 15, 2025 · Updated 8 months ago