bigcode-project/astraios

bigcode-project / astraios

Astraios: Parameter-Efficient Instruction Tuning Code Language Models

☆63

Alternatives and similar repositories for astraios

Users who are interested in astraios are comparing it to the libraries listed below.

all-the-noises / eval-arena
☆34 · Mar 21, 2026 · Updated 3 months ago
bigcode-project / bigcodebench-annotation
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
☆26 · Aug 8, 2024 · Updated last year
ise-uiuc / blazedit
Making code editing up to 7.7x faster using multi-layer speculation
☆23 · Feb 20, 2025 · Updated last year
amazon-science / llm-code-preference
Training and Benchmarking LLMs for Code Preference.
☆38 · Nov 15, 2024 · Updated last year
bigcode-project / octopack
🐙 OctoPack: Instruction Tuning Code Large Language Models
☆479 · Feb 5, 2025 · Updated last year
LJ2lijia / SkCoder
Official implementation of our ICSE 2023 paper on Automatic Code Generation.
☆27 · Nov 8, 2023 · Updated 2 years ago
bigcode-project / the-stack-v2
Code for the curation of The Stack v2 and StarCoder2 training data
☆136 · Apr 11, 2024 · Updated 2 years ago
sail-sg / regmix
[ICLR 2025] 🧬 RegMix: Data Mixture as Regression for Language Model Pre-training (Spotlight)
☆194 · Feb 17, 2025 · Updated last year
ise-uiuc / xft
XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
☆36 · Jul 2, 2024 · Updated 2 years ago
huangd1999 / EffiLearner
[NeurIPS 2024] Self-Optimization Improves the Efficiency of Code Generation
☆15 · May 10, 2025 · Updated last year
JoshuaPurtell / SmallBench
Small, simple agent task environments for training and evaluation
☆20 · Nov 1, 2024 · Updated last year
Muennighoff / FLAN
Provides a minimal implementation to extract FLAN datasets for further processing
☆11 · Feb 1, 2023 · Updated 3 years ago
NJU-iSE / FUEL
This repo is the artifact of FUEL
☆16 · May 19, 2026 · Updated 2 months ago
SMILE-data / SMILE
SMILE: A Multimodal Dataset for Understanding Laughter
☆13 · Jun 15, 2023 · Updated 3 years ago
dashends / CodeSyntax
Code and dataset for EMNLP 2022 Findings paper "Benchmarking Language Models for Code Syntax Understanding"
☆16 · Oct 24, 2022 · Updated 3 years ago
facebookresearch / dmae_st
Directed masked autoencoders
☆14 · Mar 25, 2026 · Updated 3 months ago
FSoft-AI4Code / DocChecker
DocChecker: Bootstrapping Code-Text Pretrained Language Model to Detect Inconsistency Between Code and Comment
☆15 · Jan 23, 2024 · Updated 2 years ago
mrmaheshrajput / productionizing-llms
Code Repository for Blog - How to Productionize Large Language Models (LLMs)
☆12 · Mar 27, 2024 · Updated 2 years ago
Laz4rz / mup
Minimal (truly) muP implementation, consistent with TP4 and TP5 papers notation
☆14 · Jan 2, 2026 · Updated 6 months ago
marcusm117 / IdentityChain
[ICLR 2024] Beyond Accuracy: Evaluating Self-Consistency of Code Large Language Models with IdentityChain
☆11 · Nov 24, 2025 · Updated 7 months ago
Copilot-Language / copilot-verifier
System for verifying the correctness of generated Copilot programs
☆19 · May 8, 2025 · Updated last year
ise-uiuc / WhiteFox
WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models (OOPSLA 2024)
☆84 · Aug 5, 2025 · Updated 11 months ago
cedricrupb / TSSB3M
Mining tool and large-scale datasets of single statement bug fixes in Python
☆19 · Nov 29, 2023 · Updated 2 years ago
koalazf99 / tacube
[EMNLP 2022] TaCube: Pre-computing Data Cubes for Answering Numerical-Reasoning Questions over Tabular Data
☆17 · May 17, 2023 · Updated 3 years ago
gonglinyuan / ast_t5
☆86 · May 10, 2024 · Updated 2 years ago
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31 · Aug 25, 2023 · Updated 2 years ago
treeverse / example-versioning
Data sets and ML models versioning example from DVC get started
☆11 · Jun 4, 2024 · Updated 2 years ago
bigcode-project / selfcodealign
[NeurIPS'24] SelfCodeAlign: Self-Alignment for Code Generation
☆323 · Feb 24, 2025 · Updated last year
SecurityLab-UCD / UniTSyn
[ISSTA'24] A Large-Scale Dataset Capable of Enhancing the Prowess of Large Language Models for Program Testing
☆12 · Jan 7, 2025 · Updated last year
TencentARC / LLaMA-Pro
[ACL 2024] Progressive LLaMA with Block Expansion.
☆513 · May 20, 2024 · Updated 2 years ago
ablghtianyi / ICL_Modular_Arithmetic
☆19 · Mar 25, 2025 · Updated last year
zzjas / anypoc
Generates executable Proof-of-Concept for any bug in any project. AI agents discover and reproduce vulnerabilities — verified, not halluc…
☆27 · May 5, 2026 · Updated 2 months ago
AISE-TUDelft / Capybara-BinT5
Replication package for the SANER 2023 paper titled "Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries"
☆17 · Jul 8, 2024 · Updated 2 years ago
IBM / SALMON
Self-Alignment with Principle-Following Reward Models
☆170 · Sep 18, 2025 · Updated 10 months ago
reddy-lab-code-research / PPOCoder
Code for the TMLR 2023 paper "PPOCoder: Execution-based Code Generation using Deep Reinforcement Learning"
☆116 · Jan 9, 2024 · Updated 2 years ago
bigcode-project / bigcodearena
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution
☆61 · Oct 13, 2025 · Updated 9 months ago
GURPREETKAURJETHRA / Multi-Agent-AI-App
Multi-Agent AI App from Scratch in Python without any framework dependency
☆15 · Jan 7, 2025 · Updated last year
hkust-nlp / deita
Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]
☆599 · Dec 9, 2024 · Updated last year
bigcode-project / bigcode-dataset
☆496 · Aug 15, 2024 · Updated last year