bigcode-project/bigcode-analysis

Repository for analysis and experiments in the BigCode project.

☆126

Alternatives and similar repositories for bigcode-analysis

Users who are interested in bigcode-analysis are comparing it to the repositories listed below.

bigcode-project / bigcode-dataset
☆496 · Aug 15, 2024 · Updated last year
bigcode-project / bigcode-tokenizer
☆15 · Oct 24, 2023 · Updated 2 years ago
bigcode-project / bigcode-inference-benchmark
☆19 · Aug 10, 2024 · Updated last year
bigcode-project / bigcode-evaluation-harness
A framework for the evaluation of autoregressive code generation language models.
☆1,052 · Jul 22, 2025 · Updated 11 months ago
bigcode-project / bigcode-encoder
☆32 · Jul 24, 2023 · Updated 2 years ago
bigcode-project / transformers
☆26 · Mar 6, 2024 · Updated 2 years ago
bigscience-workshop / data-preparation
Code used for sourcing and cleaning the BigScience ROOTS corpus
☆318 · Mar 20, 2023 · Updated 3 years ago
huggingface / datablations
Scaling Data-Constrained Language Models
☆344 · Jun 28, 2025 · Updated last year
ChenghaoMou / text-dedup
All-in-one text de-duplication
☆764 · Mar 9, 2026 · Updated 4 months ago
serega / gaoya
Locality Sensitive Hashing
☆81 · May 29, 2026 · Updated last month
thesephist / hfm
Hugging Face Download (Cache) Manager
☆22 · Aug 7, 2022 · Updated 3 years ago
bigcode-project / bigcode-website
Source of the website of the BigCode project.
☆22 · Updated this week
pietrolesci / anchoral
This is the official PyTorch implementation for our NAACL 2024 paper: "AnchorAL: Computationally Efficient Active Learning for Large and …
☆22 · Apr 15, 2025 · Updated last year
VITA-Group / ChainCoder
[ICML 2023] "Outline, Then Details: Syntactically Guided Coarse-To-Fine Code Generation", Wenqing Zheng, S P Sharan, Ajay Kumar Jaiswal, …
☆43 · Nov 9, 2023 · Updated 2 years ago
RAIVNLab / AdANNS
Code repository for the paper - "AdANNS: A Framework for Adaptive Semantic Search"
☆69 · Oct 10, 2023 · Updated 2 years ago
pacman100 / peft-codegen-25
☆23 · Jul 10, 2023 · Updated 3 years ago
slikts / gh-minimap
Source code 💻 minimap 🗺️ extension for GitHub 🙈
☆11 · Sep 17, 2018 · Updated 7 years ago
bigcode-project / Megatron-LM
Ongoing research training transformer models at scale
☆396 · Aug 20, 2024 · Updated last year
s-smits / grpo-optuna
Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna
☆60 · Oct 18, 2025 · Updated 9 months ago
huggingface / ember
ANE accelerated embedding models!
☆20 · Dec 11, 2024 · Updated last year
rawsh / mirrorllm
various experiments for scaling inference time compute with small reasoning models
☆17 · Jan 16, 2025 · Updated last year
bigcode-project / the-stack-v2
Code for the curation of The Stack v2 and StarCoder2 training data
☆136 · Apr 11, 2024 · Updated 2 years ago
apoorvumang / knowledge-cutoff
Benchmark to measure what the real knowledge cutoff of a model is
☆15 · Jul 10, 2026 · Updated last week
yixiaoer / mistral-v0.2-jax
JAX implementation of the Mistral 7b v0.2 model
☆35 · Jul 3, 2024 · Updated 2 years ago
salesforce / CodeRL
This is the official code for the paper CodeRL: Mastering Code Generation through Pretrained Models and Deep Reinforcement Learning (Neur…
☆573 · Jun 2, 2026 · Updated last month
mikex86 / tritonc
Standalone commandline CLI tool for compiling Triton kernels
☆20 · Sep 13, 2024 · Updated last year
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆104 · Jan 15, 2024 · Updated 2 years ago
jeffchuber / chroma-demo
☆12 · Jun 2, 2023 · Updated 3 years ago
abacaj / code-eval
Run evaluation on LLMs using human-eval benchmark
☆429 · Sep 12, 2023 · Updated 2 years ago
loubnabnl / santacoder-finetuning
Fine-tune SantaCoder for Code/Text Generation.
☆196 · Apr 11, 2023 · Updated 3 years ago
marepilc / pink-parquet
User-friendly viewer for Parquet files
☆16 · May 8, 2026 · Updated 2 months ago
dpfried / incoder
Generative model for code infilling and synthesis
☆312 · Sep 9, 2023 · Updated 2 years ago
bmschmidt / pySRP
Python Module implementing SRP
☆12 · Jul 29, 2022 · Updated 3 years ago
scottlogic-alex / prm800k-denorm
Script for processing OpenAI's PRM800K process supervision dataset into an Alpaca-style instruction-response format
☆27 · Jul 12, 2023 · Updated 3 years ago
bigcode-project / octopack
🐙 OctoPack: Instruction Tuning Code Large Language Models
☆479 · Feb 5, 2025 · Updated last year
stanford-cs324 / winter2023
☆39 · Feb 27, 2023 · Updated 3 years ago
MeLeLBGU / SaGe
Code for SaGe subword tokenizer (EACL 2023)
☆28 · Nov 30, 2024 · Updated last year
bigcode-project / opt-out-v2
Repository for opt-out requests.
☆10 · Mar 25, 2024 · Updated 2 years ago
kuleshov-group / MODULoRA-Experiment
Evaluation Code repository for the paper "ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers". (2023…
☆13 · Dec 5, 2023 · Updated 2 years ago