Data preparation code for CrystalCoder 7B LLM
☆44 · May 10, 2024 · Updated last year
Alternatives and similar repositories for crystalcoder-data-prep
Users that are interested in crystalcoder-data-prep are comparing it to the libraries listed below.
- Pre-training code for CrystalCoder 7B LLM ☆58 · May 10, 2024 · Updated last year
- Data preparation code for Amber 7B LLM ☆94 · May 10, 2024 · Updated last year
- Pre-training code for Amber 7B LLM ☆173 · May 10, 2024 · Updated last year
- Open Implementations of LLM Analyses ☆108 · Oct 8, 2024 · Updated last year
- ☆236 · May 10, 2024 · Updated last year
- A list where most values will be None (or default) ☆11 · Jul 19, 2023 · Updated 2 years ago
- An open-source conversational language model developed by the Knowledge Works Research Laboratory at Fudan University. ☆64 · Oct 12, 2023 · Updated 2 years ago
- A collection of CLI LLM tools that I built and use daily ☆15 · Aug 7, 2024 · Updated last year
- [NeurIPS 2024 poster] Cross-model Control: Improving Multiple Large Language Models in One-time Training ☆14 · Oct 25, 2024 · Updated last year
- [NAACL 2025] Representing Rule-based Chatbots with Transformers ☆23 · Feb 9, 2025 · Updated last year
- This repository contains the replication package of our paper "Assessing the Security of GitHub Copilot’s Generated Code - A Targeted Rep… ☆10 · Nov 16, 2023 · Updated 2 years ago
- INDICT: Code Generation with Internal Dialogues of Critiques for Both Security and Helpfulness ☆14 · Nov 10, 2025 · Updated 5 months ago
- UM1 test programs and sample code ☆11 · Jul 25, 2022 · Updated 3 years ago
- [ICML'25] MELON: Provable Defense Against Indirect Prompt Injection Attacks in AI Agents ☆24 · Jul 31, 2025 · Updated 8 months ago
- Minimum Description Length probing for neural network representations ☆20 · Jan 28, 2025 · Updated last year
- ☆15 · Oct 2, 2024 · Updated last year
- Implementation of "LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models" ☆40 · Nov 11, 2024 · Updated last year
- Text-2-SQL ☆19 · Feb 21, 2025 · Updated last year
- [Findings of EMNLP22] From Mimicking to Integrating: Knowledge Integration for Pre-Trained Language Models ☆19 · Mar 16, 2023 · Updated 3 years ago
- ☆20 · Dec 14, 2024 · Updated last year
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks ☆178 · Sep 15, 2023 · Updated 2 years ago
- BH hackathon ☆14 · Apr 4, 2024 · Updated 2 years ago
- Source code for paper: Knowledge Inheritance for Pre-trained Language Models ☆38 · Apr 24, 2022 · Updated 3 years ago
- This repository is the official implementation of the TRAC optimizer in Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement … ☆34 · May 2, 2025 · Updated 11 months ago
- Ongoing research training transformer models at scale ☆44 · Mar 26, 2026 · Updated 2 weeks ago
- A curated list of my GitHub stars ☆15 · Mar 14, 2025 · Updated last year
- OOPSLA 2019 Artifact for AutoPandas. Website at https://rbavishi.github.io/autopandas ☆31 · Nov 21, 2022 · Updated 3 years ago
- ☆19 · Aug 23, 2025 · Updated 7 months ago
- Paper notes for my PhD on Machine Learning (mostly focused on Reinforcement Learning) ☆17 · Jul 22, 2019 · Updated 6 years ago
- Tiny evaluation of leading LLMs on competitive programming problems ☆14 · Nov 28, 2024 · Updated last year
- 🔮✍🏻 Automatically organize, analyze, and augment the quality of your obsidian.md notes with AI. ☆17 · Aug 28, 2024 · Updated last year
- OpenSource deployment made easy ☆10 · Jun 13, 2015 · Updated 10 years ago
- Official repository for "Reweighting Strategy based on Synthetic Data Identification for Sentence Similarity (COLING2022)" ☆18 · Sep 4, 2022 · Updated 3 years ago
- Build a level 1 coding agent. ☆17 · Jan 28, 2025 · Updated last year
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning ☆46 · Jul 17, 2025 · Updated 8 months ago
- Examples for using the SiLLM framework for training and running Large Language Models (LLMs) on Apple Silicon ☆16 · May 8, 2025 · Updated 11 months ago
- ☆19 · Dec 31, 2025 · Updated 3 months ago
- AIxCC: automated vulnerability repair via LLMs, search, and static analysis ☆12 · Jul 16, 2024 · Updated last year
- A Dataset of 600k Java Source Code Changes Categorized by Diff Size http://arxiv.org/pdf/2108.04631 ☆23 · Mar 22, 2024 · Updated 2 years ago