Data preparation code for CrystalCoder 7B LLM
☆43May 10, 2024Updated last year
Alternatives and similar repositories for crystalcoder-data-prep
Users who are interested in crystalcoder-data-prep are comparing it to the libraries listed below
- Pre-training code for CrystalCoder 7B LLM☆57May 10, 2024Updated last year
- Data preparation code for Amber 7B LLM☆93May 10, 2024Updated last year
- Pre-training code for Amber 7B LLM☆172May 10, 2024Updated last year
- Open Implementations of LLM Analyses☆107Oct 8, 2024Updated last year
- An open-source conversational language model developed by the Knowledge Works Research Laboratory at Fudan University.☆64Oct 12, 2023Updated 2 years ago
- Hutter Prize Submission☆14Aug 9, 2021Updated 4 years ago
- ☆25Jun 10, 2025Updated 8 months ago
- Minimum Description Length probing for neural network representations☆20Jan 28, 2025Updated last year
- This repository is the official implementation of the TRAC optimizer in Fast TRAC: A Parameter-Free Optimizer for Lifelong Reinforcement …☆32May 2, 2025Updated 9 months ago
- A simple database optimized for returning results by custom scoring functions.☆21Mar 29, 2016Updated 9 years ago
- Feedback-Driven Tool-Use Improvements in Large Language Models via Automated Build Environments☆48Jan 8, 2026Updated last month
- M2-Reasoning: Empowering MLLMs with Unified General and Spatial Reasoning☆46Jul 17, 2025Updated 7 months ago
- An open-source framework for building monolithic or distributed agentic systems, ranging from simple LLM calls to compositional workflows…☆25Jan 14, 2026Updated last month
- DELT: Data Efficacy for Language Model Training☆43Feb 12, 2026Updated 2 weeks ago
- [ECCV 2024] PanoFree: Tuning-Free Holistic Multi-view Image Generation with Cross-view Self-Guidance☆23Jul 25, 2024Updated last year
- Source code for the paper "Automatic Prompt Augmentation and Selection with Chain-of-Thought from Labeled Data"☆20Feb 24, 2024Updated 2 years ago
- ☆33Aug 9, 2024Updated last year
- WideSearch: Benchmarking Agentic Broad Info-Seeking☆120Oct 9, 2025Updated 4 months ago
- a Fine-tuned LLaMA that is Good at Arithmetic Tasks☆178Sep 15, 2023Updated 2 years ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM☆60May 28, 2024Updated last year
- 🚀🤗 A collection of templates for Hugging Face Spaces☆35Oct 9, 2023Updated 2 years ago
- A Python script that generates a list of pairs of funny words for naming things such as app releases, internal projects, servers and chil…☆26Nov 13, 2016Updated 9 years ago
- ☆39May 20, 2025Updated 9 months ago
- PreAct: Prediction Enhances Agent's Planning Ability (Coling2025)☆30Dec 12, 2024Updated last year
- An open-source session replay tool for single-page applications that uses AI analysis, aggregated trends, and a RAG chatbot to help devel…☆11Jan 23, 2026Updated last month
- edaSQL is a Python library to bridge SQL with Exploratory Data Analysis where you can connect to the Database and insert the queries.…☆10Nov 14, 2021Updated 4 years ago
- A lightweight implementation of shapes drawn across a geo-temporal plane.☆12Jan 27, 2026Updated last month
- Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales☆32Jul 17, 2023Updated 2 years ago
- Ongoing research training transformer language models at scale, including: BERT & GPT-2☆69Jul 20, 2023Updated 2 years ago
- LSTM-based dependency graph parser with Bi-LSTM Subtraction and Incremental Tree-LSTM☆28Dec 13, 2017Updated 8 years ago
- Tiny AutoEncoder for Stable Diffusion Videos☆36Oct 5, 2024Updated last year
- Collaborative Training of Large Language Models in an Efficient Way☆419Aug 28, 2024Updated last year
- Code for the paper "Rethinking Benchmark and Contamination for Language Models with Rephrased Samples"☆316Dec 20, 2023Updated 2 years ago
- Simplifies data migration between Apache Ignite clusters by relying on Apache Avro as an intermediate storage format☆13Jun 27, 2023Updated 2 years ago
- ☆36Sep 6, 2024Updated last year
- Implementation of PCA algorithm using Gram-Schmidt modification on NIPALS☆10Jun 13, 2015Updated 10 years ago
- Sparsey, trademark Neurithmic Systems, is an unsupervised learning algorithm inspired by the computations of cortical macro-columns and mi…☆12Feb 27, 2023Updated 3 years ago
- [NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model☆65Oct 26, 2025Updated 4 months ago
- An Educational Framework Based on PyTorch for Deep Learning Education and Exploration☆10Dec 24, 2023Updated 2 years ago