Data preparation code for Amber 7B LLM
☆94 · May 10, 2024 · Updated last year
Alternatives and similar repositories for amber-data-prep
Users that are interested in amber-data-prep are comparing it to the libraries listed below.
- Data preparation code for CrystalCoder 7B LLM ☆44 · May 10, 2024 · Updated last year
- Pre-training code for Amber 7B LLM ☆173 · May 10, 2024 · Updated last year
- Pre-training code for CrystalCoder 7B LLM ☆58 · May 10, 2024 · Updated last year
- Open Implementations of LLM Analyses ☆108 · Oct 8, 2024 · Updated last year
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668. ☆12 · Jun 19, 2024 · Updated last year
- QuoteSum is a textual QA dataset containing Semi-Extractive Multi-source Question Answering (SEMQA) examples written by humans, based on … ☆13 · Mar 25, 2024 · Updated 2 years ago
- 🚀 End-to-end examples and analysis of deploying LLMs serverless using Modal, Runpod, and Beam ☆28 · Mar 25, 2024 · Updated 2 years ago
- MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases. In ICML 2024. ☆1,420 · Apr 21, 2025 · Updated 11 months ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 · Aug 15, 2023 · Updated 2 years ago
- ☆35 · Jun 3, 2025 · Updated 10 months ago
- ☆12 · Feb 14, 2024 · Updated 2 years ago
- Reproducible and flexible LLM evaluations for scientific reasoning. ☆27 · Jul 23, 2025 · Updated 8 months ago
- LMTuner: Make the LLM Better for Everyone ☆38 · Sep 21, 2023 · Updated 2 years ago
- train with kittens! ☆64 · Oct 25, 2024 · Updated last year
- Reaching LLaMA2 Performance with 0.1M Dollars ☆988 · Jul 23, 2024 · Updated last year
- TensorFlow implementation of the "Prompt-to-Prompt Image Editing with Cross Attention Control" for Stable Diffusion ☆16 · Mar 25, 2023 · Updated 3 years ago
- ☆206 · Apr 19, 2025 · Updated 11 months ago
- Code and dataset for EMNLP 2022 Findings paper "Benchmarking Language Models for Code Syntax Understanding" ☆16 · Oct 24, 2022 · Updated 3 years ago
- ☆54 · Jun 6, 2024 · Updated last year
- Data and tools for generating and inspecting OLMo pre-training data. ☆1,472 · Nov 5, 2025 · Updated 5 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization ☆82 · Dec 25, 2025 · Updated 3 months ago
- ☆15 · Feb 21, 2024 · Updated 2 years ago
- ☆56 · Jun 26, 2025 · Updated 9 months ago
- Terminal Image Viewer for iTerm2 ☆12 · Jul 6, 2019 · Updated 6 years ago
- Code of our paper "Method-Level Bug Severity Prediction using Source Code Metrics and LLMs" which is accepted to ISSRE 2023. ☆10 · Nov 12, 2023 · Updated 2 years ago
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference… ☆31 · Nov 14, 2023 · Updated 2 years ago
- ☆38 · May 2, 2024 · Updated last year
- ☆94 · Oct 5, 2023 · Updated 2 years ago
- GPT-J 6B inference on TensorRT with INT-8 precision ☆11 · Apr 5, 2023 · Updated 3 years ago
- codebase release for EMNLP2023 paper publication ☆19 · Sep 18, 2025 · Updated 6 months ago
- A family of open-sourced Mixture-of-Experts (MoE) Large Language Models ☆1,673 · Mar 8, 2024 · Updated 2 years ago
- ComfyUI custom node to extend Wan videos in loops with overlap consistency, per loop prompts, and optional LoRA control. ☆26 · Nov 29, 2025 · Updated 4 months ago
- Freeing data processing from scripting madness by providing a set of platform-agnostic customizable pipeline processing blocks. ☆2,978 · Apr 2, 2026 · Updated last week
- Documenting large text datasets 🖼️ 📚 ☆14 · Dec 17, 2024 · Updated last year
- ☆13 · Oct 20, 2022 · Updated 3 years ago
- A collection of CLI LLM tools that I built and use daily ☆15 · Aug 7, 2024 · Updated last year
- This repository contains code and tooling for the Abacus.AI LLM Context Expansion project. Also included are evaluation scripts and bench… ☆601 · Nov 17, 2023 · Updated 2 years ago
- Hugging Face and Pyserini interoperability ☆19 · May 18, 2023 · Updated 2 years ago
- Official repository for paper "ReasonIR: Training Retrievers for Reasoning Tasks". ☆225 · Jun 24, 2025 · Updated 9 months ago