Implementation of the Llama architecture with RLHF + Q-learning
★170 · Feb 1, 2025 · Updated last year
Alternatives and similar repositories for llama-qrlhf
Users that are interested in llama-qrlhf are comparing it to the libraries listed below.
- Implementation of GateLoop Transformer in Pytorch and Jax ★92 · Jun 18, 2024 · Updated last year
- Implementation of 🌻 Mirasol, SOTA Multimodal Autoregressive model out of Google Deepmind, in Pytorch ★92 · Dec 22, 2023 · Updated 2 years ago
- Implementation of Gradient Agreement Filtering, from Chaubard et al. of Stanford, but for single machine microbatches, in Pytorch ★25 · Jan 21, 2025 · Updated last year
- Implementation of the Kalman Filtering Attention proposed in "Kalman Filtering Attention for User Behavior Modeling in CTR Prediction" ★59 · Oct 22, 2023 · Updated 2 years ago
- Just some miscellaneous utility functions / decorators / modules related to Pytorch and Accelerate to help speed up implementation of new… ★126 · Jul 26, 2024 · Updated last year
- A practical implementation of GradNorm, Gradient Normalization for Adaptive Loss Balancing, in Pytorch ★127 · Aug 25, 2025 · Updated 7 months ago
- CUDA implementation of autoregressive linear attention, with all the latest research findings ★46 · May 23, 2023 · Updated 2 years ago
- Yet another random morning idea to be quickly tried and architecture shared if it works; to allow the transformer to pause for any amount… ★53 · Oct 22, 2023 · Updated 2 years ago
- Some personal experiments around routing tokens to different autoregressive attention, akin to mixture-of-experts ★123 · Oct 17, 2024 · Updated last year
- Explorations into whether a transformer with RL can direct a genetic algorithm to converge faster ★71 · May 18, 2025 · Updated 10 months ago
- Implementation and explorations into Blackbox Gradient Sensing (BGS), an evolutionary strategies approach proposed in a Google Deepmind p… ★20 · Jul 20, 2025 · Updated 8 months ago
- Implementation of Infini-Transformer in Pytorch ★112 · Jan 4, 2025 · Updated last year
- Implementation of an Attention layer where each head can attend to more than just one token, using coordinate descent to pick topk ★47 · Jul 16, 2023 · Updated 2 years ago
- Standalone Product Key Memory module in Pytorch - for augmenting Transformer models ★87 · Nov 1, 2025 · Updated 4 months ago
- Implementation of CALM from the paper "LLM Augmented LLMs: Expanding Capabilities through Composition", out of Google Deepmind ★179 · Sep 12, 2024 · Updated last year
- Implementation of Perceiver AR, Deepmind's new long-context attention network based on Perceiver architecture, in Pytorch ★94 · Apr 10, 2023 · Updated 2 years ago
- Experiments around a simple idea for inducing multiple hierarchical predictive model within a GPT ★225 · Updated this week
- Implementation of Soft Actor Critic and some of its improvements in Pytorch ★63 · Dec 29, 2025 · Updated 3 months ago
- Implementation of the new SOTA for model based RL, from the paper "Improving Transformer World Models for Data-Efficient RL", in Pytorch ★153 · May 2, 2025 · Updated 10 months ago
- Implementation of Mind Evolution, Evolving Deeper LLM Thinking, from Deepmind ★59 · May 31, 2025 · Updated 9 months ago
- Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences" ★70 · Apr 10, 2023 · Updated 2 years ago
- ★62 · Dec 8, 2023 · Updated 2 years ago
- Exploration into the Firefly algorithm in Pytorch ★41 · Feb 14, 2025 · Updated last year
- Implementation of TiTok, proposed by Bytedance in "An Image is Worth 32 Tokens for Reconstruction and Generation" ★182 · Jun 20, 2024 · Updated last year
- Implementation of Agent Attention in Pytorch ★93 · Jul 10, 2024 · Updated last year
- My explorations into editing the knowledge and memories of an attention network ★35 · Dec 8, 2022 · Updated 3 years ago
- Associative scan package for DRYing some code between repos ★18 · Jan 5, 2026 · Updated 2 months ago
- My attempts at applying Soundstream design on learned tokenization of text and then applying hierarchical attention to text generation ★90 · Oct 11, 2024 · Updated last year
- Toy genetic algorithm in Pytorch ★55 · Apr 29, 2025 · Updated 11 months ago
- [WIP] Transformer to embed Danbooru labelsets ★13 · Mar 31, 2024 · Updated last year
- An attempt to merge ESBN with Transformers, to endow Transformers with the ability to emergently bind symbols ★16 · Aug 3, 2021 · Updated 4 years ago
- Exploring an idea where one forgets about efficiency and carries out attention across each edge of the nodes (tokens) ★55 · Mar 25, 2025 · Updated last year
- Explorations into the recently proposed Taylor Series Linear Attention ★100 · Aug 18, 2024 · Updated last year
- My implementation of the model KosmosG from "KOSMOS-G: Generating Images in Context with Multimodal Large Language Models" ★14 · Nov 11, 2024 · Updated last year
- Implementation of the algorithm detailed in paper "Evolutionary design of molecules based on deep learning and a genetic algorithm" ★24 · Dec 15, 2023 · Updated 2 years ago
- Axial Positional Embedding for Pytorch ★84 · Feb 25, 2025 · Updated last year
- Implementation of 💍 Ring Attention, from Liu et al. at Berkeley AI, in Pytorch ★548 · May 16, 2025 · Updated 10 months ago
- Implementation of the general framework for AMIE, from the paper "Towards Conversational Diagnostic AI", out of Google Deepmind ★74 · Sep 16, 2024 · Updated last year
- Explorations into the proposal from the paper "Grokfast, Accelerated Grokking by Amplifying Slow Gradients" ★103 · Dec 22, 2024 · Updated last year