cosmoquester / memoria
Memoria is a human-inspired memory architecture for neural networks.
☆55 · Updated last week
Related projects:
- Repository for the paper "Stream of Search: Learning to Search in Language" · ☆70 · Updated last month
- Generate high-quality textual or multi-modal datasets with agents · ☆17 · Updated last year
- The simplest, fastest repository for training/finetuning medium-sized xLSTMs · ☆38 · Updated 3 months ago
- Official code for the ACL 2023 (short, findings) paper "Recursion of Thought: A Divide and Conquer Approach to Multi-Context Reasoning with L…" · ☆41 · Updated last year
- Code repository for the c-BTM paper · ☆105 · Updated 11 months ago
- A repository for research on medium-sized language models · ☆71 · Updated 3 months ago
- The GeoV model is a large language model designed by Georges Harik; it uses Rotary Positional Embeddings with Relative distances (RoPER)… · ☆122 · Updated last year
- The Next Generation Multi-Modality Superintelligence · ☆69 · Updated 2 weeks ago
- Mixing Language Models with Self-Verification and Meta-Verification · ☆96 · Updated 10 months ago
- Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks · ☆31 · Updated 4 months ago
- MiniHF is an inference, human-preference data collection, and fine-tuning tool for local language models. It is intended to help the user… · ☆143 · Updated this week
- Implementation of the paper "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?" · ☆30 · Updated last month
- Intelligent Go-Explore: Standing on the Shoulders of Giant Foundation Models · ☆41 · Updated 3 months ago
- Official repository for the paper "Approximating Two-Layer Feedforward Networks for Efficient Transformers" · ☆34 · Updated 10 months ago
- An all-new language model that processes ultra-long sequences of 100,000+, ultra-fast · ☆131 · Updated 2 weeks ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models · ☆37 · Updated 3 months ago
- Modeling code for a BitNet b1.58 Llama-style model · ☆22 · Updated 4 months ago
- TART: A plug-and-play Transformer module for task-agnostic reasoning · ☆188 · Updated last year
- Demonstration that finetuning a RoPE model on longer sequences than it was pre-trained on extends the model's context limit · ☆62 · Updated last year
- Finetune any model on HF in less than 30 seconds · ☆56 · Updated last week