Lizn-zn / Nesy-Programming
☆9 Updated 8 months ago
Alternatives and similar repositories for Nesy-Programming
Users who are interested in Nesy-Programming compare it to the repositories listed below
- Official implementation of the transformer (TF) architecture suggested in a paper entitled "Looped Transformers as Programmable Computers… ☆27 Updated 2 years ago
- [TMLR 2024] G4SATBench: Benchmarking and Advancing SAT Solving with Graph Neural Networks ☆35 Updated last year
- [ICML 2023] "Data Efficient Neural Scaling Law via Model Reusing" by Peihao Wang, Rameswar Panda, Zhangyang Wang ☆14 Updated last year
- Awesome Triton Resources ☆32 Updated 2 months ago
- Code for our ACL '23 paper titled "Grokking of Hierarchical Structure in Vanilla Transformers" ☆21 Updated last year
- Agentic Benchmark for LLM-Crafted Heuristics in Combinatorial Optimization ☆27 Updated 3 weeks ago
- 32 times longer context window than vanilla Transformers and up to 4 times longer than memory efficient Transformers. ☆48 Updated 2 years ago
- ☆19 Updated 3 months ago
- This is the official repository for all the code of TheoremLlama ☆43 Updated 9 months ago
- [CoLM 24] Official Repository of MambaByte: Token-free Selective State Space Model ☆22 Updated 9 months ago
- ☆12 Updated last year
- [ICML 2024] When Linear Attention Meets Autoregressive Decoding: Towards More Effective and Efficient Linearized Large Language Models ☆31 Updated last year
- Stick-breaking attention ☆58 Updated 2 weeks ago
- A list of awesome neural symbolic papers. ☆47 Updated 2 years ago
- SCoRe: Training Language Models to Self-Correct via Reinforcement Learning ☆10 Updated 5 months ago
- Official code for the paper "Attention as a Hypernetwork" ☆40 Updated last year
- MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248 ☆55 Updated last year
- Code repository for the public reproduction of the language modelling experiments on "MatFormer: Nested Transformer for Elastic Inference… ☆24 Updated last year
- Parallelizing non-linear sequential models over the sequence length ☆52 Updated 3 weeks ago
- ACL 2023 ☆39 Updated 2 years ago
- ☆25 Updated 9 months ago
- Official Code Repository for the paper "Key-value memory in the brain" ☆27 Updated 4 months ago
- ☆27 Updated last year
- An efficient implementation of the NSA (Native Sparse Attention) kernel ☆90 Updated 3 weeks ago
- ☆21 Updated last month
- Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning ☆12 Updated 3 weeks ago
- Curse-of-memory phenomenon of RNNs in sequence modelling ☆19 Updated 2 months ago
- Can GPT-4 Perform Neural Architecture Search? ☆87 Updated 2 years ago
- The official implementation of "Self-play LLM Theorem Provers with Iterative Conjecturing and Proving" ☆98 Updated 3 months ago
- [ICLR 2025] Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better ☆15 Updated 5 months ago