mrsteyk / RWKV-LM-deepspeed
☆44Mar 29, 2023Updated 2 years ago
Alternatives and similar repositories for RWKV-LM-deepspeed
Users that are interested in RWKV-LM-deepspeed are comparing it to the libraries listed below
- Run ONNX RWKV-v4 models with GPU acceleration using DirectML [Windows], or just on CPU [Windows AND Linux]; Limited to 430M model at this…☆21Mar 16, 2023Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Aug 12, 2023Updated 2 years ago
- A simple REPL for Lean 4, returning information about errors and sorries.☆12Jun 19, 2023Updated 2 years ago
- tinygrad port of the RWKV large language model.☆45Mar 9, 2025Updated 11 months ago
- ☆13Jun 3, 2023Updated 2 years ago
- RWKV centralised docs for the community☆31Jan 17, 2026Updated 3 weeks ago
- 📖 — Notebooks related to RWKV☆58May 13, 2023Updated 2 years ago
- CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training☆32Jul 20, 2022Updated 3 years ago
- Plain-text declaration export for Lean 4☆22Feb 6, 2026Updated last week
- Project Ivory is a minimalist PHP forum, with a clean UI for minimalists.☆71Nov 7, 2018Updated 7 years ago
- The Basis Programming Language☆26Jul 28, 2019Updated 6 years ago
- This project aims to make RWKV Accessible to everyone using a Hugging Face like interface, while keeping it close to the R and D RWKV bra…☆65May 14, 2023Updated 2 years ago
- ☆16Jul 3, 2023Updated 2 years ago
- Documenting common pitfalls and footguns in Lean☆37Aug 26, 2025Updated 5 months ago
- Exploring finetuning public checkpoints on filtered 8K sequences on Pile☆116Mar 22, 2023Updated 2 years ago
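- Implements a Blip2RWKV+QFormer multimodal image-text dialogue model. Using the Two-Step Cognitive Psychology Prompt method, a model with only 3B parameters can exhibit human-like causal chains of thought. Benchmarked against image-text dialogue LLMs such as MiniGPT-4 and ImageBind, aiming to use less compute and fewer resources to…☆40Jul 17, 2023Updated 2 years ago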
- Let us make Psychohistory (as in Asimov) a reality, and accessible to everyone. Useful for LLM grounding and games / fiction / business /…☆40Apr 9, 2023Updated 2 years ago
- Generic interface for hooking up to any Interactive Theorem Prover (ITP) and collecting data for training ML models for AI in formal theo…☆18Jan 16, 2026Updated last month
- The accompanying code for "Simplifying and Understanding State Space Models with Diagonal Linear RNNs" (Ankit Gupta, Harsh Mehta, Jonatha…☆23Dec 30, 2022Updated 3 years ago
- Code for the paper "Decomposing the Enigma: Subgoal-based Demonstration Learning for Formal Theorem Proving"☆19May 25, 2023Updated 2 years ago
- JAX implementations of RWKV☆19Sep 26, 2023Updated 2 years ago
- A Structured Span Selector (NAACL 2022). A structured span selector with a WCFG for span selection tasks (coreference resolution, semanti…☆21Jul 11, 2022Updated 3 years ago
- Continuous batching and parallel acceleration for RWKV6☆22Jun 28, 2024Updated last year
- Posterior Control of Blackbox Generation☆23May 2, 2020Updated 5 years ago
- ChatGPT-like Web UI for RWKVstic☆100Apr 18, 2023Updated 2 years ago
- Anh - LAION's multilingual assistant datasets and models☆27Apr 5, 2023Updated 2 years ago
- ☆21Dec 19, 2024Updated last year
- LeelaZero + PhoenixGo's weights☆20Nov 13, 2018Updated 7 years ago
- Elm ports' wrapper for uncomplicated request-response-style communication☆29Apr 20, 2021Updated 4 years ago
- Modelling the new Lead-Copper apatite proposed room temperature superconductor☆32Aug 8, 2023Updated 2 years ago
- Gradio UI for RWKV LLM☆29Feb 21, 2023Updated 2 years ago
- A finetuning pipeline for instruct tuning Raven 14bn using QLORA 4bit and the Ditty finetuning library☆28Jun 5, 2024Updated last year
- Formalization of the Millennium Problems in Lean 4☆41Jan 16, 2026Updated last month
- Codes accompanying the paper "LaProp: a Better Way to Combine Momentum with Adaptive Gradient"☆29Jul 30, 2020Updated 5 years ago
- Script and instructions for fine-tuning a large RWKV model on your data, for the Alpaca dataset☆31Apr 2, 2023Updated 2 years ago
- ☆34Jul 21, 2024Updated last year
- The nanoGPT-style implementation of RWKV Language Model - an RNN with GPT-level LLM performance.☆198Nov 9, 2023Updated 2 years ago
- Language Modeling with the H3 State Space Model☆522Sep 29, 2023Updated 2 years ago
- [NeurIPS 2023] Sparse Modular Activation for Efficient Sequence Modeling☆40Dec 2, 2023Updated 2 years ago