LLaDA implementation
☆19 · Jul 24, 2025 · Updated 8 months ago
Alternatives and similar repositories for LLaDA_Arithmetic
Users that are interested in LLaDA_Arithmetic are comparing it to the libraries listed below.
- Asymmetric Multi-Task Learning code. If you want to use it, please let me know and cite the AMTL paper. ☆11 · Aug 3, 2016 · Updated 9 years ago
- ☆16 · Jul 23, 2024 · Updated last year
- Source code for SWIFT, an efficient reward model. ☆20 · Jan 13, 2026 · Updated 3 months ago
- dynamic planning, hybrid models, hierarchical active inference, tool use ☆15 · Jun 13, 2025 · Updated 10 months ago
- Unofficial implementation of the Ask-LLM paper 'How to Train Data-Efficient LLMs', arXiv:2402.09668. ☆12 · Jun 19, 2024 · Updated last year
- ☆30 · Dec 19, 2025 · Updated 3 months ago
- [ICLR 2026] Official Implementation of ProxyThinker: Test-Time Guidance through Small Visual Reasoners. ☆21 · Sep 24, 2025 · Updated 6 months ago
- Code for experiments on transformers using Markovian data. ☆22 · Nov 22, 2024 · Updated last year
- [ACL 2023] Contextual Distortion Reveals Constituency: Mask Language Models are Implicit Parsers. ☆14 · Jun 3, 2023 · Updated 2 years ago
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs ☆33 · Dec 24, 2025 · Updated 3 months ago
- Magic mirror, magic mirror, the all-knowing magic mirror [-_-] (not really) ☆13 · Jun 10, 2021 · Updated 4 years ago
- [CVPR 2025] Decision SpikeFormer: Spike-Driven Transformer for Decision Making ☆19 · Aug 8, 2025 · Updated 8 months ago
- ☆13 · Nov 18, 2025 · Updated 4 months ago
- Self-reproduction code for the paper "Reducing Transformer Key-Value Cache Size with Cross-Layer Attention" (MIT CSAIL) ☆17 · May 24, 2024 · Updated last year
- This repository contains an experimental PyTorch implementation exploring the NoProp algorithm, presented in the paper "NOPROP: TRAINING … ☆16 · Mar 8, 2026 · Updated last month
- Generative Modeling via Drifting in MLX ☆42 · Feb 6, 2026 · Updated 2 months ago
- PyTorch code and models for ScaLR image-to-lidar distillation method ☆62 · Jul 8, 2025 · Updated 9 months ago
- Official Implementation of Geo2Vec oral presented @ [AAAI '2026] ☆32 · Apr 2, 2026 · Updated last week
- [ICRA2025] This is the official implementation of the annealed Winner-Takes-All loss in <Annealed Winner-Takes-All for Motion Forecasting>. ☆23 · Mar 5, 2025 · Updated last year
- Unbalanced Optimal Transport: A Unified Framework for Object Detection ☆22 · Jan 14, 2025 · Updated last year
- A PyTorch native platform for training generative AI models ☆16 · Nov 18, 2025 · Updated 4 months ago
- VGA LCD Core (OpenCores) ☆15 · May 22, 2018 · Updated 7 years ago
- Light-weight real-time multi-object detection and tracking on Nvidia TX2 ☆10 · May 10, 2019 · Updated 6 years ago
- ☆12 · Feb 19, 2024 · Updated 2 years ago
- Python toolkit for document information extraction using LMDX ☆13 · Oct 15, 2023 · Updated 2 years ago
- a simple web of data visualization ☆11 · Feb 18, 2023 · Updated 3 years ago
- [CCS-LAMPS'24] LLM IP Protection Against Model Merging ☆16 · Oct 14, 2024 · Updated last year
- Code for our paper: Online Variational Filtering and Parameter Learning ☆21 · Dec 8, 2021 · Updated 4 years ago
- A custom Color Picker widget for PyQt5/PyQt6 applications. ☆17 · May 22, 2021 · Updated 4 years ago
- A C/C++ header file that converts Intel SSE intrinsics to MIPS/MIPS64 MSA intrinsics. ☆10 · Nov 16, 2021 · Updated 4 years ago
- ☆19 · Mar 12, 2025 · Updated last year
- ☆16 · Jan 7, 2025 · Updated last year
- ☆14 · Sep 14, 2021 · Updated 4 years ago
- A utility to be able to launch jobs on JZ with reasonable defaults ☆19 · Apr 13, 2023 · Updated 3 years ago
- This repository contains a curated list of resources related to World Models for Autonomous Driving (WMAD), based on the survey. ☆30 · Oct 10, 2025 · Updated 6 months ago
- Personal Trajectory Tree / My Previous Blog ☆17 · Mar 14, 2020 · Updated 6 years ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity". ☆30 · Nov 12, 2024 · Updated last year
- Generate sentences from a probabilistic context-free grammar. ☆17 · Nov 8, 2024 · Updated last year
- This is the implementation of the paper "Pre-training Time Series Models with Stock Data Customization" ☆41 · May 30, 2025 · Updated 10 months ago