ablghtianyi/ICL_Modular_Arithmetic

☆19

Alternatives and similar repositories for ICL_Modular_Arithmetic

Users who are interested in ICL_Modular_Arithmetic are comparing it to the repositories listed below.

liunian-harold-li / scotd
☆16 · Apr 15, 2024 · Updated 2 years ago
hrlics / LITE
[COLM 2024] LITE: Modeling Environmental Ecosystems with Multimodal Large Language Models
☆14 · Jan 4, 2025 · Updated last year
sail-sg / ActivePRM
☆21 · Apr 16, 2025 · Updated last year
OSU-NLP-Group / GrokkedTransformer
Code for NeurIPS'24 paper 'Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization'
☆240 · Jul 19, 2025 · Updated last year
SeanLeng1 / Reward-Calibration
☆21 · Dec 14, 2024 · Updated last year
Jiachen-T-Wang / GREATS
☆20 · Jun 27, 2026 · Updated 3 weeks ago
katiekang1998 / reasoning_generalization
☆33 · Jan 7, 2025 · Updated last year
princeton-pli / what-makes-good-rm
[NeurIPS 2025] What Makes a Reward Model a Good Teacher? An Optimization Perspective
☆44 · Sep 18, 2025 · Updated 10 months ago
ZhentingWang / DUMP
☆33 · May 9, 2025 · Updated last year
alex-damian / EOS
☆15 · Sep 29, 2022 · Updated 3 years ago
mcleish7 / arithmetic
Code to reproduce "Transformers Can Do Arithmetic with the Right Embeddings", McLeish et al (NeurIPS 2024)
☆200 · May 28, 2024 · Updated 2 years ago
ZeroWeight / NeuralTS
☆17 · Jul 6, 2023 · Updated 3 years ago
TonyXiChen / OASR
☆20 · Jun 2, 2026 · Updated last month
NetPharMedGroup / publication_fingerprint
Code for Zagidullin et al 2021 "Comparative analysis of molecular fingerprints in prediction of drug combination effects"
☆17 · Aug 1, 2022 · Updated 3 years ago
hanggao-gh / InteractiveMemorySharingLLM
☆22 · Oct 12, 2024 · Updated last year
Leooyii / LCEG
[COLM'25] A Controlled Study on Long Context Extension and Generalization in LLMs
☆65 · Mar 9, 2026 · Updated 4 months ago
yiqingxyq / RepoST
Code for "[COLM'25] RepoST: Scalable Repository-Level Coding Environment Construction with Sandbox Testing"
☆24 · Mar 18, 2025 · Updated last year
WANGXinyiLinda / planning_tokens
Official code for Guiding Language Model Math Reasoning with Planning Tokens
☆19 · Feb 29, 2024 · Updated 2 years ago
William-wAng618 / M2PT
Official repo of M$^2$PT: Multimodal Prompt Tuning for Zero-shot Instruction Learning
☆29 · Mar 23, 2025 · Updated last year
haolunc / iGSM-Replication-physics-LLM
This repository contains the replication of the iGSM dataset generation process from the Physics of LLM paper by Zeyuan Zhu.
☆17 · Sep 13, 2024 · Updated last year
neelnanda-io / Crosscoders
☆60 · Nov 19, 2024 · Updated last year
automl / is_mamba_capable_of_icl
☆18 · Apr 24, 2024 · Updated 2 years ago
berlino / seq_icl
☆54 · May 20, 2024 · Updated 2 years ago
bdusell / stack-attention
Code for the paper "Stack Attention: Improving the Ability of Transformers to Model Hierarchical Patterns"
☆18 · Mar 15, 2024 · Updated 2 years ago
fangyuan-ksgk / selective-attention-transformer
Unofficial Implementation of Selective Attention Transformer
☆20 · Oct 31, 2024 · Updated last year
cychomatica / FreeDave
Free Draft-and-Verification: Toward Lossless Parallel Decoding for Diffusion Large Language Models
☆23 · May 19, 2026 · Updated 2 months ago
akashmondal1810 / UncertaintyEstimation
Uncertainty Estimation Using Deep Neural Network and Gradient Boosting Methods
☆22 · Jun 1, 2021 · Updated 5 years ago
WangWenhao0716 / PDF-Embedding
[NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"
☆18 · Oct 1, 2024 · Updated last year
Luckfort / CD
[COLING'25] Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers?
☆82 · Jan 22, 2025 · Updated last year
microsoft / SparseMixer
Sparse Backpropagation for Mixture-of-Expert Training
☆30 · Jul 2, 2024 · Updated 2 years ago
webis-de / set-encoder
Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders
☆19 · May 23, 2025 · Updated last year
BY571 / SCoRe
SCoRe: Training Language Models to Self-Correct via Reinforcement Learning
☆16 · May 14, 2026 · Updated 2 months ago
revelio-diffusion / revelio
☆26 · Jun 29, 2025 · Updated last year
locuslab / llava-token-compression
☆47 · Nov 8, 2024 · Updated last year
MingyuJ666 / sparsityLLM
[preprint] sparsity
☆20 · May 6, 2026 · Updated 2 months ago
GuangyanS / Sys2-LLaVA
☆31 · Feb 10, 2025 · Updated last year
dmis-lab / Monet
[ICLR 2025] Monet: Mixture of Monosemantic Experts for Transformers
☆79 · Jun 23, 2025 · Updated last year
Z1zs / MMNeuron
Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our co…
☆26 · Dec 20, 2024 · Updated last year
Bond1995 / Markov
Code for experiments on transformers using Markovian data.
☆22 · Nov 22, 2024 · Updated last year