ezyang / ai-blindspots

Blindspots in LLMs I've noticed while AI coding. Sonnet family emphasis.

☆13

Alternatives and similar repositories for ai-blindspots

Users who are interested in ai-blindspots are comparing it to the repositories listed below.

codelion / pts
Pivotal Token Search
☆109 Updated this week
blackhole89 / autopen
Editor with LLM generation tree exploration
☆71 Updated 5 months ago
JD-P / RetroInstruct
Synthetic data derived by templating, few shot prompting, transformations on public domain corpora, and monte carlo tree search.
☆32 Updated 4 months ago
slashml / awesome-finetuning
☆28 Updated 10 months ago
doomslide / autoloom
Approximating the joint distribution of language models via MCTS
☆21 Updated 8 months ago
egozverev / Should-It-Be-Executed-Or-Processed
Accompanying code and SEP dataset for the "Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?" paper.
☆54 Updated 4 months ago
lechmazur / divergent
LLM Divergent Thinking Creativity Benchmark. LLMs generate 25 unique words that start with a given letter with no connections to each oth…
☆31 Updated 3 months ago
silphendio / sliced_llama
Simple LLM inference server
☆20 Updated last year
lechmazur / nyt-connections
Benchmark that evaluates LLMs using 651 NYT Connections puzzles extended with extra trick words
☆130 Updated this week
kubernetes-bad / reward-composer
Lego for GRPO
☆28 Updated last month
av / klmbr
klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs
☆78 Updated 9 months ago
retab-dev / retab
The developer starter pack for document processing
☆16 Updated this week
xjdr-alt / llmri
look how they massacred my boy
☆63 Updated 9 months ago
simonw / llm-command-r
Access the Cohere Command R family of models
☆37 Updated 3 months ago
kyegomez / Exa
Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…
☆26 Updated 8 months ago
dnbt777 / RL-Experiments
☆38 Updated 4 months ago
smolorg / smoltropix
MLX port for xjdr's entropix sampler (mimics jax implementation)
☆64 Updated 8 months ago
socketteer / loomsidian
a socketteer/loom reimplementation in obsidian
☆17 Updated last year
nexusflowai / NexusBench
Nexusflow function call, tool use, and agent benchmarks.
☆25 Updated 7 months ago
MNoorFawi / curlora
The code repository for the CURLoRA research paper. Stable LLM continual fine-tuning and catastrophic forgetting mitigation.
☆47 Updated 10 months ago
charlesfrye / cuda-substrings
Because it's there.
☆16 Updated 9 months ago
serp-ai / Parameter-Efficient-MoE
Parameter-Efficient Sparsity Crafting From Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
☆31 Updated last year
vgel / logitloom
explore token trajectory trees on instruct and base models
☆134 Updated last month
SpellcraftAI / turing
Turing machines, Rule 110, and A::B reversal using Claude 3 Opus.
☆58 Updated last year
Nero10578 / LLM-Inference-Benchmark
☆14 Updated 10 months ago
reka-ai / rekaquant
☆49 Updated last week
valine / training-hot-swap
Pytorch script hot swap: Change code without unloading your LLM from VRAM
☆126 Updated 2 months ago
xjdr-alt / muzero_sketch
☆38 Updated 11 months ago
ScalingIntelligence / good-kernels
Samples of good AI generated CUDA kernels
☆84 Updated last month
nisten / grokadamw
new optimizer
☆20 Updated 11 months ago