spcl / CheckEmbedLinks
Official Implementation of "CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks"
☆21 · Updated 4 months ago
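For context, a minimal sketch of the verification-by-embedding idea behind CheckEmbed: sample several LLM answers to the same open-ended prompt, embed each whole answer, and use their mutual similarity as a confidence signal. This is an illustrative sketch only, not the repository's API; `toy_embed` and `agreement_score` are hypothetical names, and the bag-of-words embedding is a stand-in for a real embedding model.

```python
# Sketch of answer-level verification via embedding agreement (illustrative only).
import numpy as np


def toy_embed(text: str, dim: int = 256) -> np.ndarray:
    """Placeholder embedding using a bag-of-words hashing trick.
    A real setup would call a sentence/embedding model instead."""
    vec = np.zeros(dim)
    for tok in text.lower().split():
        vec[hash(tok) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec


def agreement_score(answers: list[str]) -> float:
    """Mean pairwise cosine similarity between embedded answers;
    higher agreement suggests the answers are mutually consistent."""
    embs = np.stack([toy_embed(a) for a in answers])
    sims = embs @ embs.T
    upper = sims[np.triu_indices(len(answers), k=1)]
    return float(upper.mean())


answers = [
    "The capital of France is Paris.",
    "Paris is France's capital city.",
    "The capital of France is Lyon.",
]
print(f"agreement score: {agreement_score(answers):.3f}")
```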
Alternatives and similar repositories for CheckEmbed
Users interested in CheckEmbed are comparing it to the repositories listed below.
- ☆40 · Updated 5 months ago
- ☆72 · Updated last year
- AskIt: Unified programming interface for working with LLMs (GPT-3.5, GPT-4, Gemini, Claude, Cohere, Llama 2) ☆79 · Updated 9 months ago
- ☆36 · Updated 3 weeks ago
- [NAACL 2025] Official Implementation of "HMT: Hierarchical Memory Transformer for Long Context Language Processing" ☆75 · Updated 3 months ago
- Compression for Foundation Models ☆35 · Updated 2 months ago
- Cascade Speculative Drafting ☆31 · Updated last year
- Code for paper "Analog Foundation Models" ☆27 · Updated last month
- ☆32 · Updated 8 months ago
- Beyond KV Caching: Shared Attention for Efficient LLMs ☆19 · Updated last year
- Utilities for constructing a large dataset of LLVM IR ☆23 · Updated 4 months ago
- Estimating hardware and cloud costs of LLMs and transformer projects ☆19 · Updated 4 months ago
- ☆11 · Updated 6 months ago
- RepoQA: Evaluating Long-Context Code Understanding ☆119 · Updated 11 months ago
- Source code for Activated LoRA ☆21 · Updated last week
- Mask-Enhanced Autoregressive Prediction: Pay Less Attention to Learn More ☆34 · Updated 5 months ago
- ☆19 · Updated 6 months ago
- MPI Code Generation through Domain-Specific Language Models ☆14 · Updated 11 months ago
- Library to interface Compilers and ML models for ML-Enabled Compiler Optimizations ☆18 · Updated last month
- Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination" ☆25 · Updated 10 months ago
- Information and artifacts for "LoRA Learns Less and Forgets Less" (TMLR, 2024) ☆16 · Updated last year
- FlexAttention w/ FlashAttention3 Support ☆27 · Updated last year
- [ICML 24 NGSM workshop] Associative Recurrent Memory Transformer implementation and scripts for training and evaluation ☆52 · Updated this week
- Layer-Condensed KV cache w/ 10 times larger batch size, fewer params and less computation. Dramatic speed up with better task performance… ☆155 · Updated 6 months ago
- ☆13 · Updated 6 months ago
- An extension of the GaLore paper to perform natural gradient descent in a low-rank subspace ☆17 · Updated 11 months ago
- [ICLR'25] Code for KaSA, an official implementation of "KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models" ☆20 · Updated 9 months ago
- PyTorch code for the paper "QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models" ☆24 · Updated 2 years ago
- ☆51 · Updated last year
- Tree Attention: Topology-aware Decoding for Long-Context Attention on GPU clusters ☆130 · Updated 10 months ago