"Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?"
☆39Nov 13, 2024Updated last year
Alternatives and similar repositories for PoolingAndAttn
Users that are interested in PoolingAndAttn are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- The first dense retrieval model that can be prompted like an LM☆91May 8, 2025Updated 11 months ago
- Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)☆23Jun 2, 2025Updated 10 months ago
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory☆61Feb 10, 2025Updated last year
- ☆21Apr 3, 2026Updated 2 weeks ago
- LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture☆213Jan 6, 2025Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Official Code for paper "Towards Efficient and Effective Unlearning of Large Language Models for Recommendation" (Frontiers of Computer S…☆38Jul 19, 2024Updated last year
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork"☆33Nov 29, 2023Updated 2 years ago
- Simple program to manually caption your images (or any other file types) so you can use them for AI training☆37Mar 20, 2023Updated 3 years ago
- Co-LLM: Learning to Decode Collaboratively with Multiple Language Models☆126May 7, 2024Updated last year
- official repository for ListT5☆48Nov 27, 2025Updated 4 months ago
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings.☆92Jul 21, 2024Updated last year
- Code for Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking, EMNLP 2022, https://aclan…☆14Mar 30, 2026Updated 2 weeks ago
- Adversaial attack comparative assessment Large Language Model☆13May 21, 2025Updated 10 months ago
- XmodelLM☆38Nov 19, 2024Updated last year
- GPU virtual machines on DigitalOcean Gradient AI • AdGet to production fast with high-performance AMD and NVIDIA GPUs you can spin up in seconds. The definition of operational simplicity.
- Code repository for the paper "The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Le…☆13Jan 16, 2025Updated last year
- ☆21Oct 22, 2025Updated 5 months ago
- Mixture of Lora Experts☆10Apr 7, 2024Updated 2 years ago
- EnsLoss: Stochastic Calibrated Loss Ensembles for Preventing Overfitting in Classification☆34Nov 1, 2025Updated 5 months ago
- Official Pytorch Implementation of Self-emerging Token Labeling☆35Mar 27, 2024Updated 2 years ago
- ☆11Oct 8, 2023Updated 2 years ago
- Code for the paper "Closing the Curious Case of Neural Text Degeneration"☆12Apr 9, 2025Updated last year
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024)☆149Nov 9, 2024Updated last year
- ☆17Apr 9, 2025Updated last year
- Deploy open-source AI quickly and easily - Bonus Offer • AdRunpod Hub is built for open source. One-click deployment and autoscaling endpoints without provisioning your own infrastructure.
- Extended Inductive Reasoning for Personalized Preference Inference from Behavioral Signals☆11Jan 8, 2026Updated 3 months ago
- The dataset consists of public social media url pairs and the corresponding entailment label for an external conference (ACL 2021). Each …☆14Aug 16, 2021Updated 4 years ago
- [ACL 2025 Findings] Implicit Reasoning in Transformers is Reasoning through Shortcuts☆17Mar 11, 2025Updated last year
- Code for the paper "Cottention: Linear Transformers With Cosine Attention"☆20Nov 15, 2025Updated 5 months ago
- Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays☆20May 28, 2024Updated last year
- Danmuku dataset☆12Jul 7, 2023Updated 2 years ago
- ☆15Apr 6, 2026Updated last week
- [ACCV 2024 ] Official code for "DeBiFormer: Vision Transformer with Deformable Agent Bi-level Routing Attention"☆32Jan 8, 2025Updated last year
- Official implementation of the paper: "A deeper look at depth pruning of LLMs"☆15Jul 24, 2024Updated last year
- 1-Click AI Models by DigitalOcean Gradient • AdDeploy popular AI models on DigitalOcean Gradient GPU virtual machines with just a single click. Zero configuration with optimized deployments.
- ☆21Mar 26, 2025Updated last year
- Graph Diffusion Policy Optimization☆42Mar 17, 2024Updated 2 years ago
- AceParse: A Comprehensive Dataset with Diverse Structured Texts for Academic Literature Parsing☆44Sep 17, 2024Updated last year
- Code release for Type-Aware Bi-Encoders for Open-Domain Entity Retrieval☆19Sep 24, 2022Updated 3 years ago
- The open-source materials for paper "Sparsing Law: Towards Large Language Models with Greater Activation Sparsity".☆30Nov 12, 2024Updated last year
- DALI Multi Agent System Framework☆42Mar 24, 2026Updated 3 weeks ago
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"☆18Oct 1, 2024Updated last year