☆51Jun 21, 2025Updated 8 months ago
Alternatives and similar repositories for cornstack
Users who are interested in cornstack are comparing it to the libraries listed below.
- ☆12Jul 31, 2025Updated 7 months ago
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval☆16Jan 16, 2024Updated 2 years ago
- CodeSage: Code Representation Learning At Scale (ICLR 2024)☆116Oct 27, 2024Updated last year
- The backup repository for FairytaleQA dataset and paper "Fantastic Questions and Where to Find Them: FairytaleQA – An Authentic Dataset f…☆10May 30, 2023Updated 2 years ago
- Implementation for Decision-focused Summarization (EMNLP2021)☆12Mar 14, 2022Updated 3 years ago
- BlockRank makes LLMs efficient and scalable for RAG and in-context ranking☆41Dec 12, 2025Updated 2 months ago
- Official implementation of "Data Mixture Inference: What do BPE tokenizers reveal about their training data?"☆18May 15, 2025Updated 9 months ago
- Evaluate state-of-the-art sparse embedding models on the LIMIT dataset (`limit-small` and `limit`) from Google's paper `On the Theoretica…☆15Sep 4, 2025Updated 5 months ago
- Training and Benchmarking LLMs for Code Preference.☆38Nov 15, 2024Updated last year
- ☆13Jun 6, 2022Updated 3 years ago
- [SIGIR 2025] The official repo for "Scaling Sparse and Dense Retrieval in Decoder-Only LLMs"☆20Mar 31, 2025Updated 10 months ago
- Code release for "TempLM: Distilling Language Models into Template-Based Generators"☆14Jul 21, 2022Updated 3 years ago
- DImensionality REduction in JAX☆25Nov 21, 2025Updated 3 months ago
- QRHead: Query-Focused Retrieval Heads Improve Long-Context Reasoning and Re-ranking☆35Jan 20, 2026Updated last month
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆76Oct 19, 2024Updated last year
- [COLM 2025] "C3PO: Critical-Layer, Core-Expert, Collaborative Pathway Optimization for Test-Time Expert Re-Mixing"☆20Apr 9, 2025Updated 10 months ago
- SPRINT Toolkit helps you evaluate diverse neural sparse models easily using a single click on any IR dataset.☆47Jul 25, 2023Updated 2 years ago
- FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions☆52Jul 3, 2024Updated last year
- Deep Weighted Averaging Classifiers☆23Feb 4, 2019Updated 7 years ago
- ☆19Oct 2, 2023Updated 2 years ago
- ☆21Sep 6, 2021Updated 4 years ago
- ☆59Jan 28, 2025Updated last year
- Prompting Large Language Models to Generate Dense and Sparse Representations for Zero-Shot Document Retrieval☆52Jan 6, 2026Updated last month
- Code release for Type-Aware Bi-Encoders for Open-Domain Entity Retrieval☆19Sep 24, 2022Updated 3 years ago
- A curated list of awesome human-centered AI resources.☆49Apr 14, 2022Updated 3 years ago
- A collection of models built with ColossalAI☆32Nov 22, 2022Updated 3 years ago
- ☆57Dec 27, 2025Updated 2 months ago
- [ICML2024 Spotlight] Fine-Tuning Pre-trained Large Language Models Sparsely☆24Jun 26, 2024Updated last year
- Enhancing AI Software Engineering with Repository-level Code Graph☆252Apr 1, 2025Updated 10 months ago
- This repo illustrates how to evaluate the artifacts in the paper An Extensive Study on Pre-trained Models for Program Understanding and G…☆27Aug 12, 2022Updated 3 years ago
- BYOeB is a tool to build a chatbot with a custom knowledge base and an expert-in-the-loop.☆38Nov 4, 2025Updated 3 months ago
- A hackable, simple, and research-friendly GRPO Training Framework with high-speed weight synchronization in a multinode environment.☆36Aug 27, 2025Updated 6 months ago
- Fast and versatile tokenizer for language models, compatible with SentencePiece, Tokenizers, Tiktoken and more. Supports BPE, Unigram and…☆44Oct 10, 2025Updated 4 months ago
- Checkpointable dataset utilities for foundation model training☆32Jan 29, 2024Updated 2 years ago
- Transform a corpus of text documents (any kind) into a map with different zoom levels and topics names to summarise sub corpus of similar…☆29Jan 1, 2024Updated 2 years ago
- Data and code for the paper "Future is not One-dimensional: Complex Event Schema Induction via Graph Modeling".☆29Apr 24, 2021Updated 4 years ago
- A massively multilingual modern encoder language model☆127Jan 20, 2026Updated last month
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning☆64Aug 2, 2024Updated last year
- [NeurIPS 2022] DreamShard: Generalizable Embedding Table Placement for Recommender Systems☆29Mar 24, 2023Updated 2 years ago