DunZhang / Stella
☆43 Updated 4 months ago
Related projects
Alternatives and complementary repositories for Stella
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆68 Updated last month
- Codebase accompanying the Summary of a Haystack paper. ☆72 Updated 2 months ago
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆61 Updated 4 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆129 Updated this week
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆124 Updated 3 weeks ago
- Official implementation for 'Extending LLMs’ Context Window with 100 Samples' ☆74 Updated 10 months ago
- minimal pytorch implementation of bm25 (with sparse tensors) ☆90 Updated 8 months ago
- Codes and datasets for the paper Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Ref… ☆23 Updated last month
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆115 Updated last week
- Finetune mistral-7b-instruct for sentence embeddings ☆71 Updated 6 months ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated last month
- 🚢 Data Toolkit for Sailor Language Models ☆82 Updated 4 months ago
- ☆42 Updated 4 months ago
- ☆56 Updated 9 months ago
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆122 Updated 8 months ago
- [IJCAI 2024] FactCHD: Benchmarking Fact-Conflicting Hallucination Detection ☆81 Updated 6 months ago
- Data preparation code for CrystalCoder 7B LLM ☆42 Updated 6 months ago
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆106 Updated last month
- XTR: Rethinking the Role of Token Retrieval in Multi-Vector Retrieval ☆37 Updated 5 months ago
- Leveraging passage embeddings for efficient listwise reranking with large language models. ☆33 Updated last month
- ☆43 Updated last month
- ☆69 Updated last year
- The code for the paper: "Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models" ☆49 Updated 4 months ago
- Retrieval-Augmented Generation battle! ☆43 Updated last week
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ☆58 Updated 5 months ago
- ☆22 Updated 2 weeks ago
- Simple replication of [ColBERT-v1](https://arxiv.org/abs/2004.12832). ☆77 Updated 8 months ago
- minimal LLM scripts for 24GB VRAM GPUs. training, inference, whatever ☆33 Updated this week
- ☆14 Updated 2 weeks ago