mesolitica / llm-embedding
Finetune Malaysian LLM for Malaysian context embedding task.
☆20 Updated 9 months ago
Alternatives and similar repositories for llm-embedding:
Users who are interested in llm-embedding are comparing it to the libraries listed below.
- This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he… ☆31 Updated last year
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆46 Updated last year
- ☆15 Updated last year
- ☆19 Updated 3 months ago
- Implementation of the model: "Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models" in PyTorch ☆29 Updated last week
- ☆26 Updated 2 years ago
- Code for "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (https://arxiv.org/abs/2… ☆13 Updated last year
- [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training Signal Generators ☆24 Updated last year
- No Parameter Left Behind: How Distillation and Model Size Affect Zero-Shot Retrieval ☆28 Updated 2 years ago
- KETOD Knowledge-Enriched Task-Oriented Dialogue ☆32 Updated 2 years ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated last year
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆14 Updated last year
- Implementation of pQRNN in PyTorch ☆46 Updated 3 years ago
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 Updated 2 years ago
- The codebase for our ACL2023 paper: Did You Read the Instructions? Rethinking the Effectiveness of Task Definitions in Instruction Learni… ☆29 Updated last year
- Code for the paper "Mirostat: A Perplexity-Controlled Neural Text Decoding Algorithm" (https://arxiv.org/abs/2007.14966). ☆58 Updated 3 years ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 Updated 2 years ago
- ☆17 Updated 6 months ago
- PyTorch code for EMNLP 2021 paper: Don't be Contradicted with Anything! CI-ToD: Towards Benchmarking Consistency for Task-oriented Dialog… ☆27 Updated 3 years ago
- Code repo for "Model-Generated Pretraining Signals Improves Zero-Shot Generalization of Text-to-Text Transformers" (ACL 2023) ☆22 Updated last year
- ☆97 Updated 2 years ago
- Unifew: Unified Fewshot Learning Model ☆18 Updated 3 years ago
- Convenient Text-to-Text Training for Transformers ☆19 Updated 3 years ago
- Code for the paper "Getting the most out of your tokenizer for pre-training and domain adaptation" ☆15 Updated last year
- Code for the arXiv paper: "LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond" ☆59 Updated 3 weeks ago
- ☆34 Updated last year
- Data and code accompanying the paper "Intent Detection with WikiHow" ☆10 Updated 3 years ago
- CREL: Personal Entity, Concept, and Named Entity Linking in Conversations ☆10 Updated last year
- A Benchmark for Robust, Multi-evidence, Multi-answer Question Answering ☆16 Updated 2 years ago