website for MS Marco
☆35Mar 26, 2025Updated last year
Alternatives and similar repositories for msmarco
Users that are interested in msmarco are comparing it to the libraries listed below.
- C4RepSet: Representative Subset from C4 data for Training Pre-trained LMs☆11Jan 13, 2023Updated 3 years ago
- Dataset and code for three Web crawling-related papers from SIGIR-2019, NeurIPS-2019, and ICML-2020.☆42Jan 7, 2025Updated last year
- Twitter bot that tweets translated arXiv paper summaries☆10Dec 11, 2021Updated 4 years ago
- Demo of fine-tuning QA models for answering FAQ of cloud providers documentation☆11Mar 7, 2023Updated 3 years ago
- NIILC QA data☆18Nov 20, 2015Updated 10 years ago
- Website for the TREC Deep Learning Track 2019☆86Jun 12, 2023Updated 2 years ago
- bootstrap my zsh shell☆17Mar 28, 2026Updated 2 months ago
- Scripts for creating a Japanese-English parallel corpus and training NMT models☆18Nov 9, 2021Updated 4 years ago
- ☆34Feb 17, 2021Updated 5 years ago
- ☆10Dec 17, 2020Updated 5 years ago
- Code for the MTEB leaderboard☆31Feb 4, 2025Updated last year
- LightRAG with Neo4j Example Project☆18May 19, 2025Updated last year
- ☆17Jul 31, 2021Updated 4 years ago
- Tutorial and talk about the Reasonable Ontology Language at the Knowledge Graph Conference 2022.☆12May 9, 2023Updated 3 years ago
- A Multi-domain Benchmark for Personalized Search Evaluation☆12Sep 7, 2023Updated 2 years ago
- Fully local RAG setup: GPT4ALL, HuggingFace Embeddings model, FAISS, LangChain☆10May 10, 2023Updated 3 years ago
- This project aims to convert the content of GitHub repositories into a structured, machine-readable format, enabling AI models like ChatG…☆12May 13, 2024Updated 2 years ago
- Pretrained segmenter models for Portuguese legislative text.☆15Oct 13, 2024Updated last year
- ☆89Sep 13, 2023Updated 2 years ago
- ☆11Apr 21, 2025Updated last year
- ☆12Mar 24, 2026Updated 2 months ago
- pialign - A Phrasal ITG Aligner☆24Apr 29, 2019Updated 7 years ago
- Sentence Similarity Checker using Encoder Decoder Model☆21Dec 12, 2017Updated 8 years ago
- Implementation of the sacabench framework☆19May 15, 2021Updated 5 years ago
- ☆18Updated this week
- ☆19Oct 6, 2020Updated 5 years ago
- The QA datasets used for DrQA evaluation.☆14Nov 30, 2018Updated 7 years ago
- A module to easily integrate Clarity Analytics into your Nuxt 3 project.☆14Jun 2, 2026Updated last week
- Yet another dependency parser, integrated with tokenizer, tagger and visualization tool.☆11Mar 18, 2018Updated 8 years ago
- A simple toolkit to process TREC files in Python.☆174Aug 24, 2024Updated last year
- MS MARCO (Microsoft Machine Reading Comprehension) is a large scale dataset focused on machine reading comprehension, question answering, …☆343Jun 12, 2023Updated 2 years ago
- Folio Flat File to XML/HTML/Lucene conversion framework☆14Apr 22, 2025Updated last year
- A C++ library implementing fast language models estimation using the 1-Sort algorithm.☆16May 18, 2023Updated 3 years ago
- ☆13Jun 26, 2021Updated 4 years ago
- Playground and exercises for basic deep learning algorithms, based on the Stanford UFLDL deep learning tutorials.☆11Nov 16, 2014Updated 11 years ago
- KL3M training data collection and preprocessing☆22Apr 14, 2025Updated last year
- [ICLR 2026] Official PyTorch implementation for "ReFusion: A Diffusion Large Language Model with Parallel Autoregressive Decoding"☆63Dec 26, 2025Updated 5 months ago
- Word2vec Model Reader for Node.js Client☆13May 8, 2019Updated 7 years ago
- This repo contains the source code for the "Structured Logging in ASP.NET Core with Serilog" article on Code Maze☆13May 12, 2022Updated 4 years ago