dcarpintero / wikisearch
Multilingual semantic search with reranking over a large, pre-vectorized dataset of 10 million Wikipedia documents. It supports dense retrieval, keyword search, and hybrid search.
★13 · Updated last year
Alternatives and similar repositories for wikisearch:
Users who are interested in wikisearch are comparing it to the libraries listed below:
- Ingest PDFs into Weaviate ★33 · Updated 10 months ago
- Unstructured Data Connectors for Haystack 2.0 ★16 · Updated last year
- LLM chatbot with Retrieval Augmented Generation using LlamaIndex. It demonstrates how to implement chunking, indexing, and source citation. ★44 · Updated last year
- ★19 · Updated 6 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ★66 · Updated 5 months ago
- Split and analyze text files using langchain and streamlit ★48 · Updated 11 months ago
- A python package that provides a custom streamlit connection to query data from weaviate, the AI native vector database ★54 · Updated 8 months ago
- An explanation of how to create a superior RAG pipeline for complex PDFs using LlamaParse. We can extract text and tables from PDF and QA on… ★44 · Updated last year
- Mistral + Haystack: build RAG pipelines that rock ★103 · Updated last year
- Building a Chain of Thought RAG Model with DSPy, Qdrant and Ollama ★31 · Updated last year
- Streamlit app for recommending eval functions using prompt diffs ★27 · Updated last year
- ★20 · Updated last year
- streamlit component for image viewer ★9 · Updated last year
- A contracts clause summarization system using LLM and vector database ★16 · Updated 2 months ago
- A list of Haystack Integrations, maintained by the community or deepset. ★85 · Updated this week
- ★40 · Updated 2 weeks ago
- Using ChatGPT to build a Kedro ML pipeline and Streamlit frontend ★30 · Updated 2 years ago
- Benchmark study on LanceDB, an embedded vector DB, for full-text search and vector search ★24 · Updated last year
- Web App for generating synthetic data ★46 · Updated 7 months ago
- Create a knowledge graph out of unstructured legal text - use said knowledge graph in a graph augmented retrieval augmented generation pipe… ★43 · Updated 7 months ago
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines. ★27 · Updated 2 years ago
- Question Answer Generation App using Mistral 7B, Langchain, and FastAPI. ★65 · Updated last year
- Efficient few-shot learning with cross-encoders. ★51 · Updated last year
- GitHub repo for storing LlamaDatasets ★33 · Updated last year
- Explore the use of DSPy for extracting features from PDFs ★39 · Updated last year
- LLM Agent that performs sentiment analysis of drawings and natural language using a combination of Google Gemini Vision model and GPT-4 T… ★13 · Updated last year
- Adding NeMo Guardrails to a LlamaIndex RAG pipeline ★36 · Updated last year
- Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23) ★76 · Updated 2 months ago
- Paste Word, get Markdown ★16 · Updated 8 months ago
- AgenticSearch operates within an agentic workflow, utilizing Gemini 2.0 and an extensive tool registry to handle complex questions. By in… ★16 · Updated 3 months ago