dcarpintero / wikisearch
Multilingual Semantic Search with Reranking over a large, pre-vectorized dataset of 10 million Wikipedia documents. It supports dense retrieval, keyword search, and hybrid search.
☆13Updated last year
Alternatives and similar repositories for wikisearch
Users who are interested in wikisearch are comparing it to the repositories listed below.
- Contains Google Colab or Jupyter notebooks, as well as other associated files for my Medium blogposts.☆35Updated 11 months ago
- Code to extract Knowledge Graph from normal, unstructured text and visualize the resulting graph☆57Updated last year
- 💙 Unstructured Data Connectors for Haystack 2.0☆16Updated last year
- ☆19Updated last year
- A python package that provides a custom streamlit connection to query data from weaviate, the AI native vector database☆55Updated 9 months ago
- Applying Evaluation Driven Development (EDD) to aid in the design decision of RAG pipelines☆31Updated last year
- Github repo for storing LlamaDatasets☆33Updated last year
- Use Grounding DINO, Segment Anything, and CLIP to label objects in images.☆31Updated last year
- Document Q&A on Wikipedia articles using LLMs☆76Updated last year
- LLM Agent that performs sentiment analysis of drawings and natural language using a combination of Google Gemini Vision model and GPT-4 T…☆13Updated last year
- Neo4j Extensions and Integrations with Vertex AI and LangChain☆26Updated last month
- Question Answer Generation App using Mistral 7B, Langchain, and FastAPI.☆65Updated last year
- LlamaIndex Notebooks☆26Updated last year
- Label your images using GPT-4!☆18Updated last year
- API to load and query documents using RAG☆15Updated last year
- Explore the use of DSPy for extracting features from PDFs 🔎☆39Updated last year
- Mistral + Haystack: build RAG pipelines that rock 🤘☆103Updated last year
- ☆22Updated last year
- Multimodal AI App using Llava 7B and Gradio.☆38Updated last year
- Your Chatbot Mastery: build a super small custom AI assistant with Gradio_client Python and Streamlit - Chapter 1☆16Updated last year
- Streamlit application that helps users analyze RFPs using the latest Gemini 2.0 Flash Experimental LLM.☆13Updated 4 months ago
- Open source libraries and APIs to build custom preprocessing pipelines for labeling, training, or production machine learning pipelines.☆27Updated 2 years ago
- Winning Hackathon entry for Streamlit LLM Hackathon October 2023☆15Updated last year
- Experimenting with the text-embeddings-inference server on both CPU and GPU☆18Updated last year
- Visualization for a Retrieval-Augmented Generation (RAG) Assistant 🤖❤️📚☆190Updated 5 months ago
- Resources for exploring Generative Feedback Loops with Weaviate!☆37Updated 3 weeks ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks.☆47Updated 8 months ago
- Building a Chain of Thought RAG Model with DSPy, Qdrant and Ollama☆32Updated last year
- Fine-tuning ModernBERT-embed-base on synthetic domain-specific data to improve results on unseen queries☆29Updated 3 months ago
- Chat with Documents from scratch using LLMs and a vector database☆18Updated last year