taylorai / galactic

data cleaning and curation for unstructured text

☆328

Alternatives and similar repositories for galactic

Users who are interested in galactic are comparing it to the libraries listed below.

arcee-ai / DALM
Domain Adapted Language Modeling Toolkit - E2E RAG
☆325Updated 8 months ago
KarelDO / xmc.dspy
In-Context Learning for eXtreme Multi-Label Classification (XMC) using only a handful of examples.
☆433Updated last year
redotvideo / pluto
Synthetic Data for LLM Fine-Tuning
☆120Updated last year
Muhtasham / summarization-eval
📝 Reference-Free automatic summarization evaluation with potential hallucination detection
☆101Updated last year
cohere-ai / DiskVectorIndex
☆210Updated last month
neuml / txtinstruct
📚 Datasets and models for instruction-tuning
☆238Updated last year
jxnl / n-levels-of-rag
☆195Updated last year
tigerlab-ai / tiger
Open Source LLM toolkit to build trustworthy LLM applications. TigerArmor (AI safety), TigerRAG (embedding, RAG), TigerTune (fine-tuning)
☆398Updated last year
FastEval / FastEval
Fast & more realistic evaluation of chat language models. Includes leaderboard.
☆187Updated last year
shroominic / funcchain
⛓️ build cognitive systems, pythonic
☆339Updated 8 months ago
Arize-ai / LLMTest_NeedleInAHaystack
Doing simple retrieval from LLM models at various context lengths to measure accuracy
☆102Updated last year
cohere-ai / BinaryVectorDB
Efficient vector database for hundreds of millions of embeddings.
☆207Updated last year
raphaelsty / neural-cherche
Neural Search
☆362Updated 4 months ago
migtissera / Sensei
Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI
☆222Updated last year
philschmid / easyllm
☆461Updated last year
Preemo-Inc / text-generation-inference
☆199Updated last year
VikParuchuri / textbook_quality
Generate textbook-quality synthetic LLM pretraining data
☆501Updated last year
SpellcraftAI / oaib
Use the OpenAI Batch tool to make async batch requests to the OpenAI API.
☆99Updated last year
AnswerDotAI / fastdata
☆154Updated 8 months ago
mixedbread-ai / baguetter
Baguetter is a flexible, efficient, and hackable search engine library implemented in Python. It's designed for quickly benchmarking, imp…
☆186Updated 11 months ago
zetaalphavector / RAGElo
RAGElo is a set of tools that helps you select the best RAG-based LLM agents by using an Elo ranker
☆114Updated 3 weeks ago
HazyResearch / evaporate
This repo contains data and code for the paper "Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Da…
☆489Updated last year
AblateIt / finetune-study
Comprehensive analysis of the performance differences between QLoRA, LoRA, and full fine-tunes.
☆82Updated last year
villagecomputing / superopenai
Logging and caching superpowers for the openai sdk
☆105Updated last year
advanced-stack / py-llm-core
A pythonic library providing a lightweight interface to LLMs
☆127Updated 2 months ago
MoritzLaurer / zeroshot-classifier
Notebooks for training universal 0-shot classifiers on many different tasks
☆133Updated 7 months ago
arthur-ai / bench
A tool for evaluating LLMs
☆423Updated last year
natolambert / blogcaster
Python tools for easily translating your blog content to podcasts & YouTube
☆206Updated 10 months ago
VikParuchuri / libgen_to_txt
Convert all of libgen to high quality markdown
☆253Updated last year
AymenKallala / RAG_Maestro
Building a chatbot powered by a RAG pipeline to read, summarize, and quote the most relevant papers related to the user query.
☆167Updated last year