philschmid / text-generation-inference-tests
★20 Updated last year
Alternatives and similar repositories for text-generation-inference-tests:
Users who are interested in text-generation-inference-tests are comparing it to the repositories listed below
- Writing Blog Posts with Generative Feedback Loops! ★47 Updated last year
- Explore the use of DSPy for extracting features from PDFs ★39 Updated last year
- ★51 Updated 3 months ago
- ★30 Updated 8 months ago
- Routing on Random Forest (RoRF) ★136 Updated 6 months ago
- ★24 Updated last year
- Check for data drift between two OpenAI multi-turn chat jsonl files. ★37 Updated 11 months ago
- Mixing Language Models with Self-Verification and Meta-Verification ★102 Updated 3 months ago
- ★48 Updated last year
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡ ★67 Updated 4 months ago
- ★76 Updated 9 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models ★104 Updated 3 months ago
- ★55 Updated 2 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ★49 Updated 8 months ago
- ★18 Updated 5 months ago
- ★48 Updated 4 months ago
- A specification for OpenInference, a semantic mapping of ML inferences ★46 Updated 11 months ago
- Creating Generative AI Apps which work ★17 Updated 8 months ago
- Self-host LLMs with vLLM and BentoML ★97 Updated this week
- ★27 Updated 4 months ago
- Using modal.com to process FineWeb-edu data ★20 Updated 3 weeks ago
- Automated testing and benchmarking for code generation agents. ★18 Updated last year
- Leverage your LangChain trace data for fine tuning ★41 Updated 7 months ago
- Vector Database with support for late interaction and token level embeddings. ★53 Updated 6 months ago
- ★34 Updated 8 months ago
- A framework for evaluating function calls made by LLMs ★37 Updated 8 months ago
- Code for evaluating with Flow-Judge-v0.1 - an open-source, lightweight (3.8B) language model optimized for LLM system evaluations. Crafte… ★64 Updated 5 months ago
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment ★55 Updated 7 months ago
- NeurIPS 2023 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer ★41 Updated last year
- SCIPE is a powerful tool for evaluating and diagnosing LLM (Large Language Model) graphs or chains. ★21 Updated 4 months ago