foundation-model-stack / fms-dgt
Synthetic Data Generation for Foundation Models
☆21 · Updated 7 months ago
Alternatives and similar repositories for fms-dgt
Users who are interested in fms-dgt are comparing it to the libraries listed below.
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆208 · Updated this week
- AI Steerability 360 toolkit ☆28 · Updated this week
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ☆564 · Updated last year
- Collection of evals for Inspect AI ☆233 · Updated this week
- In-Context Explainability 360 toolkit ☆28 · Updated this week
- ☆306 · Updated last year
- Efficient LLM inference on Slurm clusters using vLLM. ☆77 · Updated this week
- ☆47 · Updated 6 months ago
- ☆114 · Updated last year
- A benchmark that challenges language models to code solutions for scientific problems ☆141 · Updated last week
- A framework for few-shot evaluation of autoregressive language models. ☆105 · Updated 2 years ago
- A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources. ☆256 · Updated last month
- LM engine is a library for pretraining/finetuning LLMs ☆66 · Updated 2 weeks ago
- Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task. ☆148 · Updated 2 weeks ago
- Discovering Data-driven Hypotheses in the Wild ☆110 · Updated 3 months ago
- ☆359 · Updated this week
- ☆24 · Updated 3 months ago
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆224 · Updated 10 months ago
- [ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation". ☆256 · Updated 10 months ago
- Reproducible, flexible LLM evaluations ☆248 · Updated 2 months ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆138 · Updated last year
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆224 · Updated last year
- ☆129 · Updated 2 weeks ago
- Code release for "Debating with More Persuasive LLMs Leads to More Truthful Answers" ☆118 · Updated last year
- ControlArena is a collection of settings, model organisms and protocols - for running control experiments. ☆93 · Updated last week
- Evaluate your LLM's response with Prometheus and GPT4 💯 ☆989 · Updated 4 months ago
- Official repository for "Scaling Retrieval-Based Language Models with a Trillion-Token Datastore". ☆216 · Updated last month
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ☆380 · Updated 5 months ago
- Open source project for data preparation for GenAI applications ☆800 · Updated this week
- Taxonomy tree that will allow you to create models tuned with your data ☆281 · Updated 2 weeks ago