foundation-model-stack / fms-dgt
Synthetic Data Generation for Foundation Models
☆21 · Updated last month
Alternatives and similar repositories for fms-dgt
Users interested in fms-dgt are comparing it to the libraries listed below.
- The AI Steerability 360 toolkit is an extensible library for general purpose steering of LLMs. ☆54 · Updated last month
- Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ☆212 · Updated this week
- ☆401 · Updated last week
- Collection of evals for Inspect AI ☆313 · Updated this week
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ☆586 · Updated last year
- ACPBench: Reasoning about Action, Change, and Planning ☆31 · Updated 2 weeks ago
- Large language model and dataset for natural language to first-order logic translation ☆73 · Updated 2 years ago
- ☆25 · Updated 6 months ago
- ☆321 · Updated last year
- What's In My Big Data (WIMBD) - a toolkit for analyzing large text datasets ☆224 · Updated last year
- Efficient LLM inference on Slurm clusters using vLLM. ☆86 · Updated last week
- Discovering Data-driven Hypotheses in the Wild ☆122 · Updated 6 months ago
- A benchmark that challenges language models to code solutions for scientific problems ☆161 · Updated this week
- TDD-Bench-Verified is a new benchmark for generating test cases for test-driven development (TDD) ☆25 · Updated 3 months ago
- Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task. ☆154 · Updated 3 months ago
- Mellea is a library for writing generative programs. ☆260 · Updated last week
- This repository contains the implementation for our EMNLP 2023 paper: HoneyBee: Progressive Instruction Finetuning of Large Language Mode… ☆30 · Updated last year
- ☆165 · Updated last year
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ☆147 · Updated last year
- ☆116 · Updated last year
- Aligning AI With Shared Human Values (ICLR 2021) ☆305 · Updated 2 years ago
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ☆232 · Updated last year
- ☆189 · Updated 5 months ago
- Run safety benchmarks against AI models and view detailed reports showing how well they performed. ☆114 · Updated this week
- A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources. ☆307 · Updated 2 months ago
- Interpreting the latent space representations of attention head outputs for LLMs ☆36 · Updated last year
- A framework for few-shot evaluation of autoregressive language models. ☆105 · Updated 2 years ago
- ☆242 · Updated last year
- AssetOpsBench - Industry 4.0 ☆571 · Updated last week
- ☆27 · Updated last year