foundation-model-stack / fms-dgt
Synthetic Data Generation for Foundation Models
★21 · Updated 5 months ago
Alternatives and similar repositories for fms-dgt
Users who are interested in fms-dgt are comparing it to the libraries listed below.
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data … ★206 · Updated this week
- ★291 · Updated this week
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models ★543 · Updated last year
- AssetOpsBench - Industry 4.0 ★121 · Updated this week
- Synthetic question-answering dataset to formally analyze the chain-of-thought output of large language models on a reasoning task. ★146 · Updated 9 months ago
- ✨ RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems - ICLR 2024 ★169 · Updated 11 months ago
- Large language model and dataset for natural language to first-order logic translation ★60 · Updated last year
- Collection of evals for Inspect AI ★178 · Updated this week
- CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (NeurIPS 2023) ★149 · Updated 11 months ago
- A curated list of papers related to constrained decoding of LLM, along with their relevant code and resources. ★235 · Updated this week
- ★297 · Updated last year
- [NeurIPS 2023 D&B] Code repository for InterCode benchmark https://arxiv.org/abs/2306.14898 ★222 · Updated last year
- Official repo for SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency ★35 · Updated 6 months ago
- ACPBench: Reasoning about Action, Change, and Planning ★24 · Updated 2 months ago
- A package to evaluate factuality of long-form generation. Original implementation of our EMNLP 2023 paper "FActScore: Fine-grained Atomic… ★359 · Updated 3 months ago
- Discovering Data-driven Hypotheses in the Wild ★99 · Updated last month
- This repository contains the implementation for our EMNLP 2023 paper: HoneyBee: Progressive Instruction Finetuning of Large Language Mode… ★27 · Updated last year
- [ICML 2023] Data and code release for the paper "DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation". ★250 · Updated 8 months ago
- A benchmark that challenges language models to code solutions for scientific problems ★127 · Updated last week
- ★110 · Updated last year
- Repository for "Detoxification with MaRCo: Controllable Revision with Experts and Anti-Experts" ★9 · Updated last year
- The project page for "LOGIC-LM: Empowering Large Language Models with Symbolic Solvers for Faithful Logical Reasoning" ★333 · Updated last year
- ★97 · Updated 2 weeks ago
- A specialized library for integrating context-free grammars (CFG) in EBNF with the Hugging Face Transformers ★121 · Updated 3 months ago
- ACL2023 - AlignScore, a metric for factual consistency evaluation. ★132 · Updated last year
- [ICLR 2024 Spotlight] FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets ★217 · Updated last year
- Code and data for "Lost in the Middle: How Language Models Use Long Contexts" ★351 · Updated last year
- This is the repository of HaluEval, a large-scale hallucination evaluation benchmark for Large Language Models. ★488 · Updated last year
- ★18 · Updated last month
- ★51 · Updated 2 years ago