amazon-science/synthesizrr


Synthesizing realistic and diverse text-datasets from augmented LLMs

☆19

Alternatives and similar repositories for synthesizrr

Users who are interested in synthesizrr are comparing it to the libraries listed below.

alycialee / beyond-scale-language-data-diversity
☆13 · Jul 22, 2026 · Updated last week
Genius1237 / TyDiP
TyDiP Multilingual Politeness dataset and code
☆12 · Oct 15, 2023 · Updated 2 years ago
causalNLP / amr_llm
This repo explores how AMR to address tasks difficult for LLMs
☆13 · Jan 15, 2024 · Updated 2 years ago
StefanHeng / ProgGen
Code for paper "ProgGen: Generating Named Entity Recognition Datasets Step-by-step with Self-Reflexive Large Language Models"
☆17 · Mar 29, 2024 · Updated 2 years ago
aasthavar / finetune-evaluate-codestral
Different approaches for finetuning, evaluating, optimizations for code generation model - codestral
☆11 · Jun 18, 2024 · Updated 2 years ago
chun19920827 / corpus
Chinese medical corpus
☆15 · Jul 2, 2021 · Updated 5 years ago
Linzwcs / AFT
☆13 · Jan 22, 2025 · Updated last year
NLie2 / what_features_jailbreak_LLMs
☆18 · Mar 30, 2025 · Updated last year
prrao87 / fine-grained-sentiment-app
A Flask LIME explainer app for fine-grained sentiment classification.
☆12 · May 1, 2023 · Updated 3 years ago
aryopg / decore
Official Implementation of "DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucination"
☆30 · Dec 18, 2024 · Updated last year
rkishony / data-to-paper-supplementary
☆12 · Aug 21, 2024 · Updated last year
mixedbread-ai / ofen
WIP: Ofen is a toolkit aimed at making transformer models production-ready. API included
☆17 · Oct 2, 2024 · Updated last year
linkedin / ControlLLM
Control LLM
☆23 · Apr 6, 2025 · Updated last year
janphilippfranken / sami
Self-Supervised Alignment with Mutual Information
☆20 · May 24, 2024 · Updated 2 years ago
cmu-ci-lab / dual_pixel_defocus_estimation_deblurring
☆32 · Jan 20, 2022 · Updated 4 years ago
LCM-Lab / LOGO
Code for paper: Long cOntext aliGnment via efficient preference Optimization
☆26 · Oct 10, 2025 · Updated 9 months ago
ethanjperez / pytorch-pretrained-BERT
📖The Big-&-Extending-Repository-of-Transformers: Pretrained PyTorch models for Google's BERT, OpenAI GPT & GPT-2, Google/CMU Transformer…
☆10 · Dec 4, 2020 · Updated 5 years ago
andrew-templeton / cfn-lex-bot
Custom::LexBot | AWS CloudFormation Custom Lambda Resource | Lex Bot
☆10 · Jan 13, 2021 · Updated 5 years ago
trapoom555 / Language-Model-STS-CFT
Improving Text Embedding of Language Models Using Contrastive Fine-tuning
☆64 · Aug 2, 2024 · Updated last year
quanshr / AugCon
[AAAI 2025] Automatically Generating Numerous Context-Driven SFT Data for LLMs across Diverse Granularity
☆30 · Mar 17, 2025 · Updated last year
PositionalHidden / PositionalHidden
To mitigate position bias in LLMs, especially in long-context scenarios, we scale only one dimension of LLMs, reducing position bias and …
☆12 · Jun 18, 2024 · Updated 2 years ago
amazon-science / controllable-readability-summarization
Generating Summaries with Controllable Readability Levels (EMNLP 2023)
☆15 · Jul 2, 2026 · Updated 3 weeks ago
PKU-TANGENT / ConFiguRe
Dataset and baseline for Coling 2022 long paper (oral): "ConFiguRe: Exploring Discourse-level Chinese Figures of Speech"
☆12 · Jul 27, 2023 · Updated 3 years ago
aws-samples / aws-cost-control-approval-workflow
SAM project that deploys sample aws cost control approval workflow
☆11 · Sep 26, 2023 · Updated 2 years ago
nicolay-r / bulk-chain
A no-string API framework for deploying schema-based reasoning into third-party apps
☆23 · Jul 21, 2026 · Updated last week
pkunlp-icler / MLS
Source code of our paper "Focus on the Target’s Vocabulary: Masked Label Smoothing for Machine Translation" @ ACL 2022
☆13 · Apr 13, 2022 · Updated 4 years ago
syncdoth / Chain-of-Hindsight-PyTorch
Unofficial implementation of Chain of Hindsight (https://arxiv.org/abs/2302.02676) using pytorch and huggingface Trainers.
☆11 · Apr 5, 2023 · Updated 3 years ago
M3-IT / YING-VLM
Vision Large Language Models trained on M3IT instruction tuning dataset
☆17 · Aug 16, 2023 · Updated 2 years ago
aws-samples / prompt-engineering-playground-with-sagemaker
Utility which provides a UI to do prompt engineering within SageMaker Studio.
☆14 · Jul 5, 2023 · Updated 3 years ago
mustaszewski / europarl-extract
☆20 · Jan 10, 2019 · Updated 7 years ago
momo-journey / CDial-GPT-NEZHA
PyTorch version of Chinese multi-turn dialogue (CDial) based on GPT + NEZHA
☆11 · Oct 22, 2022 · Updated 3 years ago
EricLee8 / MPD_EMVI
Official implementation of our paper at ACL 2023: Pre-training Multi-party Dialogue Models with Latent Discourse Inference
☆10 · Jul 10, 2023 · Updated 3 years ago
chenllliang / ParetoMNMT
Source code for paper "On the Pareto Front of Multilingual Neural Machine Translation" @ NeurIPS 2023
☆17 · Sep 27, 2023 · Updated 2 years ago
lifan-yuan / FactMix
Code for COLING 2022 paper "FactMix: Using a Few Labeled In-domain Examples to Generalize to Cross-domain Named Entity Recognition"
☆15 · Jan 15, 2023 · Updated 3 years ago
jasonppy / word-discovery
Word Discovery in Visually Grounded, Self-Supervised Speech Models
☆27 · Dec 4, 2023 · Updated 2 years ago
envfluids / py2d
Python 2D Navier-Stokes solver
☆31 · Aug 4, 2025 · Updated 11 months ago
lpq29743 / HAN-PL
A Pytorch implementation for "Hierarchical Attention Network with Pairwise Loss for Chinese Zero Pronoun Resolution" (AAAI 2020).
☆10 · Dec 10, 2020 · Updated 5 years ago
Social-AI-Studio / MemeCraft
Official repository for WWW'24 paper "MemeCraft: Contextual and Stance-Driven Multimodal Meme Generation"
☆12 · Jul 25, 2024 · Updated 2 years ago
tencent-ailab / OASum
☆15 · Oct 20, 2023 · Updated 2 years ago