zhliu0106/probing-lm-data

Official Implementation of "Probing Language Models for Pre-training Data Detection"

☆20

Alternatives and similar repositories for probing-lm-data

Users who are interested in probing-lm-data are comparing it to the repositories listed below.

zhliu0106 / learning-to-refuse
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
☆10 · Dec 13, 2024 · Updated last year
Spico197 / MoE-SFT
🍼 Official implementation of Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
☆41 · Sep 29, 2024 · Updated last year
Spico197 / writing-comrade
✒️ ChatGPT as a writing partner.
☆14 · Mar 6, 2023 · Updated 3 years ago
fairyshine / SUDA_TeX_Template
Soochow University graduate thesis template - Soochow University Thesis TeX Template
☆23 · Feb 27, 2026 · Updated 5 months ago
Spico197 / REx
🎮 A toolkit for Relation Extraction and more...
☆24 · May 8, 2025 · Updated last year
hhan1018 / NesTools
[COLING 2025] NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models
☆18 · Jan 18, 2025 · Updated last year
Spico197 / Mirror
🪞A powerful toolkit for almost all the Information Extraction tasks.
☆124 · Apr 21, 2025 · Updated last year
changmenseng / accept_prob
Calculate the probability of a paper being accepted by EMNLP2023 based on score distribution of ACL2023.
☆14 · Sep 7, 2023 · Updated 2 years ago
HillZhang1999 / RobustGEC
Code & Data for our Paper "RobustGEC: Robust Grammatical Error Correction Against Subtle Context Perturbation" (EMNLP 2023)
☆17 · Jan 23, 2024 · Updated 2 years ago
Spico197 / watchmen
😎 A simple and easy-to-use toolkit for GPU scheduling.
☆45 · May 12, 2025 · Updated last year
yanghh2000 / Alirector
Source code of paper "Alirector: Alignment-Enhanced Chinese Grammatical Error Corrector" (Findings of ACL 2024)
☆14 · Mar 19, 2025 · Updated last year
zsLin177 / SRL-as-GP
☆18 · Mar 10, 2023 · Updated 3 years ago
Jacob-Zhou / gecdi
The repo of "Improving Seq2Seq Grammatical Error Correction via Decoding Interventions"
☆32 · Jan 22, 2024 · Updated 2 years ago
yzhangcs / ctc-copy
[EMNLP'23] Code for "Non-autoregressive Text Editing with Copy-aware Latent Alignments".
☆20 · Oct 17, 2023 · Updated 2 years ago
ttw1018 / MoPE-DST
The code for "MoPE: Mixture of Prefix Experts for Zero-Shot Dialogue State Tracking"
☆19 · Jan 25, 2025 · Updated last year
Jacob-Zhou / simple-csc
This repository provides an implementation of "A Simple yet Effective Training-free Prompt-free Approach to Chinese Spelling Correction B…
☆89 · Jul 9, 2025 · Updated last year
Spico197 / NYT-H
📜 Codes and Data for COLING2020 paper: Towards Accurate and Consistent Evaluation: A Dataset for Distantly-Supervised Relation Extractio…
☆31 · Feb 2, 2021 · Updated 5 years ago
fairyshine / Seal-Tools
The source code and dataset mentioned in the paper Seal-Tools: Self-Instruct Tool Learning Dataset for Agent Tuning and Detailed Benchmar…
☆57 · Nov 5, 2024 · Updated last year
fairyshine / OpenAgentHarness
Open Agent Harness System for vibe coding, learning, experiment, design and so on.
☆32 · May 29, 2026 · Updated 2 months ago
yzhangcs / master-thesis
High-Order Syntactic Parsing Based on Tree-Structured Conditional Random Fields
☆16 · Apr 28, 2022 · Updated 4 years ago
THU-KEG / DICE
DICE: Detecting In-distribution Data Contamination with LLM's Internal State
☆12 · Sep 21, 2024 · Updated last year
francescortu / comp-mech
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals; ACL 2024
☆13 · May 24, 2024 · Updated 2 years ago
JD-AI-Research-NLP / RoR
☆14 · Jul 27, 2022 · Updated 4 years ago
pjlab-sys4nlp / llama-moe
⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual Pre-training (EMNLP 2024)
☆1,004 · Dec 6, 2024 · Updated last year
McGill-NLP / feedbackqa
FeedbackQA: Improving Question Answering Post-Deployment with Interactive Feedback
☆12 · Jul 13, 2022 · Updated 4 years ago
Spico197 / DocEE
🕹️ A toolkit for document-level event extraction, containing some SOTA model implementations.
☆244 · Sep 5, 2023 · Updated 2 years ago
Zhang-Yihao / Adversarial-Representation-Engineering
Official implementation repository for the paper Towards General Conceptual Model Editing via Adversarial Representation Engineering.
☆20 · Dec 6, 2024 · Updated last year
Job-Bench / job-bench-eval
Official eval scripts for JobBench
☆31 · Jul 18, 2026 · Updated last week
OpenNLG / OpenBA
☆95 · Oct 8, 2023 · Updated 2 years ago
zsLin177 / camr
The system of SUDA-HUAWEI submitted at CAMR2022.
☆12 · Nov 22, 2022 · Updated 3 years ago
genglinliu / UnknownBench
Repo for paper: Examining LLMs' Uncertainty Expression Towards Questions Outside Parametric Knowledge
☆14 · Feb 20, 2024 · Updated 2 years ago
GTCOM-NLP / GeWu-BigModel
GeWu: multilingual and Chinese large-scale pre-trained models, lightweight edition, covering pure Chinese, knowledge-enhanced, and 113-language multilingual variants; uses the mainstream Roberta architecture, suits NLU and NLG tasks, and supports frameworks such as pytorch, tensorflow, uer, and huggingface. Multilingual and Chinese …
☆30 · Nov 17, 2022 · Updated 3 years ago
cnamejj / PyProc
Linux /proc data in a consistent, parsed format.
☆10 · Mar 28, 2016 · Updated 10 years ago
WinnieHAN / mndmv
☆12 · Mar 4, 2022 · Updated 4 years ago
hkust-nlp / AgentVista
Benchmarking multimodal agents on realistic, ultra-challenging visual scenarios requiring long-horizon hybrid tool use.
☆68 · Mar 10, 2026 · Updated 4 months ago
Claude-Liu / ReLM
Rephrasing Language Model for CSC (AAAI 2024)
☆47 · May 14, 2024 · Updated 2 years ago
diyiy / ACL2022_Limited_Data_Learning_Tutorial
☆91 · May 21, 2022 · Updated 4 years ago
JHLew / Learnable-Fourier-Features
Unofficial pytorch implementation of the paper "Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding", NeurIPS 20…
☆13 · Apr 24, 2024 · Updated 2 years ago
ellenmellon / CGRG
A Controllable Model of Grounded Response Generation (AAAI 21)
☆13 · Oct 25, 2022 · Updated 3 years ago