LLM Benchmark
☆44 · May 24, 2025 · Updated last year
Alternatives and similar repositories for PERSONA
Users that are interested in PERSONA are comparing it to the libraries listed below.
- ☆15 · Aug 30, 2025 · Updated 9 months ago
- [EMNLP Findings 2025] NLP-ADBench is a comprehensive benchmarking tool designed for Anomaly Detection in Natural Language Processing (NL… · ☆21 · Oct 9, 2025 · Updated 8 months ago
- GlotEval: a unified evaluation toolkit designed to benchmark multilingual Large Language Models (LLMs) in a language-specific way · ☆18 · Nov 4, 2025 · Updated 7 months ago
- A multi-agent framework to fully automate anomaly detection in different modalities, tabular, graph, time series, and more (work in progr… · ☆100 · Apr 24, 2026 · Updated last month
- LOLA: Large and Open Source Multilingual Language Model · ☆11 · Apr 8, 2026 · Updated 2 months ago
- A Julia Library for Outlier Detection (Anomaly Detection) · ☆11 · Sep 1, 2018 · Updated 7 years ago
- NLPBench: Evaluating NLP-Related Problem-solving Ability in Large Language Models · ☆10 · Oct 27, 2023 · Updated 2 years ago
- [ICLR 2025] Permute-and-Flip: An optimally robust and watermarkable decoder for LLMs · ☆19 · Mar 20, 2025 · Updated last year
- The official codebase for "Experiential Reinforcement Learning" - https://arxiv.org/pdf/2602.13949v1 · ☆72 · May 8, 2026 · Updated last month
- PyTorch implementation of "A Simple Baseline for Low-Budget Active Learning". · ☆14 · Dec 22, 2021 · Updated 4 years ago
- Visualize expert firing frequencies across sentences in the Mixtral MoE model · ☆18 · Dec 22, 2023 · Updated 2 years ago
- ReportParse is a unified NLP analyzer for corporate sustainability reports · ☆21 · Sep 18, 2024 · Updated last year
- https://arxiv.org/abs/2404.10917 · ☆14 · Mar 18, 2025 · Updated last year
- Word acquisition in neural language models (TACL 2022). · ☆21 · Jan 30, 2025 · Updated last year
- ☆16 · Oct 6, 2024 · Updated last year
- ☆31 · May 15, 2026 · Updated 3 weeks ago
- ☆23 · May 21, 2025 · Updated last year
- Learning from Indirect Observations · ☆11 · Jul 16, 2021 · Updated 4 years ago
- ☆13 · Oct 12, 2020 · Updated 5 years ago
- Official repository for FLAME-MoE: A Transparent End-to-End Research Platform for Mixture-of-Experts Language Models · ☆39 · Sep 19, 2025 · Updated 8 months ago
- [ICLR'26, NAACL'25 Demo] Toolkit & Benchmark for evaluating the trustworthiness of generative foundation models. · ☆131 · Aug 22, 2025 · Updated 9 months ago
- AdaRFT: Efficient Reinforcement Finetuning via Adaptive Curriculum Learning · ☆56 · Jun 13, 2025 · Updated last year
- A framework for steering MoE models by detecting and controlling behavior-linked experts. · ☆35 · Sep 12, 2025 · Updated 9 months ago
- Official demo repository for our ACL 2019 long paper "Generating Question-Answer Hierarchies". · ☆20 · Feb 13, 2026 · Updated 4 months ago
- ☆22 · Mar 19, 2024 · Updated 2 years ago
- [ICLR 2026] The official code for "Doxing via the Lens: Revealing Location-related Privacy Leakage on Multi-modal Large Reasoning Models" · ☆26 · Feb 7, 2026 · Updated 4 months ago
- [ACL 2025 Main] Code and data for paper "Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation?" · ☆22 · Jun 18, 2025 · Updated 11 months ago
- A new algorithm that formulates jailbreaking as a reasoning problem. · ☆26 · Jul 2, 2025 · Updated 11 months ago
- Reddit Collector and Text Processor · ☆24 · Sep 7, 2022 · Updated 3 years ago
- Code for ICML 2023 paper "When and How Does Known Class Help Discover Unknown Ones? Provable Understandings Through Spectral Analysis" · ☆14 · Jun 24, 2023 · Updated 2 years ago
- Code, data, and models for the EMNLP 2020 paper "Learning to Fuse Sentences with Transformers for Summarization" · ☆16 · Nov 2, 2022 · Updated 3 years ago
- doi2bibtex: Resolve DOIs and arXiv identifiers to formatted BibTeX entries · ☆23 · Aug 20, 2024 · Updated last year
- An automated data pipeline scaling RL to pretraining levels · ☆77 · Jun 2, 2026 · Updated last week
- A dataset for multimodal machine translation · ☆13 · Dec 6, 2021 · Updated 4 years ago
- Uncertainty-Aware Reliable Text Classification (KDD 2021) · ☆18 · Oct 4, 2022 · Updated 3 years ago
- SVIP: Towards Verifiable Inference of Open-Source Large Language Models · ☆15 · Jun 3, 2025 · Updated last year
- [COLM 2025] JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model · ☆26 · Nov 25, 2025 · Updated 6 months ago
- Code and data for paper "Can Watermarked LLMs be Identified by Users via Crafted Prompts?" Accepted by ICLR 2025 (Spotlight) · ☆28 · Dec 28, 2024 · Updated last year
- https://wavelandspeech.github.io/ · ☆10 · Jan 12, 2024 · Updated 2 years ago