Using conversational games to evaluate powerful LLMs
☆18Sep 3, 2023Updated 2 years ago
Alternatives and similar repositories for GameEval
Users that are interested in GameEval are comparing it to the libraries listed below.
- [ACL2023, Findings] Source codes for the paper "Werewolf Among Us: Multimodal Resources for Modeling Persuasion Behaviors in Social Deduc…☆16Feb 22, 2025Updated last year
- OpenBA-V2: 3B LLM (Large Language Model) with T5 architecture, utilizing model pruning technique and continuing pretraining from OpenBA-1…☆25May 10, 2024Updated last year
- This is the codebase for pre-training, compressing, extending, and distilling LLMs with Megatron-LM.☆12Mar 11, 2024Updated 2 years ago
- Collection of roleplay conversations of high quality☆15Dec 1, 2024Updated last year
- Official Code for "Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning" (ICLR 2025)☆13Mar 6, 2025Updated last year
- Use the tokenizer in parallel to achieve superior acceleration☆20Mar 21, 2024Updated 2 years ago
- Evaluation dataset for the honorific (keigo) conversion task☆21Nov 24, 2022Updated 3 years ago
- [ICASSP 2025 Oral] The official implementation of paper "TextureDiffusion: Target Prompt Disentangled Editing for Various Texture Transfe…☆16Mar 13, 2025Updated last year
- Text-based game of lies and deceit, made for language models.☆32Aug 25, 2023Updated 2 years ago
- This is the official GitHub repo for our paper: "BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Lang…☆22Jul 3, 2024Updated last year
- SpyGame: An interactive multi-agent framework to evaluate intelligence with large language models :D☆15Nov 9, 2023Updated 2 years ago
- ☆27Mar 6, 2023Updated 3 years ago
- The OpenAI Whisper speech-to-text model as a simple HTTP server☆14Oct 26, 2023Updated 2 years ago
- ☆17Apr 30, 2025Updated 11 months ago
- Source code of paper “A Novel Three-Stage Learning Framework for Low-Resource Knowledge-Grounded Dialogue Generation”☆16Nov 25, 2021Updated 4 years ago
- [AAAI 2025] Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems☆13May 5, 2025Updated 11 months ago
- The code used to power DeepRole☆37Nov 21, 2022Updated 3 years ago
- [ACL 2024 Findings] Code implementation of Paper "Rethinking Negative Instances for Generative Named Entity Recognition"☆60Mar 20, 2024Updated 2 years ago
- m&ms: A Benchmark to Evaluate Tool-Use for multi-step multi-modal tasks☆46Sep 26, 2024Updated last year
- Game-based AI Platforms☆26Jun 27, 2024Updated last year
- LongAttn: Selecting Long-context Training Data via Token-level Attention☆15Jul 16, 2025Updated 9 months ago
- ☆36Oct 14, 2022Updated 3 years ago
- ☆11Oct 11, 2023Updated 2 years ago
- The code of ACL2022 paper "Conditional Bilingual Mutual Information based Adaptive Training for Neural Machine Translation".☆14Aug 6, 2022Updated 3 years ago
- Display contacts from the AddressBook database☆11May 4, 2022Updated 3 years ago
- SGLang Kernel Wheel Index☆20Updated this week
- Introducing Filtered Direct Preference Optimization (fDPO) that enhances language model alignment with human preferences by discarding lo…☆16Nov 27, 2024Updated last year
- A minimum demo for PyTorch distributed extension functionality for collectives.☆15Jul 29, 2024Updated last year
- ☆41Jun 19, 2024Updated last year
- OmniGAIA: Towards Native Omni-Modal AI Agents☆89Apr 2, 2026Updated 2 weeks ago
- MultiTool-CoT: GPT-3 Can Use Multiple External Tools with Chain of Thought Prompting☆20Jul 11, 2023Updated 2 years ago
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs☆29Aug 15, 2025Updated 8 months ago
- Micrograd in Rust☆11Nov 14, 2024Updated last year
- Mini program that notifies users when Tsinghua University dormitory washing machines are free☆14Feb 4, 2021Updated 5 years ago
- ☆21May 30, 2022Updated 3 years ago
- An iMessage interface for emacs☆18May 5, 2016Updated 9 years ago
- High Performance Sorting Based Distributed memory K-mer counter☆15Dec 8, 2025Updated 4 months ago
- Minimize and Maximize Puppeteer Browser in Real Time!☆15Oct 3, 2023Updated 2 years ago
- A reading list on some popular MLsys topics☆24Mar 20, 2025Updated last year