Code and Data for "MIRAI: Evaluating LLM Agents for Event Forecasting"
☆91Jul 2, 2024Updated last year
Alternatives and similar repositories for MIRAI
Users that are interested in MIRAI are comparing it to the libraries listed below
- RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment☆16Dec 19, 2024Updated last year
- PyTorch implementation of the paper "Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning", NeurIPS 2023☆62Dec 20, 2023Updated 2 years ago
- DataSciBench: An LLM Agent Benchmark for Data Science☆52Jan 21, 2026Updated last month
- The official repo of paper "Self-Control of LLM Behaviors by Compressing Suffix Gradient into Prefix Controller"☆18Aug 13, 2024Updated last year
- ☆54Dec 20, 2022Updated 3 years ago
- Large language models (LLMs) made easy, EasyLM is a one stop solution for pre-training, finetuning, evaluating and serving LLMs in JAX/Fl…☆78Aug 17, 2024Updated last year
- [NAACL 2025] Source code for MMEvalPro, a more trustworthy and efficient benchmark for evaluating LMMs☆24Sep 26, 2024Updated last year
- Linear Attention Sequence Parallelism (LASP)☆89Jun 4, 2024Updated last year
- Automatic Integration for Neural Spatio-Temporal Point Process models (AI-STPP) is a new paradigm for exact, efficient, non-parametric inf…☆25Oct 14, 2024Updated last year
- ☆11Mar 13, 2023Updated 2 years ago
- Codes for Evolving Plastic ANNs☆14Dec 18, 2022Updated 3 years ago
- JAX Scalify: end-to-end scaled arithmetics☆18Oct 30, 2024Updated last year
- Efficient retrieval head analysis with triton flash attention that supports topK probability☆13Jun 15, 2024Updated last year
- TARS: MinMax Token-Adaptive Preference Strategy for Hallucination Reduction in MLLMs☆23Sep 21, 2025Updated 5 months ago
- ☆15Jan 12, 2026Updated last month
- ☆30Jun 25, 2024Updated last year
- Codebase accompanying the Summary of a Haystack paper.☆80Sep 20, 2024Updated last year
- [KDD 2025] The implementation of "Fine-tuning Multimodal Large Language Models for Product Bundling", KDD'25☆15Sep 20, 2025Updated 5 months ago
- ☆12Mar 18, 2024Updated last year
- [ICLR 2025] Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization☆12Jan 26, 2025Updated last year
- [ACL 2025] Analyzing LLMs' Multilingual Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations☆18Oct 18, 2025Updated 4 months ago
- My fork of Allen AI's OLMo for educational purposes.☆28Dec 5, 2024Updated last year
- Official implementation of the paper: "A deeper look at depth pruning of LLMs"☆15Jul 24, 2024Updated last year
- Plancraft is a minecraft environment and agent suite to test planning capabilities in LLMs☆26Nov 7, 2025Updated 3 months ago
- ☆17May 25, 2023Updated 2 years ago
- The implementation of paper "Strategy-aware Bundle Recommender System", SIGIR'23.☆15Sep 4, 2023Updated 2 years ago
- [ICLR 2025] Weighted-Reward Preference Optimization for Implicit Model Fusion☆14Mar 17, 2025Updated 11 months ago
- LMQL implementation of tree of thoughts☆36Jan 31, 2024Updated 2 years ago
- [ICLR-2026] Official Implementation of our paper "THOR: Tool-Integrated Hierarchical Optimization via RL for Mathematical Reasoning".☆31Feb 26, 2026Updated last week
- A RL env with procedurally generated symbolic reasoning data☆34Updated this week
- ☆21Jul 25, 2025Updated 7 months ago
- Q-Probe: A Lightweight Approach to Reward Maximization for Language Models☆40Jun 10, 2024Updated last year
- [ACL 2023] PyTorch Implementation of Zero- and Few-Shot Event Detection via Prompt-Based Meta Learning☆16Jun 6, 2023Updated 2 years ago
- [ACL 24 Findings] Implementation of Resonance RoPE and the PosGen synthetic dataset.☆24Mar 5, 2024Updated 2 years ago
- Improving Text Embedding of Language Models Using Contrastive Fine-tuning☆64Aug 2, 2024Updated last year
- Code and Data for "SCTc-TE: A Comprehensive Formulation and Benchmark for Temporal Event Forecasting"☆16Feb 2, 2024Updated 2 years ago
- [NeurIPS 2024] The official implementation of "Image Copy Detection for Diffusion Models"☆18Oct 1, 2024Updated last year
- This is the official repository for "Safer-Instruct: Aligning Language Models with Automated Preference Data"☆17Feb 22, 2024Updated 2 years ago
- OneEdit: A Neural-Symbolic Collaboratively Knowledge Editing System.☆19Oct 14, 2024Updated last year