Automatically update arXiv papers about LLM Reasoning, LLM Evaluation, LLM & MLLM and Video Understanding using GitHub Actions.
☆144 · May 18, 2026 · Updated last week
Alternatives and similar repositories for llm-arxiv-daily
Users that are interested in llm-arxiv-daily are comparing it to the libraries listed below.
- Auditing agents for fine-tuning safety ☆21 · Oct 21, 2025 · Updated 7 months ago
- [ACL 2025] Are Your LLMs Capable of Stable Reasoning? ☆33 · Aug 5, 2025 · Updated 9 months ago
- Official repository for the paper "ALERT: A Comprehensive Benchmark for Assessing Large Language Models’ Safety through Red Teaming" ☆59 · Sep 20, 2024 · Updated last year
- NeurIPS 2024: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation ☆13 · May 24, 2025 · Updated last year
- [ICASSP'25] Enhancing Vision-Language Tracking by Effectively Converting Textual Cues into Visual Cues ☆17 · Dec 31, 2024 · Updated last year
- [ICLR 2026] RewardMap: Tackling Sparse Rewards in Fine-grained Visual Reasoning via Multi-Stage Reinforcement Learning ☆43 · Feb 22, 2026 · Updated 3 months ago
- ☆19 · Mar 9, 2024 · Updated 2 years ago
- Download, parse, and filter data from Court Listener, part of the FreeLaw projects. Data-ready for The-Pile. ☆16 · Jun 3, 2023 · Updated 2 years ago
- Discovering environments with XRM ☆17 · Dec 6, 2024 · Updated last year
- Baseline achieving 0.8 accuracy on the private test set in the ZaloAI Challenge 2023 Elementary Math Solving ☆24 · May 1, 2024 · Updated 2 years ago
- [NeurIPS'24] MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts ☆19 · Oct 7, 2024 · Updated last year
- DeepRAG: Thinking to Retrieve Step by Step for Large Language Models ☆39 · Feb 17, 2026 · Updated 3 months ago
- LLM-related stuff, including code and docs ☆13 · Feb 25, 2025 · Updated last year
- ☆152 · Mar 12, 2025 · Updated last year
- ☆25 · Dec 23, 2024 · Updated last year
- DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models ☆13 · Nov 2, 2023 · Updated 2 years ago
- 🤖 🤖 GPT-4 Code Interpreter, Financial Assistant and Content Creator Chat Bot (OpenAI or Local-enabled) ☆27 · Jul 2, 2023 · Updated 2 years ago
- SemEval 2022 Task 5: Multimedia Automatic Misogyny Identification - baseline models and dataset ☆15 · Nov 22, 2022 · Updated 3 years ago
- Interface Design for Self-Supervised Speech Models, Accepted to Interspeech2024 ☆16 · Nov 19, 2024 · Updated last year
- This is the official repository for all the code of TheoremLlama ☆47 · Aug 4, 2025 · Updated 9 months ago
- ☆10 · Feb 6, 2025 · Updated last year
- ☆21 · Apr 3, 2026 · Updated last month
- Code of paper "LP-Diff: Towards Improved Restoration of Real-World Degraded License Plate" ☆20 · Jun 22, 2025 · Updated 11 months ago
- [WSDM 2026] LookAhead Tuning: Safer Language Models via Partial Answer Previews ☆17 · Dec 14, 2025 · Updated 5 months ago
- SimpleDeepSearcher: Deep Information Seeking via Web-Powered Reasoning Trajectory Synthesis ☆120 · Jun 3, 2025 · Updated 11 months ago
- Resources for the Enigmata Project. ☆82 · Aug 13, 2025 · Updated 9 months ago
- ☆14 · Jun 10, 2025 · Updated 11 months ago
- TrackGPT: Track What You Need in Videos via Text Prompts ☆25 · May 16, 2023 · Updated 3 years ago
- ACL'2025: SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. and preprint: SoftCoT++: Test-Time Scaling with Soft Chain-of… ☆89 · May 30, 2025 · Updated 11 months ago
- Implementation for the research paper "Enhancing LLM Reasoning via Critique Models with Test-Time and Training-Time Supervision". ☆55 · Nov 29, 2024 · Updated last year
- ☆33 · Jan 26, 2026 · Updated 4 months ago
- Official implementation of the NeurIPS 2024 paper CORY ☆33 · Mar 4, 2026 · Updated 2 months ago
- The official repository of the Omni-MATH benchmark. ☆93 · Dec 22, 2024 · Updated last year
- Official resources of "The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reaso… ☆21 · Jun 13, 2025 · Updated 11 months ago
- Code and data for "MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models" ☆55 · Nov 18, 2025 · Updated 6 months ago
- [CVPR'24] RTracker: Recoverable Tracking via PN Tree Structured Memory ☆28 · Jun 18, 2024 · Updated last year
- Collection of the latest spatial, 3D, and video/temporal reasoning papers ☆35 · Sep 29, 2025 · Updated 7 months ago
- This repository contains a regularly updated paper list for LLMs-reasoning-in-latent-space. ☆327 · May 13, 2026 · Updated last week
- Accepted LLM Papers in NeurIPS 2024 ☆38 · Oct 13, 2024 · Updated last year