onion-liu / arxiv_daily_aigc
An AI-driven daily arXiv paper crawler, analyzer, and organizer tool, focusing on AIGC
☆68Updated this week
Alternatives and similar repositories for arxiv_daily_aigc
Users that are interested in arxiv_daily_aigc are comparing it to the libraries listed below
- Customize your arXiv recommendation every day.☆134Updated last month
- [EMNLP 2025 Demo] PresentAgent: Multimodal Agent for Presentation Video Generation☆113Updated last month
- [AAAI 2025] StoryWeaver: A Unified World Model for Knowledge-Enhanced Story Character Customization☆222Updated 7 months ago
- Video generation via code☆901Updated last week
- Official GPU implementation of the paper "PPLLaVA: Varied Video Sequence Understanding With Prompt Guidance"☆130Updated last year
- Chrome / Edge extension to turn arXiv papers into Markdown codes in one click.☆85Updated 8 months ago
- Valley is a cutting-edge multimodal large model designed to handle a variety of tasks involving text, images, and video data.☆256Updated 2 weeks ago
- Implementation for the paper "ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems".☆193Updated 8 months ago
- 🔥ICLR 2025 (Spotlight) One-Prompt-One-Story: Free-Lunch Consistent Text-to-Image Generation Using a Single Prompt☆306Updated last month
- A curated collection of stunning images and prompts generated with Seedream 4.0☆108Updated 2 months ago
- ☆289Updated last year
- How to get the best results: Improve-Your-Prompt is a prompt for optimizing prompts☆39Updated 11 months ago
- ☆170Updated last year
- [ICLR 2025] The First Multimodal Search Engine Pipeline and Benchmark for LMMs☆479Updated 9 months ago
- 💡 VideoMind: A Chain-of-LoRA Agent for Long Video Reasoning☆277Updated last month
- AI video editing☆267Updated 3 months ago
- Awesome-RAG-Vision: a curated list of advanced retrieval augmented generation (RAG) for Computer Vision☆264Updated last month
- Open-source alternative for crowdtest.ai. Simulate how users might react to different versions of your content☆157Updated 8 months ago
- MovieAgent: Automated Movie Generation via Multi-Agent CoT Planning☆262Updated 7 months ago
- Paper reading tool: one-click screenshot + AI translation, math formula support, and pinned multi-window management☆128Updated 2 months ago
- ☆76Updated this week
- [ACL2025 Findings] Migician: Revealing the Magic of Free-Form Multi-Image Grounding in Multimodal Large Language Models☆81Updated 6 months ago
- The official repo for paper "Spatial Speech Translation: Translating Across Space With Binaural Hearables"☆69Updated 3 months ago
- Learning records for building a large language model from scratch☆58Updated 10 months ago
- Cookbook for Crafting Good Code☆57Updated last year
- A simple agent framework that's capable of browser use + mcp + auto instrument + plan + deep research + more☆328Updated last month
- ☆80Updated 7 months ago
- Speech to Text but with all the bells and whistles and most importantly AI! AI will clean up your filler words, edit and will refine what…☆323Updated 9 months ago
- Awesome Instruction Editing. Image and Media Editing with Human Instructions. Instruction-Guided Image and Media Editing.☆94Updated 2 weeks ago
- ☆137Updated 3 months ago