guanhaisu / OBSD
[ACL 2024 Best Paper] Deciphering Oracle Bone Language with Diffusion Models
☆93Updated last month
Alternatives and similar repositories for OBSD:
Users who are interested in OBSD are comparing it to the libraries listed below
- Oracle Bone Script data collected by VLRLab of HUST☆35Updated 4 months ago
- AI-assisted Deciphering Oracle Bone Script☆42Updated this week
- ☆19Updated last year
- ☆43Updated last month
- ☆117Updated 6 months ago
- Official code of SmartEdit [CVPR-2024 Highlight]☆276Updated 6 months ago
- The paper collections for the autoregressive models in vision.☆368Updated this week
- Official implementation for ICDAR 2024 Oral paper "ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expressi…☆23Updated 5 months ago
- [NeurIPS2024] Repo for the paper "ControlMLLM: Training-Free Visual Prompt Learning for Multimodal Large Language Models"☆136Updated this week
- 🔥 Official impl. of "TokenFlow: Unified Image Tokenizer for Multimodal Understanding and Generation".☆222Updated 2 weeks ago
- A paper collection of recent diffusion models for text-image generation tasks, e.g., visual text generation, font generation, text remova…☆221Updated 3 weeks ago
- XQ-GAN🚀: An Open-source Image Tokenization Framework for Autoregressive Generation☆178Updated last month
- Text Image Inpainting via Global Structure-Guided Diffusion Models (Accepted by AAAI-24)☆57Updated 7 months ago
- This repository maintains a collection of important papers on conditional image synthesis with diffusion models☆86Updated this week
- ☆51Updated last month
- [ECCV 2024] ShareGPT4V: Improving Large Multi-modal Models with Better Captions☆187Updated 6 months ago
- [NeurIPS 2024] Visual Perception by Large Language Model’s Weights☆35Updated 3 months ago
- ☆36Updated 2 weeks ago
- 🔥stable, simple, state-of-the-art VQVAE toolkit & cookbook☆73Updated 6 months ago
- The official repo for “TextCoT: Zoom In for Enhanced Multimodal Text-Rich Image Understanding”.☆38Updated 3 months ago
- MultiMath: Bridging Visual and Mathematical Reasoning for Large Language Models☆22Updated 4 months ago
- Official repo for "VisionZip: Longer is Better but Not Necessary in Vision Language Models"☆219Updated 2 weeks ago
- The official code for NeurIPS 2024 paper: Harmonizing Visual Text Comprehension and Generation☆110Updated 2 months ago
- The official project of paper "Visual Text Meets Low-level Vision: A Comprehensive Survey on Visual Text Processing"☆51Updated 3 months ago
- [CVPR 2024] The official pytorch implementation of "A General and Efficient Training for Transformer via Token Expansion".☆42Updated 8 months ago
- ☆18Updated 11 months ago
- This is the official implementation for ControlVAR.☆88Updated last month
- [CVPR 2024] Dynamic Prompt Optimizing for Text-to-Image Generation☆67Updated 6 months ago
- CAR: Controllable AutoRegressive Modeling for Visual Generation☆94Updated last month
- ✨✨ MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?☆88Updated last month