Samsung / ClickAgent
ClickAgent: Enhancing UI Location Capabilities of Autonomous Agents
☆7 Updated 2 weeks ago
Related projects
Alternatives and complementary repositories for ClickAgent
- Official repository for the paper "Images as Weight Matrices: Sequential Image Generation Through Synaptic Learning Rules" (ICLR 2023) ☆12 Updated last year
- Official repository for ICML 2024 paper "MoRe Fine-Tuning with 10x Fewer Parameters" ☆16 Updated this week
- Official implementation of "GPT or BERT: why not both?" ☆12 Updated this week
- SSL Video Representation Learning project ☆10 Updated 11 months ago
- A big_vision-inspired repo that implements a generic Auto-Encoder class capable of representation learning and generative modeling. ☆29 Updated 4 months ago
- ☆19 Updated last month
- Pytorch implementation of HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models ☆28 Updated 7 months ago
- DPO, but faster 🚀 ☆21 Updated 2 weeks ago
- [ICCV23] Official implementation of eP-ALM: Efficient Perceptual Augmentation of Language Models. ☆27 Updated last year
- ☆15 Updated 3 months ago
- SIEVE: Multimodal Dataset Pruning using Image-Captioning Models (CVPR 2024) ☆14 Updated 6 months ago
- In-Context Alignment: Chat with Vanilla Language Models Before Fine-Tuning ☆33 Updated last year
- ☆13 Updated last year
- GoldFinch and other hybrid transformer components ☆39 Updated 3 months ago
- Un-*** 50 billions multimodality dataset ☆24 Updated 2 years ago
- Repository for Skill Set Optimization ☆12 Updated 3 months ago
- ☆12 Updated 2 months ago
- [ACL 24 Findings] Implementation of Resonance RoPE and the PosGen synthetic dataset. ☆21 Updated 8 months ago
- mm-retrieval-evaluation ☆10 Updated 2 years ago
- Here we will test various linear attention designs. ☆56 Updated 6 months ago
- The official repository for our paper "The Dual Form of Neural Networks Revisited: Connecting Test Time Predictions to Training Patterns … ☆16 Updated last year
- Project for SNARE benchmark ☆10 Updated 5 months ago
- Official Code Repository for EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents (COLM 2024) ☆25 Updated 4 months ago
- The open source community's implementation of the all-new Multi-Modal Causal Attention from "DeepSpeed-VisualChat: Multi-Round Multi-Imag… ☆12 Updated 8 months ago
- Code and Data for Paper: SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data ☆32 Updated 8 months ago
- Official implementation of ECCV24 paper: POA ☆24 Updated 3 months ago
- Repo for "Zemi: Learning Zero-Shot Semi-Parametric Language Models from Multiple Tasks" ACL 2023 Findings ☆16 Updated last year
- Exploration into the proposed "Self Reasoning Tokens" by Felipe Bonetto ☆53 Updated 5 months ago