ThinamXx / PaliGemma
Reading PaliGemma paper ...
☆10 Updated 4 months ago
Alternatives and similar repositories for PaliGemma:
Users who are interested in PaliGemma are comparing it to the repositories listed below.
- Notebooks to demonstrate TimmWrapper ☆15 Updated 2 months ago
- Notebook and Scripts that showcase running quantized diffusion models on consumer GPUs ☆38 Updated 5 months ago
- Minimalistic AI library that resembles HF's transformers ☆12 Updated 2 months ago
- Building GPT ... ☆17 Updated 3 months ago
- Set of scripts to finetune LLMs ☆37 Updated 11 months ago
- Collection of autoregressive model implementations ☆83 Updated last month
- ☆24 Updated last year
- An open-source reproduction of NVIDIA's nGPT (Normalized Transformer with Representation Learning on the Hypersphere) ☆91 Updated 3 weeks ago
- A single repo with all scripts and utils to train / fine-tune the Mamba model with or without FIM ☆54 Updated 11 months ago
- A collection of notebooks for the Hugging Face blog series (https://huggingface.co/blog). ☆44 Updated 7 months ago
- GitHub repo for Peifeng's internship project ☆14 Updated last year
- BH hackathon ☆14 Updated 11 months ago
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 8 months ago
- Collection of resources for RL and Reasoning ☆25 Updated last month
- Cerule - A Tiny Mighty Vision Model ☆67 Updated 6 months ago
- ☆84 Updated 6 months ago
- Complete implementation of Llama2 with/without KV cache & inference 🚀 ☆47 Updated 10 months ago
- ☆125 Updated last week
- An introduction to LLM Sampling ☆77 Updated 3 months ago
- Official Implementation of the 'When XGBoost Outperforms GPT-4 on Text Classification: A Case Study' NAACL-W 2024 paper ☆13 Updated 3 months ago
- ☆69 Updated 6 months ago
- Implementation of https://arxiv.org/pdf/2312.09299 ☆20 Updated 8 months ago
- Model activation visualiser ☆90 Updated this week
- Implementation of the Mamba SSM with hf_integration. ☆56 Updated 6 months ago
- A fast implementation of T5/UL2 in PyTorch using Flash Attention ☆96 Updated last week
- I learn about and explain quantization ☆26 Updated 11 months ago
- Chat with Qwen2-VL. Qwen2-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud. ☆10 Updated 6 months ago
- Code, results and other artifacts from the paper introducing the WildChat-50m dataset and the Re-Wild model family. ☆28 Updated last month
- ☆49 Updated last year
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆80 Updated 10 months ago