FareedKhan-dev / text2video-from-scratch
A Straightforward, Step-by-Step Implementation of a Video Diffusion Model
☆40 · Updated 2 months ago
Alternatives and similar repositories for text2video-from-scratch
Users who are interested in text2video-from-scratch are comparing it to the repositories listed below:
- Building LLaMA 4 MoE from Scratch ☆32 · Updated last week
- Maximizing the Performance of a Simple RAG using RL ☆55 · Updated last month
- Lean implementation of various multi-agent LLM methods, including Iteration of Thought (IoT) ☆108 · Updated 2 months ago
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning ☆31 · Updated 2 months ago
- A tiny 1000 line implementation of GraphRAG in Python ☆67 · Updated last month
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F… ☆62 · Updated 10 months ago
- Parameter-efficient finetuning script for Phi-3-vision, the strong multimodal language model by Microsoft. ☆58 · Updated 10 months ago
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024] ☆82 · Updated 3 months ago
- Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and Siglip. ☆17 · Updated last year
- tickr-agent is an enterprise-ready, scalable Python library for building swarms of financial agents that conduct comprehensive stock anal… ☆43 · Updated this week
- Train a 29M parameter GPT from Scratch ☆13 · Updated last month
- LLaMA 3 is one of the most promising open-source models after Mistral; we will recreate its architecture in a simpler manner. ☆157 · Updated 8 months ago
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture. ☆158 · Updated 11 months ago
- ☆85 · Updated 7 months ago
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts ☆40 · Updated 5 months ago
- ☆52 · Updated 2 months ago
- How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training ☆30 · Updated last week
- Gradio-based tool to run open-source LLM models directly from Hugging Face ☆90 · Updated 9 months ago
- ☆100 · Updated 7 months ago
- Hybrid-RAG is a hybrid Retrieval-Augmented Generation (RAG) model that leverages BERT for retrieving relevant documents and GPT-2 for gen… ☆26 · Updated 2 months ago
- Testing and evaluating the capabilities of Vision-Language models (PaliGemma) in performing computer vision tasks such as object detectio… ☆80 · Updated 10 months ago
- XmodelLM ☆39 · Updated 5 months ago
- World's Smallest Vision-Language Model ☆27 · Updated last year
- Unsloth Fine-tuning Notebooks for Google Colab, Kaggle, Hugging Face and more. ☆130 · Updated this week
- Complete example of how to build an Agentic RAG architecture with Redis, Amazon Bedrock, and LlamaIndex. ☆91 · Updated 4 months ago
- OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation ☆72 · Updated last month
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch ☆91 · Updated 4 months ago
- Chat with Phi 3.5/3 Vision LLMs. Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets which includ… ☆33 · Updated 3 months ago
- Official repository for the paper "NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks". This rep… ☆58 · Updated 5 months ago
- ☆19 · Updated 8 months ago