FareedKhan-dev / text2video-from-scratch
A Straightforward, Step-by-Step Implementation of a Video Diffusion Model
☆47Updated last month
Alternatives and similar repositories for text2video-from-scratch
Users who are interested in text2video-from-scratch are comparing it to the repositories listed below
- Building LLaMA 4 MoE from Scratch☆53Updated 2 months ago
- A novel multi-modality (Vision) RAG architecture☆28Updated 8 months ago
- Easy-to-use, high-performance knowledge distillation for LLMs☆86Updated last month
- Building a 2.3M-parameter LLM from scratch with LLaMA 1 architecture.☆175Updated last year
- An overview of GRPO & DeepSeek-R1 Training with Open Source GRPO Model Fine Tuning☆32Updated last month
- Official repository for RAGViz: Diagnose and Visualize Retrieval-Augmented Generation [EMNLP 2024]☆84Updated 5 months ago
- This project is a **proof of concept** that aims to replicate the reasoning capabilities of OpenAI's newly released O1 model.☆87Updated 5 months ago
- Maximizing the Performance of a Simple RAG using RL☆62Updated 3 months ago
- Distill thinking dataset more compactly and accurately!☆31Updated 3 weeks ago
- [ICLR'25] ApolloMoE: Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts☆44Updated 7 months ago
- ☆56Updated 7 months ago
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory☆44Updated 4 months ago
- Source code of the paper: RetrievalQA: Assessing Adaptive Retrieval-Augmented Generation for Short-form Open-Domain Question Answering [F…☆66Updated last year
- ☆16Updated 3 months ago
- This is an NVIDIA AI Workbench example project that demonstrates an end-to-end model development workflow using Llamafactory.☆60Updated 8 months ago
- nanoGRPO is a lightweight implementation of Group Relative Policy Optimization (GRPO)☆105Updated last month
- Parameter-efficient finetuning script for Phi-3-vision, the strong multimodal language model by Microsoft.☆58Updated last year
- Lightweight continuous batching with OpenAI compatibility using HuggingFace Transformers, including T5 and Whisper.☆24Updated 3 months ago
- ☆22Updated 10 months ago
- ☆41Updated last month
- Working implementation of DeepSeek MLA☆42Updated 5 months ago
- Video+code lecture on building nanoGPT from scratch☆68Updated last year
- Hybrid-RAG is a hybrid Retrieval-Augmented Generation (RAG) model that leverages BERT for retrieving relevant documents and GPT-2 for gen…☆29Updated 4 months ago
- LLM reads a paper and produces a working prototype☆57Updated 2 months ago
- AnyModal is a Flexible Multimodal Language Model Framework for PyTorch☆96Updated 6 months ago
- Auto Thinking Mode switch for Qwen3 in Open webui☆65Updated last month
- ☆17Updated 2 months ago
- FuseAI Project☆87Updated 5 months ago
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM☆59Updated last year
- Small Multimodal Vision Model "Imp-v1-3b" trained using Phi-2 and Siglip.☆17Updated last year