☆16Apr 7, 2024Updated last year
Alternatives and similar repositories for GemmaLongText
Users that are interested in GemmaLongText are comparing it to the libraries listed below.
- ☆14Jun 24, 2024Updated last year
- ☆16Feb 6, 2024Updated 2 years ago
- ☆15Sep 22, 2024Updated last year
- LLM checkpointing for DeepSpeed/Megatron☆25Nov 30, 2025Updated 3 months ago
- ProxyExplainer for Graph Neural Networks☆15Oct 24, 2024Updated last year
- ☆11Jun 1, 2023Updated 2 years ago
- Soochow University graduate thesis TeX template☆18Feb 27, 2026Updated last month
- Collection of research papers on time series explainability☆31Mar 5, 2026Updated 3 weeks ago
- [ACL'24 Oral] Analysing The Impact of Sequence Composition on Language Model Pre-Training☆23Aug 18, 2024Updated last year
- Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination.☆22Jul 18, 2025Updated 8 months ago
- Integrating all DeepSeek open-source projects into ComfyUI, looking forward to DeepSeek’s OpenSourceWeek next week.☆18Feb 21, 2025Updated last year
- [ICCV 2025] Dynamic-VLM☆28Dec 16, 2024Updated last year
- ☆13Dec 21, 2024Updated last year
- This is the official code for our paper "Simple and Scalable Nearest Neighbor Machine Translation" (ICLR 2023).☆14Nov 22, 2023Updated 2 years ago
- code for ACL24 "MELoRA: Mini-Ensemble Low-Rank Adapter for Parameter-Efficient Fine-Tuning"☆34Feb 19, 2025Updated last year
- [ICML'25] Official code for paper "Occult: Optimizing Collaborative Communication across Experts for Accelerated Parallel MoE Training an…☆13Apr 17, 2025Updated 11 months ago
- ☆12Jan 21, 2026Updated 2 months ago
- This repository provides a comprehensive library for parallel training and LoRA algorithm implementations, supporting multiple parallel s…☆57Jan 6, 2026Updated 2 months ago
- ☆13May 9, 2023Updated 2 years ago
- XGEN-MM(BLIP3) Autocaptioning Tools☆17Jun 20, 2024Updated last year
- ☆16Jul 7, 2023Updated 2 years ago
- Code for our work "MSP: Multi-Stage Prompting for Making Pre-trained Language Models Better Translators" in ACL 2022☆20Mar 18, 2022Updated 4 years ago
- The implementation of "Mitigating Hallucinations and Off-target Machine Translation with Source-Contrastive and Language-Contrastive Deco…☆38Aug 29, 2025Updated 6 months ago
- [ECCV 2022] MORE: Multi-Order RElation Mining for Dense Captioning in 3D Scenes official implementation☆16Feb 2, 2023Updated 3 years ago
- Tools for formatting WMT hypothesis and test sets in XML☆27Apr 18, 2025Updated 11 months ago
- 石蒜摇摇乐 VS Code extension☆13Aug 31, 2022Updated 3 years ago
- Learning to Retrieve by Trying - Source code for Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval☆52Oct 31, 2024Updated last year
- GPULZ: Optimizing LZSS Lossless Compression for Multi-byte Data on Modern GPUs☆16Apr 18, 2025Updated 11 months ago
- Unofficial implementation of paper "InstructionNER: A Multi-Task Instruction-Based Generative Framework for Few-shot NER" (https://arxiv.…☆38Feb 14, 2024Updated 2 years ago
- ☆119Mar 18, 2026Updated last week
- A small RAG project built on the LangChain framework☆35May 22, 2024Updated last year
- ☆136May 29, 2025Updated 9 months ago
- Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL.☆29Mar 11, 2025Updated last year
- [EMNLP 2023] TESTA: Temporal-Spatial Token Aggregation for Long-form Video-Language Understanding☆49Jan 9, 2024Updated 2 years ago
- Implementation of PALI3 from the paper "PALI-3 VISION LANGUAGE MODELS: SMALLER, FASTER, STRONGER"☆146Mar 13, 2026Updated 2 weeks ago
- Like ARC, but code to generate visual puzzles. 1D puzzles first.☆22Aug 17, 2024Updated last year
- ☆23Aug 17, 2024Updated last year
- 🔍 Awesome Agentic Search is a curated list of papers, tools, and resources on agentic search—where AI agents plan, search, and reason to…☆55Aug 28, 2025Updated 6 months ago
- ☆18Nov 3, 2025Updated 4 months ago