seanzhang-zhichen / baichuan-Dynamic-NTK-ALiBi
Code implementation of Dynamic NTK-ALiBi for Baichuan: inference on longer texts without fine-tuning
☆49Aug 27, 2023Updated 2 years ago
Alternatives and similar repositories for baichuan-Dynamic-NTK-ALiBi
Users that are interested in baichuan-Dynamic-NTK-ALiBi are comparing it to the libraries listed below
- NTK scaled version of ALiBi position encoding in Transformer.☆69Aug 16, 2023Updated 2 years ago
- This tool (enhance_long) aims to enhance the long-context extrapolation capability of Llama 2 at the lowest possible cost, preferably without …☆45Nov 30, 2023Updated 2 years ago
- LongQLoRA: Extend Context Length of LLMs Efficiently☆168Nov 12, 2023Updated 2 years ago
- Implementation of the LDP module block in PyTorch and Zeta from the paper: "MobileVLM: A Fast, Strong and Open Vision Language Assistant …☆15Mar 11, 2024Updated last year
- Simple Implementation of TinyGPTV in super simple Zeta lego blocks☆16Nov 11, 2024Updated last year
- English or Chinese GPT2Dialog model from GPT2-chitchat☆12Feb 23, 2020Updated 5 years ago
- Realtime segmentation with ENet, the fast and accurate segmentation net.☆14Dec 7, 2018Updated 7 years ago
- LLaVA combined with the Magvit image tokenizer, training an MLLM without a vision encoder. Unifying image understanding and generation.☆39Jun 20, 2024Updated last year
- A tool for manually annotating and ranking response data in the RLHF stage of large models.☆256Aug 1, 2023Updated 2 years ago
- Implementation of "Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation"☆21Jul 31, 2023Updated 2 years ago
- Explained GPT-2 Transformer model step by step with code.☆17May 8, 2020Updated 5 years ago
- Official code of *Virgo: A Preliminary Exploration on Reproducing o1-like MLLM*☆20May 27, 2025Updated 8 months ago
- Web application for real-time object detection 🔎 using Flask 🌶, OpenCV, and YoloV3 weights. It uses the COCO Dataset 🖼.☆16Apr 19, 2021Updated 4 years ago
- ☆23Jan 8, 2024Updated 2 years ago
- ICLR 2022☆18Apr 15, 2022Updated 3 years ago
- ☆111Jan 8, 2025Updated last year
- Code for "Cross-Domain Sentiment Classification with Contrastive Learning and Mutual Information Maximization" (ICASSP 2021)☆21Apr 20, 2022Updated 3 years ago
- BlockchainGPT: An intuitive, chat-based platform to manage your blockchain environments using natural language processing capabilities.☆11Jul 6, 2023Updated 2 years ago
- Top-2 solution for intelligent-manufacturing industrial AI☆20Aug 22, 2018Updated 7 years ago
- ☆20Jan 6, 2023Updated 3 years ago
- ☆84Sep 9, 2023Updated 2 years ago
- Extractive machine reading comprehension based on PyTorch + BERT☆21Dec 8, 2022Updated 3 years ago
- ☆23Mar 9, 2023Updated 2 years ago
- Modality Gap–Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models☆48Feb 11, 2026Updated last week
- Deep learning fundamentals study notes and work projects☆23Mar 6, 2018Updated 7 years ago
- SMP 2018 user profiling technology evaluation☆21Jul 17, 2018Updated 7 years ago
- Scripts for use with LongCLIP, including fine-tuning Long-CLIP☆63Mar 11, 2025Updated 11 months ago
- A Generative Dialogue State Tracking Model☆22Jun 24, 2021Updated 4 years ago
- 🔨🔨🔨 Tool for building model training datasets☆20Nov 1, 2024Updated last year
- ☆147Apr 16, 2024Updated last year
- CIKM AnalytiCup 2018 – AliMe chatbot cross-lingual short-text matching algorithm competition – Rank 12 solution☆54Aug 1, 2018Updated 7 years ago
- ☆62Jun 17, 2024Updated last year
- Reproduction of the complete process of DeepSeek-R1 on small-scale models, including Pre-training, SFT, and RL.☆29Mar 11, 2025Updated 11 months ago
- An Experiment on Dynamic NTK Scaling RoPE☆64Nov 26, 2023Updated 2 years ago
- [NeurIPS 2025] Elevating Visual Perception in Multimodal LLMs with Visual Embedding Distillation☆71Oct 17, 2025Updated 4 months ago
- ☆72Nov 24, 2025Updated 2 months ago
- This project aims to collect and collate various datasets for multimodal large model training, including but not limited to pre-training …☆68May 7, 2025Updated 9 months ago
- [EMNLP 2019] Scalable and Accurate Dialogue State Tracking via Hierarchical Sequence Generation☆30May 22, 2023Updated 2 years ago
- Domain-specific lexicon construction / Chinese new word discovery / domain lexicon discovery☆31Jan 10, 2020Updated 6 years ago