Tencent-Hunyuan / Tencent-Hunyuan-7B-0124
☆28Updated 4 months ago
Alternatives and similar repositories for Tencent-Hunyuan-7B-0124
Users who are interested in Tencent-Hunyuan-7B-0124 are comparing it to the repositories listed below
- ☆93Updated 7 months ago
- 🌟Official code of our AAAI26 paper 🔍WebFilter☆33Updated last month
- ☆98Updated 4 months ago
- ☆25Updated 4 months ago
- A Framework for Decoupling and Assessing the Capabilities of VLMs☆43Updated last year
- ☆88Updated 6 months ago
- ☆52Updated last year
- The official repository of "R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Integration"☆129Updated 3 months ago
- imagetokenizer is a Python package that helps you encode visuals and generate visual token IDs from a codebook; supports both image and video…☆37Updated last year
- SELF-GUIDE: Better Task-Specific Instruction Following via Self-Synthetic Finetuning. COLM 2024 Accepted Paper☆33Updated last year
- ☆75Updated last year
- XVERSE-MoE-A36B: A multilingual large language model developed by XVERSE Technology Inc.☆38Updated last year
- [ICML 2025] TokenSwift: Lossless Acceleration of Ultra Long Sequence Generation☆119Updated 7 months ago
- The code for paper: Decoupled Planning and Execution: A Hierarchical Reasoning Framework for Deep Search☆63Updated 5 months ago
- The SAIL-VL2 series model developed by the BytedanceDouyinContent Group☆75Updated 3 months ago
- [ACL2025 Findings] Benchmarking Multihop Multimodal Internet Agents☆47Updated 9 months ago
- Official Implementation of APB (ACL 2025 main Oral)☆32Updated 9 months ago
- FuseAI Project☆87Updated 10 months ago
- [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization☆43Updated 9 months ago
- Official Implementation of Flash-Searcher: Fast and Effective Web Agents via DAG-Based Parallel Execution☆52Updated last week
- Code for KaLM-Embedding models☆104Updated 5 months ago
- SkyScript-100M: 1,000,000,000 Pairs of Scripts and Shooting Scripts for Short Drama: https://arxiv.org/abs/2408.09333v2☆130Updated last year
- ☆39Updated 5 months ago
- [NAACL 2025] Representing Rule-based Chatbots with Transformers☆23Updated 10 months ago
- Klear-Reasoner: Advancing Reasoning Capability via Gradient-Preserving Clipping Policy Optimization☆80Updated 2 months ago
- ☆29Updated last year
- Copies the MLP of Llama 3 8 times as 8 experts, creates a router with random initialization, and adds a load-balancing loss to construct an 8x8b Mo…☆27Updated last year
- Fast LLM training codebase with dynamic strategy choosing [Deepspeed+Megatron+FlashAttention+CudaFusionKernel+Compiler]☆41Updated last year
- Mixture-of-Basis-Experts for Compressing MoE-based LLMs☆24Updated 3 months ago
- WideSearch: Benchmarking Agentic Broad Info-Seeking☆106Updated 2 months ago