a lightweight C++ LLaMA inference engine for mobile devices
☆15Oct 28, 2023Updated 2 years ago
Alternatives and similar repositories for mobilellama
Users who are interested in mobilellama are comparing it to the libraries listed below.
- Set-Encoder: Permutation-Invariant Inter-Passage Attention for Listwise Passage Re-Ranking with Cross-Encoders☆19May 23, 2025Updated last year
- TUI client for SVT's text-tv (teletext), written in Go.☆23May 28, 2021Updated 5 years ago
- ☆35Oct 9, 2025Updated 7 months ago
- [ICLR'25] "Attention in Large Language Models Yields Efficient Zero-Shot Re-Rankers"☆43Mar 31, 2025Updated last year
- The first spoken long-text dataset derived from live streams, designed to reflect the redundancy-rich and conversational nature of real-w…☆12Jun 28, 2025Updated 11 months ago
- Cloud Native Distributed Nearest Neighbour Search☆15Jun 9, 2020Updated 5 years ago
- Unsupervised Word Discovery☆10Jul 26, 2019Updated 6 years ago
- Repository for the Q-Filters method (https://arxiv.org/pdf/2503.02812)☆34Mar 7, 2025Updated last year
- [ICML 2024] Official Repository for the paper "Transformers Get Stable: An End-to-End Signal Propagation Theory for Language Models"☆10Jul 19, 2024Updated last year
- Enable RNNLM lattice rescoring with Pytorch [kaldi]☆12Jun 5, 2020Updated 5 years ago
- rabitq rust implementation☆11May 14, 2026Updated 2 weeks ago
- Package vecf32 provides common functions and methods for slices of float32☆13Jun 14, 2023Updated 2 years ago
- Repository for Skill Set Optimization☆14Jul 26, 2024Updated last year
- Codebase for Efficient yet simple Reinforcement Learning Research Framework☆28Jan 14, 2023Updated 3 years ago
- mPLM-Sim: Better Cross-Lingual Similarity and Transfer in Multilingual Pretrained Language Models☆11Jan 19, 2024Updated 2 years ago
- Notebooks for docarray, Jina, Finetuner, and other products from Jina AI☆12Mar 31, 2022Updated 4 years ago
- TSDG: An efficient index graph for graph-based nearest neighbor search☆10Jul 14, 2022Updated 3 years ago
- Latent Large Language Models☆19Aug 24, 2024Updated last year
- A PyTorch implementation of Proxy Anchor Loss based on CVPR 2020 paper "Proxy Anchor Loss for Deep Metric Learning"☆11Jan 16, 2021Updated 5 years ago
- Server wrapper for ml models☆11Sep 11, 2019Updated 6 years ago
- Semi-supervised spoken language understanding (SLU) via self-supervised speech and language model pretraining☆12Mar 23, 2021Updated 5 years ago
- DPO, but faster 🚀☆52Dec 6, 2024Updated last year
- Deep Learning for Video Retrieval by Natural Language☆11Oct 20, 2019Updated 6 years ago
- A Framework for Evaluating AI Agent Safety in Realistic Environments☆37Oct 2, 2025Updated 8 months ago
- High performance implementation of the WARP (SIGIR'25) retrieval engine.☆34May 21, 2026Updated last week
- ☆10Mar 13, 2023Updated 3 years ago
- ☆12Jun 13, 2025Updated 11 months ago
- CLaMR: Contextualized Late-Interaction for Multimodal Content Retrieval☆25Jun 28, 2025Updated 11 months ago
- E-book on an industrial-grade Chinese speech recognition system☆13Oct 30, 2020Updated 5 years ago
- Simple Tensorflow implementation of "SDIT: Scalable and Diverse Cross-domain Image Translation" (ACM-MM 2019)☆16Oct 14, 2019Updated 6 years ago
- [NAACL'25] "Revealing the Barriers of Language Agents in Planning"☆13Jun 22, 2025Updated 11 months ago
- code release☆28May 7, 2026Updated 3 weeks ago
- ☆15May 15, 2021Updated 5 years ago
- ☆16Feb 6, 2024Updated 2 years ago
- Seminar: intro to deep learning with tensorflow☆13Jun 27, 2017Updated 8 years ago
- ☆16Sep 4, 2025Updated 8 months ago
- ☆18Apr 3, 2023Updated 3 years ago
- gRPC server for hnswlib☆16Mar 6, 2023Updated 3 years ago
- ☆18Mar 2, 2026Updated 3 months ago