Structured inference with Llama 2 in your browser
☆52Nov 1, 2024Updated last year
Alternatives and similar repositories for ad-llama
Users who are interested in ad-llama are comparing it to the libraries listed below.
- An extension of TVMScript to write simple and high-performance GPU kernels with tensorcore.☆51Jul 23, 2024Updated last year
- A swarm of LLM agents that will help you test, document, and productionize your code!☆16Mar 30, 2026Updated 2 weeks ago
- An Attention Superoptimizer☆22Jan 20, 2025Updated last year
- javascript multivariate data visualization☆14Jan 10, 2017Updated 9 years ago
- Just-in-time Dynamic Batching with MXNet Gluon.☆52May 18, 2020Updated 5 years ago
- Home for OctoML PyTorch Profiler☆113Apr 24, 2023Updated 2 years ago
- Vercel and web-llm template to run wasm models directly in the browser.☆172Updated this week
- AgentParse is a high-performance parsing library designed to map various structured data formats (such as Pydantic models, JSON, YAML, an…☆18Oct 13, 2025Updated 6 months ago
- DocQues answers queries on longer and multiple documents, built on GPT-Index and GPT-3☆13Jan 1, 2023Updated 3 years ago
- A standalone GEMM kernel for fp16 activation and quantized weight, extracted from FasterTransformer☆97Feb 20, 2026Updated last month
- Simple, opinionated, JSON-typed, and traced LLM framework for TypeScript.☆37Mar 11, 2024Updated 2 years ago
- ☆192Mar 28, 2023Updated 3 years ago
- Host LLM via text-generation-inference☆16Dec 5, 2023Updated 2 years ago
- Experimental Vega Dataflow Visualization☆21Jul 28, 2016Updated 9 years ago
- Libraries, guides, blueprints, and sample code, to enable rapidly building 0-1 applications on iOS, Android and web.☆11May 12, 2023Updated 2 years ago
- OmniByteFormer is a generalized Transformer model that can process any type of data by converting it into byte sequences, bypassing tradi…☆15Updated this week
- Sublinear memory optimization for deep learning, reduce GPU memory cost to train deeper nets☆28Apr 22, 2016Updated 9 years ago
- A collaborative online editor.☆12Feb 2, 2025Updated last year
- YourAICHAT☆12Aug 16, 2023Updated 2 years ago
- ☆175Apr 2, 2026Updated 2 weeks ago
- SCREWS: A Modular Framework for Reasoning with Revisions☆27Sep 26, 2023Updated 2 years ago
- An experimental ahead of time compiler for Relay.☆49Apr 21, 2020Updated 5 years ago
- Slides from 2021-12-15 talk, "TVM Developer Bootcamp – Writing Hardware Backends"☆11Jan 20, 2022Updated 4 years ago
- TypeScript utilities for input validation, with emphasis on security☆19Jan 3, 2024Updated 2 years ago
- ☆26Dec 13, 2024Updated last year
- NewsAgent is an enterprise-grade news aggregation agent designed to fetch, query, and summarize news from multiple sources at scale.☆27Oct 13, 2025Updated 6 months ago
- ☆122Apr 22, 2024Updated last year
- python interface for mlc chat cli☆14May 7, 2023Updated 2 years ago
- Visualize TVM Relay program graph☆12Nov 19, 2019Updated 6 years ago
- ☆18Jan 9, 2018Updated 8 years ago
- A streamlit application to play with LLM Models☆20Sep 19, 2023Updated 2 years ago
- A curated list of resources related to structured generation 🔥☆23Jul 25, 2025Updated 8 months ago
- ☆14May 9, 2024Updated last year
- Python T-Digest Module☆10Aug 17, 2015Updated 10 years ago
- Material for HolyJS 2020 Moscow☆12Nov 28, 2020Updated 5 years ago
- Experimental scripts for researching data adaptive learning rate scheduling.☆22Oct 18, 2023Updated 2 years ago
- model.yaml is an open standard for defining crossplatform, composable AI models☆56Sep 9, 2025Updated 7 months ago
- chat with gpt models on treed-graph☆20May 23, 2025Updated 10 months ago
- PET: Optimizing Tensor Programs with Partially Equivalent Transformations and Automated Corrections☆126Jun 23, 2022Updated 3 years ago