Fast approximate inference on a single GPU with sparsity-aware offloading
☆39Jan 4, 2024Updated 2 years ago
Alternatives and similar repositories for tricksy
Users that are interested in tricksy are comparing it to the libraries listed below.
- Karpathy's llama2.c transpiled to MLX for Apple Silicon☆14Dec 28, 2023Updated 2 years ago
- ☆53Jan 18, 2024Updated 2 years ago
- This is an ongoing project of mine that generates audiobooks from a book input and uses a different actor for each character in the book☆16Nov 28, 2023Updated 2 years ago
- Stop messing around with finicky sampling parameters and just use DRµGS!☆364Jun 1, 2024Updated last year
- Just a bunch of benchmark logs for different LLMs☆125Jul 28, 2024Updated last year
- This is our own implementation of 'Layer Selective Rank Reduction'☆240May 26, 2024Updated last year
- Collection of autoregressive model implementations☆85Feb 23, 2026Updated 3 months ago
- The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction☆392Jul 9, 2024Updated last year
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models☆39Feb 27, 2024Updated 2 years ago
- Code for the paper "Function-Space Learning Rates"☆24Jun 3, 2025Updated 11 months ago
- BH hackathon☆14Apr 4, 2024Updated 2 years ago
- Let's create synthetic textbooks together :)☆76Jan 29, 2024Updated 2 years ago
- Chrome Extension for YouTube. Acts as an assistant for the YouTube video you are watching☆23Apr 26, 2023Updated 3 years ago
- QLoRA for Masked Language Modeling☆24Sep 11, 2023Updated 2 years ago
- Code for paper: "QuIP: 2-Bit Quantization of Large Language Models With Guarantees"☆397Feb 24, 2024Updated 2 years ago
- ☆14Jan 24, 2023Updated 3 years ago
- aigc evals☆10Dec 2, 2023Updated 2 years ago
- Deita: Data-Efficient Instruction Tuning for Alignment [ICLR2024]☆596Dec 9, 2024Updated last year
- ☆18Feb 20, 2024Updated 2 years ago
- A framework for pitting LLMs against each other in an evolving library of games ⚔☆35Apr 17, 2025Updated last year
- A multimodal, function calling powered LLM webui.☆213Sep 23, 2024Updated last year
- 🚀 Automatically convert unstructured data into a high-quality 'textbook' format, optimized for fine-tuning Large Language Models (LLMs)☆26Oct 15, 2023Updated 2 years ago
- Example Fabulous app that uses MSAL to authenticate a user on Azure Active Directory☆11Dec 8, 2022Updated 3 years ago
- [NAACL 2025 Main Selected Oral] Repository for the paper: Prompt Compression for Large Language Models: A Survey☆37May 18, 2025Updated last year
- ☆607Aug 23, 2024Updated last year
- This repository contains TA sessions work for the Machine Learning course, Aug '18 - Dec '18.☆11Nov 17, 2018Updated 7 years ago
- ☆14Jul 13, 2025Updated 10 months ago
- ☆35Feb 10, 2025Updated last year
- Ultra Fast Multi-Modality Vector Database☆18Feb 21, 2024Updated 2 years ago
- Client-server chat app that translates messages based on chosen languages via a simple map (plain, without an ML model)☆14Jul 18, 2024Updated last year
- This is a simple guide to help you build an Anthropic Claude Sonnet 3.5 chatbot interface with Gradio☆12Jun 23, 2024Updated last year
- ☆234Jun 11, 2024Updated last year
- This is the example project that is referenced in the Technical Article: Continuous Integration for Verification of Simulink Models. Mode…☆16Jun 23, 2022Updated 3 years ago
- Generate Synthetic Data Using OpenAI, MistralAI or AnthropicAI☆222Apr 29, 2024Updated 2 years ago
- an implementation of Self-Extend, to expand the context window via grouped attention☆119Jan 7, 2024Updated 2 years ago
- Minimalistic batching application for LLMs using ASP.NET Core and LLamaSharp☆12Oct 23, 2024Updated last year
- ☆15Nov 2, 2022Updated 3 years ago
- Multi-Layer Key-Value sharing experiments on Pythia models☆34Jun 14, 2024Updated last year
- ☆12Sep 16, 2024Updated last year