CohleM / lilLM
A little (lil) Language Model (LM)
☆47Updated 3 weeks ago
Alternatives and similar repositories for lilLM:
Users who are interested in lilLM are comparing it to the libraries listed below:
- Testing LLM reasoning abilities with family relationship quizzes.☆62Updated 2 months ago
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆138Updated last month
- Video+code lecture on building nanoGPT from scratch☆66Updated 9 months ago
- ☆126Updated 7 months ago
- ☆46Updated last month
- AI management tool☆113Updated 4 months ago
- idea: https://github.com/nyxkrage/ebook-groupchat/☆86Updated 7 months ago
- ☆201Updated 10 months ago
- Micro Llama is a small Llama-based model with 300M parameters, trained from scratch on a $500 budget☆145Updated last year
- 1.58-bit LLaMa model☆82Updated 11 months ago
- Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.☆196Updated 8 months ago
- An efficient implementation of the method proposed in "The Era of 1-bit LLMs"☆155Updated 5 months ago
- ☆112Updated 6 months ago
- Train your own SOTA deductive reasoning model☆81Updated 3 weeks ago
- A compact LLM pretrained in 9 days by using high quality data☆303Updated 4 months ago
- Dataset Crafting w/ RAG/Wikipedia ground truth and Efficient Fine-Tuning Using MLX and Unsloth. Includes configurable dataset annotation …☆177Updated 8 months ago
- A pipeline parallel training script for LLMs.☆136Updated last week
- Train your own small bitnet model☆65Updated 5 months ago
- Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.☆117Updated this week
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs☆71Updated 6 months ago
- Low-Rank adapter extraction for fine-tuned transformers models☆171Updated 10 months ago
- llama.cpp fork with additional SOTA quants and improved performance☆222Updated this week
- Kosmos-2.5 is a cutting-edge Multimodal-LLM (MLLM) specializing in image OCR. However, its stringent software requirements & Python-scrip…☆59Updated 8 months ago
- ☆83Updated 3 months ago
- model activation visualiser☆90Updated this week
- Run multiple resource-heavy Large Models (LM) on the same machine with a limited amount of VRAM/other resources by exposing them on differe…☆55Updated last month
- Experimenting with small language models☆64Updated last year
- Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free☆230Updated 4 months ago
- a simplified version of Google's Gemma model to be used for learning☆24Updated last year
- 1.58 Bit LLM on Apple Silicon using MLX☆194Updated 10 months ago