Nikityyy / lille
A powerful 130-million-parameter model trained from scratch as part of a truly open-source stack, including a custom tokenizer, dataset, and optimizer.
☆62Updated 2 weeks ago
Alternatives and similar repositories for lille
Users who are interested in lille are comparing it to the repositories listed below
- Code to train and evaluate Neural Attention Memory Models to obtain universally-applicable memory systems for transformers.☆322Updated 10 months ago
- Video+code lecture on building nanoGPT from scratch☆69Updated last year
- Exploration into the proposed architecture from Sapient Intelligence of Singapore 🇸🇬☆63Updated last month
- GRadient-INformed MoE☆264Updated 11 months ago
- Implementation of the Snake game based on a diffusion model☆91Updated 8 months ago
- Testing LLM reasoning abilities with family relationship quizzes.☆63Updated 7 months ago
- ☆134Updated last year
- Guaranteed Structured Output from any Language Model via Hierarchical State Machines☆146Updated 3 months ago
- klmbr - a prompt pre-processing technique to break through the barrier of entropy while generating text with LLMs☆80Updated 11 months ago
- Live-bending a foundation model’s output at neural network level.☆265Updated 5 months ago
- smolLM with Entropix sampler on pytorch☆150Updated 10 months ago
- Basic world models☆23Updated last week
- Generates control vectors for use with llama.cpp in GGUF format.☆31Updated 6 months ago
- A GPT with self-similar nested properties☆21Updated last year
- look how they massacred my boy☆64Updated 11 months ago
- Fast parallel LLM inference for MLX☆217Updated last year
- Multi-Agent Step Race Benchmark: Assessing LLM Collaboration and Deception Under Pressure. A multi-player “step-race” that challenges LLM…☆73Updated 3 weeks ago
- Transplants vocabulary between language models, enabling the creation of draft models for speculative decoding WITHOUT retraining.☆42Updated last week
- An easy-to-understand framework for LLM samplers that rewind and revise generated tokens☆146Updated 6 months ago
- realtime latent world model inference demo☆47Updated 10 months ago
- ☆73Updated 3 months ago
- ☆31Updated 5 months ago
- ☆133Updated 4 months ago
- Repository to create traveling waves that integrate spatial information through time☆55Updated last month
- DeMo: Decoupled Momentum Optimization☆190Updated 9 months ago
- A little(lil) Language Model (LM). A tiny reproduction of LLaMA 3's model architecture.☆52Updated 4 months ago
- Clue inspired puzzles for testing LLM deduction abilities☆41Updated 5 months ago
- ☆30Updated 11 months ago
- PyTorch implementation of models from the Zamba2 series.☆185Updated 7 months ago
- Plotting (entropy, varentropy) for small LMs☆98Updated 3 months ago