FFengIll / embedding.cpp

ggml implementation of BERT Embedding

☆24

Related projects

Alternatives and complementary repositories for embedding.cpp

iamlemec / bert.cpp
GGML implementation of BERT model with Python bindings and quantization.
☆51 Updated 8 months ago
xyzhang626 / embeddings.cpp
ggml implementation of embedding models including SentenceTransformer and BGE
☆52 Updated 10 months ago
jiamingkong / rwkv_reward
Training a reward model for RLHF using RWKV.
☆14 Updated last year
ashvardanian / usearch-images
Semantic search demo featuring UForm, USearch, UCall, and Streamlit, to visualize and retrieve from image datasets, similar to "CLIP Retriev…
☆39 Updated 10 months ago
OpenBuddy / GrandSage
☆16 Updated 5 months ago
hscspring / llama.np
Inference of Llama/Llama2 models in NumPy
☆19 Updated 11 months ago
RobinQu / instinct.cpp
instinct.cpp provides ready-to-use alternatives to the OpenAI Assistant API and built-in utilities for developing AI agent applications (RAG,…
☆37 Updated 4 months ago
nyunAI / PruneGPT
☆52 Updated 5 months ago
ikawrakow / ik_llama.cpp
llama.cpp fork with additional SOTA quants and improved performance
☆86 Updated this week
the-crypt-keeper / ggml-downloader
Simple, fast, parallel Hugging Face GGML model downloader written in Python
☆24 Updated last year
bentoml / sentence-embedding-bento
Sentence Embedding as a Service
☆14 Updated last year
antirez / LLM-FTC-sampling
First token cutoff sampling inference example
☆28 Updated 9 months ago
zhuzilin / faster-nougat
Implementation of Nougat that focuses on processing PDFs locally.
☆73 Updated 6 months ago
limcheekin / open-text-embeddings
Open Source Text Embedding Models with OpenAI Compatible API
☆131 Updated 3 months ago
gpustack / llama-box
LLM inference server implementation based on llama.cpp.
☆25 Updated this week
cwhy / rwkv-decon
Trying to deconstruct RWKV in understandable terms
☆14 Updated last year
ggerganov / bark.cpp
Port of Suno AI's Bark in C/C++ for fast inference
☆54 Updated 6 months ago
cjpais / whisperfile
☆53 Updated 2 months ago
nomic-ai / kompute
General purpose GPU compute framework built on Vulkan to support 1000s of cross vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). …
☆41 Updated last month
sdan / selfextend
An implementation of Self-Extend, which expands the context window via grouped attention
☆118 Updated 10 months ago
vespa-engine / pyvespa
Python API for https://vespa.ai, the open big data serving engine
☆101 Updated this week
cahya-wirawan / rwkv-tokenizer
A fast RWKV Tokenizer written in Rust
☆36 Updated 2 months ago
iboB / git-lfs-download
Download full or partial git-lfs repos without temporarily using 2x disk space
☆30 Updated last year
wozeparrot / tinyrwkv
tinygrad port of the RWKV large language model.
☆43 Updated 4 months ago
Dan-wanna-M / formatron
Formatron empowers everyone to control the format of language models' output with minimal overhead.
☆152 Updated this week
nexusflowai / nexusraven-pip
☆37 Updated 11 months ago
LLM360 / crystalcoder-data-prep
Data preparation code for CrystalCoder 7B LLM
☆42 Updated 6 months ago
LLukas22 / llm-rs-python
Unofficial Python bindings for the Rust llm library. 🐍❤️🦀
☆73 Updated last year
PABannier / biogpt.cpp
Port of Microsoft's BioGPT in C/C++ using ggml
☆87 Updated 8 months ago