Cohere-Labs-Community / llm-profiling-toolkit
☆20 Updated last year
Alternatives and similar repositories for llm-profiling-toolkit
Users who are interested in llm-profiling-toolkit are comparing it to the libraries listed below.
- Code for the paper "Fishing for Magikarp" ☆162 Updated 3 months ago
- Pre-train Static Word Embeddings ☆85 Updated 2 months ago
- Minimal PyTorch implementation of BM25 (with sparse tensors) ☆104 Updated last year
- Efficiently computing & storing token n-grams from large corpora ☆26 Updated 10 months ago
- Storing long contexts in tiny caches with self-study ☆124 Updated this week
- Documentation effort for the BookCorpus dataset ☆34 Updated 4 years ago
- An introduction to LLM Sampling ☆79 Updated 8 months ago
- Utilities for loading and running text embeddings with ONNX ☆44 Updated last year
- Thorn in a HaizeStack test for evaluating long-context adversarial robustness. ☆26 Updated last year
- A toolkit implementing advanced methods to transfer models and model knowledge across tokenizers. ☆41 Updated last month
- Supercharge Hugging Face transformers with model parallelism. ☆77 Updated 3 weeks ago
- Experiments for efforts to train a new and improved T5 ☆76 Updated last year
- 📝 Reference-free automatic summarization evaluation with potential hallucination detection ☆101 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆65 Updated last year
- Sphynx Hallucination Induction ☆53 Updated 6 months ago
- NLP with Rust for Python 🦀🐍 ☆64 Updated 3 months ago
- Library for fast text representation and classification. ☆31 Updated last year
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated 2 years ago
- ☆61 Updated last week
- ☆49 Updated 6 months ago
- Training and evaluation code for the paper "Headless Language Models: Learning without Predicting with Contrastive Weight Tying" (https:/… ☆27 Updated last year
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- Lossily compress representation vectors using product quantization ☆58 Updated 3 months ago
- Python library to use Pleias-RAG models ☆61 Updated 3 months ago
- A library for squeakily cleaning and filtering language datasets. ☆47 Updated 2 years ago
- ☆67 Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆88 Updated last year
- ☆136 Updated 4 months ago
- Simple GRPO scripts and configurations. ☆59 Updated 6 months ago
- Your buddy in the (L)LM space. ☆64 Updated 10 months ago