marepilc / pink-parquet

User-friendly viewer for Parquet files

☆9

Alternatives and similar repositories for pink-parquet

Users who are interested in pink-parquet compare it to the repositories listed below.

Pringled / korok
Lightweight Hybrid Search and Reranking
☆10 Updated 4 months ago
schwartz-lab-NLP / Tokens2Words
☆12 Updated 3 months ago
benkaiser / llm-compare
Nodejs script to run an LLM prompt across a bunch of models.
☆9 Updated 6 months ago
stefan-it / modern-bert-ner
My NER Experiments with ModernBERT and Ettin
☆21 Updated this week
kaistAI / factual-knowledge-acquisition
☆21 Updated 2 months ago
enjalot / latent-sae
Training code for Sparse Autoencoders on Embedding models
☆38 Updated 4 months ago
shaharl6000 / MoreDocsSameLen
This repository contains code and datasets for our paper on the effects of document multiplicity while the context size is fixed in Retri…
☆15 Updated 4 months ago
pchizhov / picky_bpe
BPE modification that implements removing of the intermediate tokens during tokenizer training.
☆24 Updated 7 months ago
flairNLP / familiarity
Label shift estimation for transfer difficulty with Familiarity.
☆10 Updated 5 months ago
facebookresearch / DIG-In
This library supports evaluating disparities in generated image quality, diversity, and consistency between geographic regions.
☆20 Updated last year
hbseong97 / HarmAug
HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models
☆12 Updated 4 months ago
castorini / hf-spacerini
Plug-and-play Search Interfaces with Pyserini and Hugging Face
☆32 Updated last year
megagonlabs / holobench
🫧 Code for Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data (Maekawa*, Iso* et al.…
☆12 Updated 4 months ago
MeLeLBGU / SaGe
Code for SaGe subword tokenizer (EACL 2023)
☆25 Updated 7 months ago
huggingface / ember
ANE accelerated embedding models!
☆18 Updated 7 months ago
joonspk-research / gabm-stanford-cs222
☆11 Updated 9 months ago
amirrezasalimi / friday-agents
Friday Agents. App: https://chat.toolstack.run/
☆11 Updated 7 months ago
mlabonne / chessllm
☆38 Updated last year
forecastingresearch / forecastbench-datasets
Forecastbench Datasets, updated nightly
☆12 Updated this week
Aleph-Alpha-Research / trigrams
☆56 Updated 2 months ago
kuzudb / langchain-kuzu
LangChain-Kuzu integration
☆10 Updated 3 months ago
microsoft / REBEL
☆40 Updated 2 months ago
stair-lab / mlhp
☆10 Updated 2 weeks ago
llm-factory / Distill-Factory
A tool for generating datasets from documents
☆12 Updated 3 months ago
AI-Hypercomputer / tpu-recipes
☆38 Updated this week
bigcode-project / bigcode-tokenizer
☆15 Updated last year
sfeucht / footprints
https://footprints.baulab.info
☆17 Updated 9 months ago
kensho-technologies / pathpiece
PathPiece tokenizer
☆12 Updated 8 months ago
dtch1997 / steering-bench
Official codebase for "Analyzing the Generalization and Reliability of Steering Vectors"
☆14 Updated 7 months ago
orevaahia / magnet-tokenization
☆12 Updated 7 months ago