tylerachang / goldfishLinks

Goldfish: Monolingual language models for 350 languages.

☆17

Alternatives and similar repositories for goldfish

Users that are interested in goldfish are comparing it to the libraries listed below

Sorting:

Knowledgator / FlashDeBERTa
Trully flash implementation of DeBERTa disentangled attention mechanism.
☆59Updated last month
nbroad1881 / strideformer
Using short models to classify long texts
☆21Updated 2 years ago
argilla-io / distilabel-spin-dibt
Repository containing the SPIN experiments on the DIBT 10k ranked prompts
☆24Updated last year
orevaahia / magnet-tokenization
☆12Updated 6 months ago
Leukas / CUTE
☆13Updated last month
pacman100 / peft-codegen-25
☆23Updated last year
salesforce / simplification
☆22Updated 5 months ago
google-research-datasets / swim-ir
SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la…
☆48Updated last year
facebookresearch / lss_eval
This is a new metric that can be used to evaluate faithfulness of text generated by LLMs. The work behind this repository can be found he…
☆31Updated last year
jaehunjung1 / cascaded-selective-evaluation
☆24Updated 4 months ago
castorini / hf-spacerini
Plug-and-play Search Interfaces with Pyserini and Hugging Face
☆32Updated last year
ContextualAI / CLAIR_and_APO
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
☆57Updated 9 months ago
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33Updated last month
malteos / clp-transfer
Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning
☆30Updated 2 years ago
arcee-ai / DAM
☆51Updated 7 months ago
cisnlp / GlotCC
🕸 GlotCC Dataset and Pipline -- NeurIPS 2024
☆19Updated 2 months ago
AnswerDotAI / ModernBERT-Instruct-mini-cookbook
☆47Updated 4 months ago
krypticmouse / matryoshka-representation-learning
PyTorch implementation for MRL
☆18Updated last year
luohongyin / EntST
Entailment self-training
☆25Updated 2 years ago
mungg / FABLES
☆57Updated 9 months ago
lilakk / BLEUBERI
Official repository for "BLEUBERI: BLEU is a surprisingly effective reward for instruction following"
☆23Updated 3 weeks ago
pchizhov / picky_bpe
BPE modification that implements removing of the intermediate tokens during tokenizer training.
☆25Updated 7 months ago
chandar-lab / NeoBERT
☆56Updated 3 weeks ago
IBM / model-recycling
Ranking of fine-tuned HF models as base models.
☆35Updated last month
orionw / FollowIR
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
☆44Updated 11 months ago
allenai / EmbeddingRecycling
Embedding Recycling for Language models
☆38Updated last year
allenai / infinigram-api
☆61Updated 3 weeks ago
EleutherAI / mdl
Minimum Description Length probing for neural network representations
☆18Updated 5 months ago
davanstrien / haiku-dpo
Using open source LLMs to build synthetic datasets for direct preference optimization
☆64Updated last year
ElleLeonne / Lightning-ReLoRA
A public implementation of the ReLoRA pretraining method, built on Lightning-AI's Pytorch Lightning suite.
☆33Updated last year