pacman100 / openhathi_instruct

This repository contains the code for dataset curation and fine-tuning of the instruct variant of the bilingual OpenHathi model. The resulting model is meant to follow instructions and chat in Hindi and Hinglish.

☆23

Alternatives and similar repositories for openhathi_instruct

Users who are interested in openhathi_instruct are comparing it to the repositories listed below.

tcapelle / llm_recipes
A set of scripts and notebooks on LLM fine-tuning and dataset creation
☆110Updated 10 months ago
Locutusque / TPU-Alignment
Fully fine-tune large models like Mistral, Llama-2-13B, or Qwen-14B completely for free
☆232Updated 9 months ago
TeluguLLMLabs / Indic-gemma-7b-Navarasa
Repository for fine-tuning Gemma models using Unsloth for Indic languages
☆95Updated last year
AI4Bharat / IndicLLMSuite
A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages
☆107Updated 9 months ago
PrithivirajDamodaran / Route0x
Low latency, High Accuracy, Custom Query routers for Humans and Agents. Built by Prithivi Da
☆111Updated 4 months ago
AI4Bharat / IndicInstruct
Code repository for "Introducing Airavata: Hindi Instruction-tuned LLM"
☆60Updated 9 months ago
muellerzr / minimal-trainer-zoo
Minimal example scripts of the Hugging Face Trainer, focused on staying under 150 lines
☆197Updated last year
VarunGumma / IndicTransToolkit
A simple, consistent and extendable toolkit for IndicTrans2. (Pypi: https://pypi.org/project/indictranstoolkit)
☆34Updated last week
ayulockin / neurips-llm-efficiency-challenge
Starter pack for NeurIPS LLM Efficiency Challenge 2023.
☆125Updated last year
adithya-s-k / indic_eval
A lightweight evaluation suite tailored specifically for assessing Indic LLMs across a diverse range of tasks
☆37Updated last year
daniel-furman / sft-demos
Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.
☆77Updated 9 months ago
premAI-io / benchmarks
🕹️ Performance Comparison of MLOps Engines, Frameworks, and Languages on Mainstream AI Models.
☆137Updated last year
center-for-humans-and-machines / transformer-heads
Toolkit for attaching, training, saving and loading of new heads for transformer models
☆284Updated 5 months ago
rashmimarganiatgithub / LLMS_Library_2023
LLM_library is a comprehensive repository that serves as a one-stop resource for hands-on code and insightful summaries.
☆69Updated last year
deshwalmahesh / PHUDGE
Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute…
☆49Updated last year
huggingface / competitions
☆124Updated 9 months ago
arcee-ai / DALM
Domain Adapted Language Modeling Toolkit - E2E RAG
☆325Updated 8 months ago
rasbt / LLM-finetuning-scripts
☆205Updated last year
AblateIt / finetune-study
Comprehensive analysis of the differences in performance between QLoRA, LoRA, and full fine-tunes.
☆82Updated last year
huggingface / llm-swarm
Manage scalable open LLM inference endpoints in Slurm clusters
☆268Updated last year
jina-ai / correlations
Simple UI for debugging correlations of text embeddings
☆288Updated 2 months ago
merveenoyan / awesome-osml-for-devs
List of resources, libraries and more for developers who would like to build with open-source machine learning off-the-shelf
☆199Updated last year
anyscale / e2e-llm-workflows
Fine-tune an LLM to perform batch inference and online serving.
☆112Updated 2 months ago
huggingface / data-is-better-together
Let's build better datasets, together!
☆260Updated 7 months ago
abacaj / train-with-fsdp
☆93Updated last year
cohere-ai / DiskVectorIndex
☆211Updated last month
mixedbread-ai / batched
The Batched API provides a flexible and efficient way to process multiple requests in a batch, with a primary focus on dynamic batching o…
☆142Updated 3 weeks ago
cohere-ai / BinaryVectorDB
Efficient vector database for hundreds of millions of embeddings.
☆207Updated last year
BhabhaAI / dataformer
Solving data for LLMs - Create quality synthetic datasets!
☆150Updated 6 months ago
IlyasMoutawwakil / py-txi
A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.
☆33Updated 2 months ago