malaysia-ai / pretrain-text-dataset
Prepare a pretraining dataset for the Malaysian context.
☆11 Updated 2 months ago
Related projects
Alternatives and complementary repositories for pretrain-text-dataset
- Implementation of the DocLLM paper for Llama models. ☆12 Updated 2 weeks ago
- Fine-tune Mistral 7B to generate fashion style suggestions ☆31 Updated 10 months ago
- Deploy PyTorch models to production via panini ☆10 Updated 5 years ago
- ☆21 Updated 7 months ago
- Document Image Classification ☆11 Updated 6 years ago
- 🦖 Streamlined Recommender Systems with TensorFlow and KubeFlow ☆18 Updated last year
- ☆19 Updated 3 years ago
- ☆29 Updated last year
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed ☆34 Updated last year
- ☆29 Updated last year
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod… ☆15 Updated last year
- ☆14 Updated 5 months ago
- An implementation of bidirectional LSTM-CRF for Named Entity Relationship on custom corpus with custom word embeddings ☆13 Updated 5 years ago
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 Updated 3 weeks ago
- ☆14 Updated last year
- Visual similarity search engine demo with use of PyTorch Metric Learning and Qdrant ☆10 Updated last year
- LLaMA implementation for HuggingFace Transformers ☆39 Updated last year
- Reward Model framework for LLM RLHF ☆58 Updated last year
- ☆22 Updated 8 months ago
- A collection of building blocks for building fine-tunable metric learning models ☆32 Updated last month
- This repository contains all code examples for my TensorFlow World talk about "Advanced model deployments with TensorFlow Serving" ☆17 Updated last year
- Source codes for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 Updated last year
- Fast whitespace correction with Transformers ☆14 Updated 6 months ago
- Open sourced backend for Martian's LLM Inference Provider Leaderboard ☆17 Updated 2 months ago
- ☆37 Updated last year
- An unofficial Implementation of DocParser: End-to-end OCR-free Information Extraction from Visually Rich Documents ☆33 Updated last year
- ☆16 Updated 3 years ago
- Hugging Face RoBERTa with Flash Attention 2 ☆19 Updated last year
- ☆73 Updated 10 months ago