malaysia-ai / dataset
Recipes to prepare datasets!
☆12Updated this week
Alternatives and similar repositories for dataset:
Users who are interested in dataset are comparing it to the libraries listed below
- Supervised instruction finetuning for LLM with HF trainer and Deepspeed☆34Updated last year
- A Python wrapper around HuggingFace's TGI (text-generation-inference) and TEI (text-embedding-inference) servers.☆34Updated 2 months ago
- The collection of building blocks for building fine-tunable metric learning models☆32Updated last month
- ☆28Updated last year
- Prompt Engineering for Large Language Models - Notebooks, Demos, Exercises, and Projects☆22Updated last year
- minimal LLM scripts for 24GB VRAM GPUs. training, inference, whatever☆37Updated 2 weeks ago
- A collection of my data science articles published in Towards Data Science and Towards AI.☆16Updated last year
- Trace LLM calls (and others) and visualize them in WandB, as interactive SVG or using a streaming local webapp☆15Updated this week
- Fast model deployment on AWS Lambda☆14Updated 11 months ago
- This project shows how to derive the total number of training tokens from a large text dataset from 🤗 datasets with Apache Beam and Data…☆24Updated 2 years ago
- IEEE-CIS Fraud Detection Kaggle Competition Code☆9Updated 5 years ago
- [WIP] Behold, semantic-search, built over sentence-transformers to make it easy for search engineers to evaluate, optimise and deploy mod…☆15Updated last year
- Comparing M2M and mT5 on rare language pairs, blog post: https://medium.com/@abdessalemboukil/comparing-facebooks-m2m-to-mt5-in-low-re…☆15Updated 3 years ago
- ☆12Updated last month
- Article about deploying machine learning models using grpc, pytorch and asyncio☆27Updated 2 years ago
- Fine-tune Mistral 7B to generate fashion style suggestions☆34Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data☆29Updated 4 months ago
- Fine-Tuning LLM and embedding models☆27Updated last year
- serving a torch model using Celery, Redis and RabbitMQ to serve users asynchronously☆20Updated last year
- Fine tuning Mistral-7b with PEFT (Parameter Efficient Fine-Tuning) and LoRA (Low-Rank Adaptation) on Puffin Dataset (multi-turn conversation…☆12Updated last year
- Bi-Directional Attention Flow for Machine Comprehension☆9Updated 7 years ago
- ☆15Updated last year
- Reward Model framework for LLM RLHF☆60Updated last year
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets.☆67Updated 4 months ago
- ☆22Updated 11 months ago
- ☆30Updated 2 years ago
- Using short models to classify long texts☆21Updated last year
- Universal text classifier for generative models☆22Updated 6 months ago
- Code for "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (https://arxiv.org/abs/2…☆13Updated last year
- Repository containing the SPIN experiments on the DIBT 10k ranked prompts☆24Updated 11 months ago