Fine-tune a Malaysian LLM for Malaysian-context embedding tasks.
☆23Apr 27, 2024Updated last year
Alternatives and similar repositories for llm-embedding
Users that are interested in llm-embedding are comparing it to the libraries listed below.
- Official repo for FSE'24 paper "CodeArt: Better Code Models by Attention Regularization When Symbols Are Lacking"☆18Mar 10, 2025Updated last year
- Code for "Incorporating Relevance Feedback for Information-Seeking Retrieval using Few-Shot Document Re-Ranking" (https://arxiv.org/abs/2…☆14Feb 2, 2026Updated last month
- ☆32Jul 29, 2024Updated last year
- ☆13Mar 27, 2020Updated 6 years ago
- ☆65Dec 17, 2025Updated 3 months ago
- Official implementation of Vector-ICL: In-context Learning with Continuous Vector Representations (ICLR 2025)☆21Jun 2, 2025Updated 9 months ago
- Code for EMNLP 2023 paper: DALE: Generative Data Augmentation for Low-Resource Legal NLP☆10Oct 27, 2023Updated 2 years ago
- AgentsCourt: Building Judicial Decision-Making Agents with Court Debate Simulation and Legal Knowledge Augmentation (EMNLP 2024 Findings)☆16Dec 30, 2024Updated last year
- Stronger Baselines for Grammatical Error Correction Using a Pretrained Encoder-Decoder Model.☆37Apr 6, 2023Updated 2 years ago
- GPU-accelerated algorithm for subsampling datasets while preserving diversity☆27Jan 12, 2024Updated 2 years ago
- Tracking part of siamese-fc.☆10Feb 25, 2017Updated 9 years ago
- Electronic funhouse mirror for Halloween that puts animals and monsters on people's faces☆11Oct 31, 2019Updated 6 years ago
- This project is based on OpenCV and implements segmentation generation (using a depth map) and image denoising using Mark…☆11Oct 29, 2018Updated 7 years ago
- Python and R scripts for visualising and analysing baby sleep patterns.☆12May 17, 2017Updated 8 years ago
- Arabic News Stance Corpus☆11Feb 5, 2021Updated 5 years ago
- Codebase for VideoConviction, accepted at KDD 2025 (D&B Track)☆18Jan 22, 2026Updated 2 months ago
- Detection of sensitive information, spam, and pornography-, gambling-, and drug-related content☆11Jul 17, 2017Updated 8 years ago
- Simulation of a stop cascade occurring on an exchange☆13Nov 2, 2021Updated 4 years ago
- Multi-Modal Language Modeling with Image, Audio and Text Integration, included multi-images and multi-audio in a single multiturn.☆18Feb 20, 2024Updated 2 years ago
- Metadata and per-statute PDFs for the U.S. Statutes at Large through volume 64 (1789-1951).☆17Apr 24, 2020Updated 5 years ago
- Resources for the Semeval 2016 Task 3 Community Question Answering. Contains word embeddings and system description results☆10Jan 13, 2017Updated 9 years ago
- This repository contains code used for our Multi Sentence Inference NAACL'22 paper.☆12Mar 6, 2023Updated 3 years ago
- explores Chinese language models with sub-character level visual information☆16Oct 5, 2018Updated 7 years ago
- Transformer model for Portuguese language (Brazil pt_BR)☆16Sep 15, 2025Updated 6 months ago
- SiamFC tracking in MXNet.☆17Jun 5, 2019Updated 6 years ago
- Routines for implementing various statistical and machine learning techniques.☆19Nov 28, 2022Updated 3 years ago
- Nextein starter☆17Jan 18, 2023Updated 3 years ago
- Reduce the size of pretrained Hugging Face models via vocabulary trimming.☆48Dec 28, 2022Updated 3 years ago
- Neural networks in Theano (ABANDONED/DISCONTINUED) - see dagbldr for a continuation of this code with some new tricks☆18Feb 26, 2015Updated 11 years ago
- ☆35May 18, 2023Updated 2 years ago
- A Python reimplementation + extension of "Planning with Large Language Models for Code Generation" (https://arxiv.org/abs/2303.05510)☆18Dec 1, 2023Updated 2 years ago
- Adaptation of Monte Carlo and SARSA algorithms (Reinforcement Learning) for learning the policy of sellers/ buyers in stock market☆12Jul 23, 2018Updated 7 years ago
- This repository provides the code for applying Contrastive Learning Penalty Loss (CLPL) and Mixture of Experts (MoE) to the BGE-M3 text e…☆11Dec 27, 2024Updated last year
- The official implementation of the paper "Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset"(ICASSP 2…☆12Feb 19, 2023Updated 3 years ago
- The contrastive token loss function for reducing generative repetition of autoregressive neural language models.☆13May 11, 2022Updated 3 years ago
- Code Roberta version of RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder☆10Mar 16, 2023Updated 3 years ago
- Collections of IR Research☆37May 18, 2025Updated 10 months ago
- [ACL2024] Exploring the Potential of Large Language Models in Computational Argumentation☆17Aug 21, 2024Updated last year
- T5Patches is a set of tools for fast and targeted editing of generative language models built with T5X.☆12May 31, 2024Updated last year