LoserCheems / WonderfulMatrices
Wonderful Matrices to Build Small Language Models
☆44 Updated last month
Alternatives and similar repositories for WonderfulMatrices:
Users who are interested in WonderfulMatrices are comparing it to the libraries listed below.
- Problem-Oriented Segmentation and Retrieval EMNLP 2024 Findings ☆30 Updated 4 months ago
- ☆31 Updated 10 months ago
- Maya: An Instruction Finetuned Multilingual Multimodal Model using Aya ☆107 Updated last month
- Code associated with the EMNLP 2024 Main paper: "Image, tell me your story!" Predicting the original meta-context of visual misinformatio… ☆36 Updated last week
- [NeurIPS XAIA & Springer] Code and notebooks to paper "A Fresh Look at Sanity Checks for Saliency Maps" ☆25 Updated 8 months ago
- A tool to assist in the interpretation of learned features in sparse autoencoders (in particular the four SAE's trained by Joseph Bloom o… ☆19 Updated 5 months ago
- Code for this paper "HyperRouter: Towards Efficient Training and Inference of Sparse Mixture of Experts via HyperNetwork" ☆31 Updated last year
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆56 Updated last week
- XmodelLM ☆39 Updated 4 months ago
- This is the official code for the paper "Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation" ☆44 Updated last month
- Official implementation of MetaTree: Learning a Decision Tree Algorithm with Transformers ☆104 Updated 6 months ago
- The first dense retrieval model that can be prompted like an LM ☆68 Updated 6 months ago
- VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models ☆31 Updated 3 weeks ago
- Open-source Python toolkit focused on deep learning with ordinal methodologies ☆50 Updated this week
- This repository contains the code for the paper: SirLLM: Streaming Infinite Retentive LLM ☆57 Updated 10 months ago
- ☆11 Updated last year
- Repo hosting codes and materials related to speeding LLMs' inference using token merging. ☆35 Updated 11 months ago
- ☆46 Updated last month
- gzip Predicts Data-dependent Scaling Laws ☆34 Updated 10 months ago
- Backtracing: Retrieving the Cause of the Query, EACL 2024 Long Paper, Findings. ☆88 Updated 8 months ago
- Code and data releases for the paper -- DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory ☆38 Updated last month
- Towards Medical Small Language Models with Self-Evolved Slow Thinking ☆66 Updated 2 months ago
- "Pooling And Attention: What Are Effective Designs For LLM-Based Embedding Models?" ☆35 Updated 4 months ago
- a curated list of the role of small models in the LLM era ☆96 Updated 6 months ago
- Automated Qualitative Analysis of LLMs ☆35 Updated last week
- This is the repository for NAACL'25 paper "TART: An Open-Source Tool-Augmented Framework for Explainable Table-based Reasoning" ☆49 Updated 5 months ago
- [ECCV'24 Workshops Oral] DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling ☆29 Updated 4 months ago
- ☆48 Updated 4 months ago
- Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation ☆26 Updated last month
- ☆18 Updated 6 months ago