ZhaofengWu / semantic-hub
☆19Updated 3 months ago
Alternatives and similar repositories for semantic-hub
Users who are interested in semantic-hub are comparing it to the libraries listed below
- Introducing Filtered Direct Preference Optimization (fDPO) that enhances language model alignment with human preferences by discarding lo…☆16Updated last year
- Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control☆75Updated 3 years ago
- This repository contains the code used for the experiments in the paper "Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity…☆28Updated last month
- Official implementation of Bootstrapping Language Models via DPO Implicit Rewards☆45Updated 7 months ago
- ☆108Updated 2 years ago
- Code for ICML 25 paper "Metadata Conditioning Accelerates Language Model Pre-training (MeCo)"☆48Updated 5 months ago
- ☆106Updated last year
- Skill-It! A Data-Driven Skills Framework for Understanding and Training Language Models☆47Updated 2 years ago
- [NeurIPS 2023] Repetition In Repetition Out: Towards Understanding Neural Text Degeneration from the Data Perspective☆38Updated 2 years ago
- Official repository of paper "RNNs Are Not Transformers (Yet): The Key Bottleneck on In-context Retrieval"☆27Updated last year
- Official repository for ACL 2025 paper "Model Extrapolation Expedites Alignment"☆76Updated 6 months ago
- The source code of "Merging Experts into One: Improving Computational Efficiency of Mixture of Experts (EMNLP 2023)"☆40Updated last year
- Repo accompanying our paper "Do Llamas Work in English? On the Latent Language of Multilingual Transformers".☆80Updated last year
- A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models☆68Updated 9 months ago
- Code for paper "Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning"☆83Updated last year
- [ACL 2024] Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models☆26Updated last year
- Evaluate your agent memory on real-world dialogues, not LLM-simulated dialogues.☆35Updated 5 months ago
- Long Context Extension and Generalization in LLMs☆62Updated last year
- GSM-Plus: Data, Code, and Evaluation for Enhancing Robust Mathematical Reasoning in Math Word Problems.☆63Updated last year
- PyTorch implementation of StableMask (ICML'24)☆14Updated last year
- Code for ICLR 2025 Paper "What is Wrong with Perplexity for Long-context Language Modeling?"☆105Updated 2 months ago
- Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our co…☆24Updated 11 months ago
- DiffusER: Discrete Diffusion via Edit-based Reconstruction (Reid, Hellendoorn & Neubig, 2022)☆54Updated 4 months ago
- [NeurIPS 2024 Main Track] Code for the paper titled "Instruction Tuning With Loss Over Instructions"☆38Updated last year
- Code for paper "Patch-Level Training for Large Language Models"☆95Updated last month
- Code and Model for NeurIPS 2024 Spotlight Paper "Stacking Your Transformers: A Closer Look at Model Growth for Efficient LLM Pre-Training…☆44Updated last year
- Function Vectors in Large Language Models (ICLR 2024)☆187Updated 7 months ago
- Tasks for describing differences between text distributions.☆17Updated last year
- ☆20Updated 3 years ago
- Implementation of 🥥 Coconut, Chain of Continuous Thought, in Pytorch☆181Updated 5 months ago