[ACL 2023]: Training Trajectories of Language Models Across Scales https://arxiv.org/pdf/2212.09803.pdf
☆25Nov 14, 2023Updated 2 years ago
Alternatives and similar repositories for training_trajectory_analysis
Users that are interested in training_trajectory_analysis are comparing it to the libraries listed below. We may earn a commission when you buy through links labeled 'Ad' on this page.
Sorting:
- Evaluating Durability: Benchmark Insights into Multimodal Watermarking☆12Jun 7, 2024Updated last year
- Code and dataset for Polyglot Prompting: Multilingual Multitask Prompt Training.☆18Dec 7, 2022Updated 3 years ago
- ☆14Oct 28, 2023Updated 2 years ago
- Language Model Baselines for PyTorch☆41Aug 18, 2020Updated 5 years ago
- [ECCV 2022] "Improve Few-Shot Transfer Learning with Low-Rank Decompose and Align" by Ziyu Jiang, Tianlong Chen, Xuxi Chen, Yu Cheng, Luo…☆13Jul 19, 2022Updated 3 years ago
- AI Agents on DigitalOcean Gradient AI Platform • AdBuild production-ready AI agents using customizable tools or access multiple LLMs through a single endpoint. Create custom knowledge bases or connect external data.
- ☆10Jul 16, 2023Updated 2 years ago
- Functional Optimal Transport: Map Estimation and Domain Adaptation for Functional data☆27Jun 7, 2021Updated 4 years ago
- [EACL 2023] Transfer Knowledge from Natural Language to Electrocardiography: Can We Detect Cardiovascular Disease Through Language Models…☆17May 7, 2024Updated 2 years ago
- Code accompanying our paper at AISTATS 2020☆21Jan 12, 2021Updated 5 years ago
- ☆33Jul 8, 2024Updated last year
- Source code and data for The Magic of IF: Investigating Causal Reasoning Abilities in Large Language Models of Code (Findings of ACL 2023…☆31Jun 4, 2023Updated 2 years ago
- Scripts for preprocessing the CoNLL-2005 SRL dataset.☆24Mar 28, 2019Updated 7 years ago
- ☆26Mar 4, 2022Updated 4 years ago
- DiWA: Diverse Weight Averaging for Out-of-Distribution Generalization☆31Jan 31, 2023Updated 3 years ago
- Deploy to Railway using AI coding agents - Free Credits Offer • AdUse Claude Code, Codex, OpenCode, and more. Autonomous software development now has the infrastructure to match with Railway.
- [EMNLP 2023] An Empirical Exploration of Cross-domain Alignment between Language and Electroencephalogram☆30Nov 9, 2023Updated 2 years ago
- Fun project to run your own LLM chat bot using llama.cpp☆11Jun 9, 2023Updated 2 years ago
- Pre-trained models for our work on Temporal Graph Generation☆19Jun 2, 2021Updated 4 years ago
- [ACL 2022] Structured Pruning Learns Compact and Accurate Models https://arxiv.org/abs/2204.00408☆198May 9, 2023Updated 3 years ago
- BLOOM+1: Adapting BLOOM model to support a new unseen language☆74Mar 2, 2024Updated 2 years ago
- ☆13Oct 8, 2021Updated 4 years ago
- [DMLR 2024] Benchmarking Robustness of Multimodal Image-Text Models under Distribution Shift☆39Jan 25, 2024Updated 2 years ago
- Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification☆11Aug 12, 2023Updated 2 years ago
- REvolveR: Continuous Evolutionary Models for Robot-to-robot Policy Transfer (ICML 2022 Long Oral)☆27Sep 10, 2022Updated 3 years ago
- Simple, predictable pricing with DigitalOcean hosting • AdAlways know what you'll pay with monthly caps and flat pricing. Enterprise-grade infrastructure trusted by 600k+ customers.
- Preprint: Asymmetry in Low-Rank Adapters of Foundation Models☆39Feb 27, 2024Updated 2 years ago
- This is a comprehensive guide on how you can automate your feature engineering process.☆11Jun 25, 2018Updated 7 years ago
- ☆12Sep 1, 2023Updated 2 years ago
- Evaluating Visual Fidelity of Image Descriptions☆11Aug 15, 2019Updated 6 years ago
- ☆10Oct 28, 2019Updated 6 years ago
- ☆16May 12, 2026Updated 2 weeks ago
- 🧬 BioRelEx: Biological Relation Extraction Benchmark @ ACL BioNLP Workshop 2019☆18Jul 31, 2019Updated 6 years ago
- Code for the NIPS 2016 paper "Single-Image Depth Perception in the Wild"☆11Nov 17, 2017Updated 8 years ago
- ☆19Nov 11, 2023Updated 2 years ago
- Managed Kubernetes at scale on DigitalOcean • AdDigitalOcean Kubernetes includes the control plane, bandwidth allowance, container registry, automatic updates, and more for free.
- Repository for paper Decrypting Cryptic Crosswords☆11Jan 15, 2022Updated 4 years ago
- GOPHI: an AMR-to-English Verbalizer☆11Feb 5, 2020Updated 6 years ago
- Implements several Markov chain Monte Carlo (MCMC) algorithms for the latent Dirichlet allocation (LDA) model☆11Feb 11, 2020Updated 6 years ago
- Alignment between clustered datasets via hierarchical Wasserstein distance☆38Sep 26, 2023Updated 2 years ago
- Code for "Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes"☆31Mar 28, 2024Updated 2 years ago
- Implementation of Neural Style Transfer on Video☆10Nov 6, 2018Updated 7 years ago
- Dataset for Bilingual VLN☆11Dec 5, 2020Updated 5 years ago