sail-sg / sailcraft
Data Toolkit for Sailor Language Models
☆82 · Updated 4 months ago
Related projects
Alternatives and complementary repositories for sailcraft
- Official code for "MAmmoTH2: Scaling Instructions from the Web" [NeurIPS 2024] ☆124 · Updated 3 weeks ago
- Official implementation for 'Extending LLMs' Context Window with 100 Samples' ☆74 · Updated 10 months ago
- LongEmbed: Extending Embedding Models for Long Context Retrieval (EMNLP 2024) ☆115 · Updated last week
- ☆112 · Updated last month
- This is the official repository for Inheritune. ☆105 · Updated last month
- [EMNLP 2024] A Retrieval Benchmark for Scientific Literature Search ☆61 · Updated 4 months ago
- Codebase accompanying the Summary of a Haystack paper. ☆72 · Updated 2 months ago
- EvolKit is an innovative framework designed to automatically enhance the complexity of instructions used for fine-tuning Large Language M… ☆180 · Updated 3 weeks ago
- Benchmarking LLMs with Challenging Tasks from Real Users ☆195 · Updated 2 weeks ago
- Evaluating LLMs with fewer examples ☆134 · Updated 7 months ago
- AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark ☆106 · Updated last month
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents [EMNLP 2024] ☆103 · Updated last month
- ☆125 · Updated 7 months ago
- Evaluating LLMs with CommonGen-Lite ☆85 · Updated 8 months ago
- Reformatted Alignment ☆112 · Updated last month
- Retrieval Augmented Generation Generalized Evaluation Dataset ☆51 · Updated this week
- We aim to provide the best references to search, select, and synthesize high-quality and large-quantity data for post-training your LLMs. ☆47 · Updated last month
- Expert Specialized Fine-Tuning ☆145 · Updated last month
- Lightweight demos for finetuning LLMs. Powered by 🤗 transformers and open-source datasets. ☆64 · Updated last month
- Small and Efficient Mathematical Reasoning LLMs ☆71 · Updated 9 months ago
- Code of ICLR paper: https://openreview.net/forum?id=-cqvvvb-NkI ☆91 · Updated last year
- Official implementation of DPFM @ ICLR 2024 paper "Autonomous Data Selection with Language Models for Mathematical Texts" (As Huggingface… ☆78 · Updated 2 weeks ago
- This repository contains the joint use of CPO and SimPO method for better reference-free preference learning methods. ☆35 · Updated 3 months ago
- Unofficial implementation of AlpaGasus ☆84 · Updated last year
- [Data + code] ExpertQA : Expert-Curated Questions and Attributed Answers ☆122 · Updated 8 months ago
- Code for Zero-Shot Tokenizer Transfer ☆115 · Updated 3 weeks ago
- minimal pytorch implementation of bm25 (with sparse tensors) ☆90 · Updated 8 months ago
- The official evaluation suite and dynamic data release for MixEval. ☆224 · Updated last week
- Positional Skip-wise Training for Efficient Context Window Extension of LLMs to Extremely Length (ICLR 2024) ☆199 · Updated 6 months ago
- [EMNLP 2024 (Oral)] Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA ☆92 · Updated last week