darrow-labs / LegalLens
☆8 Updated last year
Alternatives and similar repositories for LegalLens
Users who are interested in LegalLens are comparing it to the repositories listed below.
- SMASHED is a toolkit designed to apply transformations to samples in datasets, such as fields extraction, tokenization, prompting, batchi… ☆33 Updated last year
- Ranking of fine-tuned HF models as base models. ☆35 Updated 2 months ago
- SWIM-IR is a Synthetic Wikipedia-based Multilingual Information Retrieval training set with 28 million query-passage pairs spanning 33 la… ☆48 Updated last year
- ☆12 Updated 7 months ago
- ☆19 Updated this week
- Embedding Recycling for Language models ☆38 Updated 2 years ago
- ☆14 Updated 9 months ago
- ☆11 Updated 3 years ago
- Tokenization across languages. Useful as preprocessing for subword tokenization. ☆22 Updated 2 years ago
- Starbucks: Improved Training for 2D Matryoshka Embeddings ☆21 Updated 2 weeks ago
- Plug-and-play Search Interfaces with Pyserini and Hugging Face ☆32 Updated last year
- ☆14 Updated last month
- Code for our paper Resources and Evaluations for Multi-Distribution Dense Information Retrieval ☆15 Updated last year
- This repo contains code for the paper "Psychologically-informed chain-of-thought prompts for metaphor understanding in large language mod… ☆14 Updated 2 years ago
- Code for running the experiments in Deep Subjecthood: Higher Order Grammatical Features in Multilingual BERT ☆17 Updated last year
- NLG Best Practices for Data-Efficient Modeling How to Train Production-Ready Models with Little Data ☆10 Updated 3 years ago
- ☆18 Updated last month
- ☆29 Updated 3 years ago
- FAMIE: A Fast Active Learning Framework for Multilingual Information Extraction ☆24 Updated 3 years ago
- Exploring Few-Shot Adaptation of Language Models with Tables ☆24 Updated 2 years ago
- Code and pre-trained models for "ReasonBert: Pre-trained to Reason with Distant Supervision", EMNLP'2021 ☆29 Updated 2 years ago
- Codes and files for the paper Are Emergent Abilities in Large Language Models just In-Context Learning ☆33 Updated 6 months ago
- ☆29 Updated last year
- ☆40 Updated 2 months ago
- Code for SaGe subword tokenizer (EACL 2023) ☆25 Updated 7 months ago
- official repo of AAAI2024 paper Mitigating the Impact of False Negatives in Dense Retrieval with Contrastive Confidence Regularization ☆13 Updated last year
- Documentation effort for the BookCorpus dataset ☆34 Updated 4 years ago
- QAmeleon introduces synthetic multilingual QA data using PaLM, a 540B large language model. This dataset was generated by prompt tuning P… ☆34 Updated last year
- SCREWS: A Modular Framework for Reasoning with Revisions ☆27 Updated last year
- Code and data for Teddy https://arxiv.org/abs/2001.05171. ☆15 Updated 3 years ago