EastTower16/LLMDataDistill

Distill large-scale web page text

☆12

Alternatives and similar repositories for LLMDataDistill

Users that are interested in LLMDataDistill are comparing it to the libraries listed below.

koth / EmotiVoice.cpp
cpp inference for EmotiVoice
☆16 · Jan 1, 2024 · Updated 2 years ago
OpenSealion / sealion-client
SeaLion Client can help create project quickly
☆53 · May 7, 2025 · Updated last year
gmftbyGMFTBY / MomentumDecoding
Momentum Decoding: Open-ended Text Generation as Graph Exploration
☆19 · Jan 27, 2023 · Updated 3 years ago
zhehengluoK / Biomedical-Text-Summarization-Survey
This repository lists papers, codes, and datasets in Biomedical Text Summarisation based on PLM
☆23 · Oct 4, 2022 · Updated 3 years ago
INK-USC / ReCross
ReCross: Unsupervised Cross-Task Generalization via Retrieval Augmentation
☆23 · May 1, 2022 · Updated 4 years ago
YujieLu10 / Seeker
☆11 · May 24, 2024 · Updated 2 years ago
RUCAIBox / MPOP
☆13 · Jun 16, 2021 · Updated 5 years ago
sammi / bazel-to-msbuild
Generate visual studio solution from a bazel workspace.
☆13 · Jan 19, 2022 · Updated 4 years ago
wabyking / word2fun
☆11 · May 9, 2022 · Updated 4 years ago
Pallas1992 / NLP
notes and codes about NLP
☆25 · Jan 22, 2019 · Updated 7 years ago
222464 / HTFERL
A variant of HTM where spatial and temporal pooling are accomplished with the same mechanism
☆13 · Apr 11, 2015 · Updated 11 years ago
joeljang / ELM
[ICML 2023] Exploring the Benefits of Training Expert Language Models over Instruction Tuning
☆99 · Apr 26, 2023 · Updated 3 years ago
leuchine / self_play_picard
Using self-play to augment multi-turn text-to-SQL datasets
☆12 · Oct 20, 2022 · Updated 3 years ago
swaggy-TN / EfficientVLM
EfficientVLM: Fast and Accurate Vision-Language Models via Knowledge Distillation and Modal-adaptive Pruning (ACL 2023)
☆33 · Jul 18, 2023 · Updated 3 years ago
imyhxy / ccocotools
This is a C++ implementation of cocoapi bbox evaluation code.
☆11 · Dec 9, 2021 · Updated 4 years ago
zhzihao / WikiGenBench
WIKIGENBENCH: Exploring Full-length Wikipedia Generation under Real-World Scenario (COLING 2025)
☆13 · Jan 5, 2025 · Updated last year
acl-org / emnlp-2023
Repository containing the website for the EMNLP 2023 conference
☆17 · Feb 12, 2025 · Updated last year
Unified-Language-Model-Alignment / src
☆14 · Oct 7, 2023 · Updated 2 years ago
SkafteNicki / cuda_expm
Matrix exponential in cuda for pytorch and tensorflow
☆17 · Nov 26, 2018 · Updated 7 years ago
alito / mamele
Machine learning environment over MAME-supported games
☆15 · Apr 2, 2026 · Updated 3 months ago
jcjohnson / pytorch-multinomial-benchmark
☆12 · Oct 23, 2018 · Updated 7 years ago
Bernard-Yang / HERB
☆14 · Nov 14, 2022 · Updated 3 years ago
Yifan-Gao / open_retrieval_conversational_machine_reading
Open-Retrieval Conversational Machine Reading: A new setting & OR-ShARC dataset
☆13 · Nov 19, 2022 · Updated 3 years ago
veugene / spectre_release
Analyzing deviation from orthogonality in RNNs
☆16 · Oct 30, 2017 · Updated 8 years ago
bigganbing / Fairseq_MorphTE
[NeurIPS 2022] MorphTE: Injecting Morphology in Tensorized Embeddings
☆17 · Oct 29, 2022 · Updated 3 years ago
zhjohnchan / awesome-vision-and-language-pretraining
A curated list of vision-and-language pre-training (VLP). :-)
☆62 · Jul 6, 2022 · Updated 4 years ago
noiseQA / NoiseQA
☆12 · Feb 22, 2021 · Updated 5 years ago
KernelErr / paddle-sys
Rust wrapper for Paddle Inference.
☆11 · May 22, 2021 · Updated 5 years ago
koth / kokoro.cpp
kokoro tts in cpp
☆16 · Nov 30, 2025 · Updated 7 months ago
Timothyxxx / NeuralSymbolicPapers
☆14 · Aug 18, 2022 · Updated 3 years ago
MichaelZhouwang / VLUE
This repo contains codes and instructions for baselines in the VLUE benchmark.
☆41 · Jul 16, 2022 · Updated 4 years ago
simongog / RoSA
Reduced on-disk Suffix Array
☆22 · Oct 9, 2013 · Updated 12 years ago
google-research-datasets / maxm
MaXM is a suite of test-only benchmarks for multilingual visual question answering in 7 languages: English (en), French (fr), Hindi (hi),…
☆13 · Jan 16, 2024 · Updated 2 years ago
ocastel / exact-extract
☆12 · Sep 2, 2021 · Updated 4 years ago
Oneplus / ELMo
☆10 · May 20, 2019 · Updated 7 years ago
bosondata / badwolf
Docker based continuous integration, continuous deployment and code lint review system for BitBucket
☆89 · Jul 11, 2019 · Updated 7 years ago
LieluoboAi / radish
C++ model train&inference framework
☆222 · Dec 25, 2019 · Updated 6 years ago
fajieyuan / CIKM2016-LambdaFM
LambdaFM: Learning Optimal Ranking with Factorization Machines Using Lambda Surrogates
☆18 · Aug 17, 2019 · Updated 6 years ago
Sherrylone / Zero-CL
ICLR 2022 paper
☆16 · May 6, 2022 · Updated 4 years ago