Ongoing research training transformer language models at scale, including: BERT & GPT-2
☆18Feb 17, 2023Updated 3 years ago
Alternatives and similar repositories for Megatron-LM
Users that are interested in Megatron-LM are comparing it to the libraries listed below.
- Codebase for multilingual neural machine translation☆13Nov 24, 2022Updated 3 years ago
- ☆10Jul 15, 2024Updated last year
- Topic supervised non-negative matrix factorization with sparse matrices☆12Mar 24, 2020Updated 6 years ago
- A retrieval augmented sequence modeling toolkit implemented based on Fairseq☆29Mar 3, 2023Updated 3 years ago
- Multiple correspondence analysis☆10Apr 2, 2015Updated 11 years ago
- Code to reproduce the paper "Do causal predictors generalize better to new domains?"☆16Feb 7, 2025Updated last year
- SAFE Drive: access SAFE Network using the file system of Windows, Mac OS and Linux☆14Dec 9, 2022Updated 3 years ago
- Code for the anonymous submission "Cockpit: A Practical Debugging Tool for Training Deep Neural Networks"☆31Nov 24, 2020Updated 5 years ago
- ☆10Jan 19, 2026Updated 4 months ago
- Safe serialization of ML models☆18Apr 21, 2023Updated 3 years ago
- Maximum mean discrepancy comparisons for single cell profiling experiments☆20Feb 9, 2022Updated 4 years ago
- Using GPT-3 to detect hate speech that contains sexist and racist content☆24Nov 11, 2025Updated 7 months ago
- A set of pre-trained machine-learning models that predict (im-)politeness scores in texts.☆19Jan 2, 2025Updated last year
- A Pytorch-based library to evaluate learning methods on small image classification datasets☆18Jun 22, 2022Updated 3 years ago
- OffensEval2020 Shared Task☆17Apr 5, 2021Updated 5 years ago
- Detecting bursty terms in computer science☆10Feb 2, 2022Updated 4 years ago
- Android Videokit - basic FFMPEG build for Android with x264 and libtheora support.☆22Jun 23, 2012Updated 13 years ago
- End-to-end integration of HuggingFace's models for sequence labeling.☆11Oct 4, 2020Updated 5 years ago
- ☆20Aug 30, 2022Updated 3 years ago
- Product Quantization k-Nearest Neighbors☆22Jun 24, 2021Updated 4 years ago
- I have created a dataset of image-text pairs by using the cosine similarity of the CLIP embeddings of the image & its caption derived f…☆16Apr 22, 2021Updated 5 years ago
- tiktoken is a BPE tokeniser for use with OpenAI's models☆26Jul 16, 2023Updated 2 years ago
- ☆14May 15, 2025Updated last year
- Usage of Siamese Recurrent Neural network architectures for semantic textual similarity☆22Mar 5, 2019Updated 7 years ago
- Interpretation of Isolation Forests☆22Jun 17, 2024Updated last year
- Text generation with entities as context☆30Jun 13, 2018Updated 8 years ago
- Lawma: A lightly fine-tuned Llama model for legal classification tasks.☆31Sep 14, 2024Updated last year
- 🧬 an evolving design philosophy (masquerading as a color scheme)☆11Dec 8, 2025Updated 6 months ago
- A multi-frame-inpainting script for stable diffusion webui☆11Apr 7, 2023Updated 3 years ago
- Reading notes on the book 《智能投顾》 (Robo-Advisor)☆12May 23, 2019Updated 7 years ago
- ☆86Dec 26, 2022Updated 3 years ago
- ☆54Jan 29, 2018Updated 8 years ago
- auto image cropping/composition methods☆16Oct 23, 2018Updated 7 years ago
- Highly concurrent and fast content processing for Mighty Inference Server☆10Feb 6, 2023Updated 3 years ago
- Code for paper "Interactive Machine Comprehension with Information Seeking Agents" -- public version☆23Sep 3, 2019Updated 6 years ago
- ☆16Aug 22, 2017Updated 8 years ago
- OIR Segmentation☆13Mar 14, 2018Updated 8 years ago
- MXNet implementation of AC-BLSTM☆23Apr 3, 2019Updated 7 years ago
- Automatically exported from code.google.com/p/caltech-lane-detection☆13Jul 25, 2015Updated 10 years ago