nihaljn / datahawk
Viewer for text datasets in formats like HuggingFace, JSONL, etc.
☆15Updated 9 months ago
Alternatives and similar repositories for datahawk
Users who are interested in datahawk compare it to the libraries listed below.
- Run embedding models using ONNX☆35Updated last year
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min…☆26Updated last year
- Median is an open-source flashcard application that leverages the power of spaced repetition and artificial intelligence to transform the…☆22Updated last year
- Aana SDK is a powerful framework for building AI enabled multimodal applications.☆53Updated 3 months ago
- utilities for loading and running text embeddings with onnx☆44Updated 3 months ago
- Python library to use Pleias-RAG models☆67Updated 7 months ago
- OpenAI GPT hosted Agent Framework for Windows and MacOS☆36Updated last year
- Pivotal Token Search☆132Updated last week
- Never forget anything again! Combine AI and intelligent tooling for a local knowledge base to track, catalogue, annotate, and plan for you…☆37Updated last year
- YouTube Transcript Cleaner is a simple web-based application that improves the readability of YouTube transcripts.☆26Updated 9 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ⚡☆68Updated 3 weeks ago
- PyLate efficient inference engine☆68Updated 2 months ago
- ☆53Updated 10 months ago
- Tools for formatting large language model prompts.☆13Updated last year
- Claudette's sister, a helper for OpenAI GPT☆41Updated 3 weeks ago
- Synthetic data derived by templating, few-shot prompting, transformations on public domain corpora, and Monte Carlo tree search.☆32Updated 2 months ago
- Editor with LLM generation tree exploration☆80Updated 9 months ago
- Embedding models from Jina AI☆65Updated last year
- Using modal.com to process FineWeb-edu data☆20Updated 8 months ago
- ☆21Updated last year
- Efficiently computing & storing token n-grams from large corpora☆26Updated last year
- Tools to make language models a bit easier to use☆60Updated 2 weeks ago
- ☆35Updated 4 months ago
- ☆31Updated last year
- LLM plugin for clustering embeddings☆82Updated last year
- Tools for LLM agents.☆61Updated 11 months ago
- Some tough questions to test new models.☆28Updated last year
- Dataset Viber is your chill repo for data collection, annotation and vibe checks.☆45Updated last year
- Very minimal (and stateless) agent framework☆44Updated 10 months ago
- Chrome Extension for exploring Hugging Face datasets 🔎☆49Updated last year