nihaljn / datahawk
Viewer for text datasets in formats like HuggingFace, JSONL, etc.
☆15 · Updated 11 months ago
Alternatives and similar repositories for datahawk
Users who are interested in datahawk are comparing it to the libraries listed below.
- Chrome Extension for exploring Hugging Face datasets ☆48 · Updated last year
- Median is an open-source flashcard application that leverages the power of spaced repetition and artificial intelligence to transform the… ☆22 · Updated last year
- Pivotal Token Search ☆144 · Updated last month
- YouTube Transcript Cleaner is a simple web-based application that improves the readability of YouTube transcripts. ☆27 · Updated 11 months ago
- An introduction to DSPy ☆33 · Updated 5 months ago
- Run embedding models using ONNX ☆35 · Updated 2 years ago
- Unleash the full potential of exascale LLMs on consumer-class GPUs, proven by extensive benchmarks, with no long-term adjustments and min… ☆26 · Updated last year
- LLM plugin for clustering embeddings ☆82 · Updated last year
- Tools for formatting large language model prompts. ☆13 · Updated 2 years ago
- utilities for loading and running text embeddings with onnx ☆45 · Updated 5 months ago
- Aana SDK is a powerful framework for building AI enabled multimodal applications. ☆55 · Updated 5 months ago
- OpenAI GPT hosted Agent Framework for Windows and MacOS ☆36 · Updated last year
- never forget anything again! combine AI and intelligent tooling for a local knowledge base to track catalogue, annotate, and plan for you… ☆37 · Updated last year
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆61 · Updated 9 months ago
- High level library for batched embeddings generation, blazingly-fast web-based RAG and quantized indexes processing ☆69 · Updated 2 months ago
- Python examples using the bigcode/tiny_starcoder_py 159M model to generate code ☆45 · Updated 2 years ago
- Dataset Viber is your chill repo for data collection, annotation and vibe checks. ☆47 · Updated last year
- ☆37 · Updated 6 months ago
- Tools for LLM agents. ☆61 · Updated last year
- A Python library to orchestrate LLMs in a neural network-inspired structure ☆52 · Updated last year
- ☆24 · Updated last year
- Efficiently computing & storing token n-grams from large corpora ☆26 · Updated last year
- ☆53 · Updated 11 months ago
- Synthetic Text Dataset Generation for LLM projects ☆55 · Updated 2 months ago
- Plugin for LLM adding a Markov chain generating model ☆20 · Updated last year
- Small python package to measure OCR quality and other related metrics. ☆26 · Updated last year
- Some tough questions to test new models. ☆28 · Updated last year
- Complex RAG backend ☆29 · Updated last year
- a single interface around speech-to-speech foundation models ☆26 · Updated 7 months ago
- ☆31 · Updated last year