trokhymovych / WikiCheck
Implementation of the WikiCheck API, an open-source Wikipedia-based fact-checking API. The project was done in cooperation with the Wikimedia Foundation and Ukrainian Catholic University.
☆22 Updated 2 months ago
Related projects:
- Search through Facebook Research's PyTorch BigGraph Wikidata-dataset with the Weaviate vector search engine ☆31 Updated 2 years ago
- A TextTiling-based algorithm for text segmentation (aka topic segmentation) that uses neural sentence encoders, as well as extractive sum… ☆41 Updated last year
- A pipeline using LLMs for Knowledge Engineering, combining knowledge probing and Wikidata entity mapping. ☆34 Updated 10 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024) ☆67 Updated 2 months ago
- Training & Implementation of chatbots leveraging GPT-like architecture with the aitextgen package to enable dynamic conversations. ☆46 Updated 2 years ago
- Source code for the paper "Bounding the Capabilities of Large Language Models in Open Text Generation with Prompt Constraints" ☆27 Updated last year
- A summarization dataset consisting of over 17k open access business journal articles. ☆9 Updated 3 years ago
- The Synthetic-Persona-Chat dataset is a synthetically generated persona-based dialogue dataset. It extends the original Persona-Chat data… ☆62 Updated 8 months ago
- Source code and data for Like a Good Nearest Neighbor ☆28 Updated 7 months ago
- Official repo for NAACL 2024 Findings paper "LeTI: Learning to Generate from Textual Interactions." ☆60 Updated last year
- Code for Relevance-guided Supervision for OpenQA with ColBERT (TACL'21) ☆40 Updated 3 years ago
- Documentation effort for the BookCorpus dataset ☆30 Updated 3 years ago
- [COLM '24] Source-Aware Training Enables Knowledge Attribution in Language Models ☆13 Updated last month
- Finding semantically meaningful and accurate prompts. ☆45 Updated 10 months ago
- StAtutory Reasoning Assessment ☆11 Updated last year
- Code for Stage-wise Fine-tuning for Graph-to-Text Generation ☆26 Updated last year
- A Wikipedia-based summarization dataset ☆13 Updated last year
- Using open source LLMs to build synthetic datasets for direct preference optimization ☆33 Updated 6 months ago
- DSPy program/pipeline inspector widget for Jupyter/VSCode Notebooks. ☆26 Updated 7 months ago
- An easy to use framework for large-scale fact-checking and question answering ☆68 Updated last year
- Legal document similarity - Code, data, and models for the ICAIL 2021 paper "Evaluating Document Representations for Content-based Legal … ☆31 Updated 3 years ago
- A new way to generate large quantities of high quality synthetic data (on par with GPT-4), with better controllability, at a fraction of … ☆19 Updated last month
- Zero-shot evaluation on LEXGLUE tasks with GPT-3.5 ☆27 Updated last year
- Using Machine Learning to Create Funny Memes ☆24 Updated last year
- Seahorse is a dataset for multilingual, multi-faceted summarization evaluation. It consists of 96K summaries with human ratings along 6 q… ☆84 Updated 6 months ago
- Using short models to classify long texts ☆20 Updated last year
- RaKUn 2.0 - A fast keyword detection algorithm ☆61 Updated last month
- Vespa application making an index of the CORD-19 dataset. ☆39 Updated 2 weeks ago