This script converts bodies of text into question-and-answer pairs in JSON format using the GPT-4 language model. It extracts text from PDF files, tokenizes it, generates questions and answers, and saves the results to a JSON file.
☆24Aug 22, 2023Updated 2 years ago
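The description above names four stages: PDF text extraction, tokenization, question-and-answer generation, and JSON output. As a rough illustration only, and not the repository's actual code, the flow could be sketched as below. Every function name here is hypothetical. The sketch assumes the `pypdf` package for extraction, uses plain whitespace splitting as a stand-in for a real tokenizer, and replaces the GPT-4 call with a placeholder callable so that it runs without an API key.

```python
import json
from typing import Callable, Dict, List

# Hypothetical type: any callable that turns a text chunk into one Q&A record.
QAGenerator = Callable[[str], Dict[str, str]]


def extract_pdf_text(path: str) -> str:
    """Return the text of every page, joined by newlines (assumes pypdf is installed)."""
    from pypdf import PdfReader  # imported lazily so the other stages run without it

    reader = PdfReader(path)
    # extract_text() may yield nothing for image-only pages, hence the fallback.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def chunk_text(text: str, max_tokens: int = 200) -> List[str]:
    """Split text into chunks of at most max_tokens whitespace-separated tokens.

    Whitespace splitting is an assumption; a model-specific tokenizer would
    give more accurate token counts.
    """
    tokens = text.split()
    return [
        " ".join(tokens[i:i + max_tokens])
        for i in range(0, len(tokens), max_tokens)
    ]


def build_qa_records(
    text: str, generate_qa: QAGenerator, max_tokens: int = 200
) -> List[Dict[str, str]]:
    """Run the generator over each chunk and collect the resulting records."""
    return [generate_qa(chunk) for chunk in chunk_text(text, max_tokens)]


def save_json(records: List[Dict[str, str]], path: str) -> None:
    """Write the records to a UTF-8 JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(records, fh, ensure_ascii=False, indent=2)


def placeholder_generator(chunk: str) -> Dict[str, str]:
    """Stand-in for the GPT-4 call: it only echoes the chunk back as the answer."""
    return {"question": "What does this passage say?", "answer": chunk}
```

In real use, `placeholder_generator` would be swapped for a function that sends each chunk to the language model and parses its reply. Passing the generator in as a parameter keeps the chunking and JSON steps testable offline.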
Alternatives and similar repositories for synthetic_data_generator
Users that are interested in synthetic_data_generator are comparing it to the libraries listed below
- LAWLIA is an open-source computational legal framework designed to revolutionize legal reasoning and analysis. It combines the power of l…☆20Dec 6, 2023Updated 2 years ago
- 🦄 Use GPT to generate and label data☆25Apr 30, 2024Updated last year
- GPT4MAX is a free AI chatbot app built with Next.js, the Vercel AI SDK, and OpenAI GPT-4 Turbo.☆18May 10, 2024Updated last year
- A simple GPT-3 interface to automate core legal writing tasks☆13Mar 8, 2023Updated 3 years ago
- Your own GPT-powered Personal Assistant to whom you can ORDER or INSTRUCT to do some task or search for something using your VOICE comman…☆20Jul 23, 2023Updated 2 years ago
- The code used to evaluate embedding models on the Massive Legal Embedding Benchmark (MLEB).☆33Feb 24, 2026Updated 3 weeks ago
- An automated E2E natural language test runner built on Claude Code☆22Aug 19, 2025Updated 7 months ago
- An intelligent question-answering system built on langchain and chatglm6b, with support for custom corpora☆10Jun 25, 2023Updated 2 years ago
- Developing a legal research tool leveraging ChatGPT / GPT-4☆15Mar 10, 2024Updated 2 years ago
- Vector search with Pinecone and OpenAI to search through a contract law textbook. If downloaded, remember to install all dependencies. Refer…☆13Mar 30, 2023Updated 2 years ago
- A fake livestreaming twitch.tv chat, where small streamers can communicate with unique ChatGPT bots that act as fans.☆16Jan 19, 2025Updated last year
- Disease Pattern Miner is a free, open-source mining framework for interactively discovering sequential disease patterns in medical health…☆12Mar 21, 2019Updated 7 years ago
- MERN APP - Midjourney & DALL-E Clone☆15Jan 9, 2026Updated 2 months ago
- Probe how GPT-n performs on statutory reasoning☆10Sep 17, 2024Updated last year
- ☆16Jun 18, 2024Updated last year
- A question answering AI tool for the content from the PDF files of the Civil Code, Criminal Code, Code of Criminal Procedure, Labor Stand…☆11May 14, 2023Updated 2 years ago
- Build a Full stack Q&A Chatbot with Langchain, and LLM Models on Amazon Sagemaker☆12Nov 10, 2023Updated 2 years ago
- AI Pull-Request Reviewer Companion (in the command line)☆13Apr 11, 2024Updated last year
- ☆18Oct 16, 2024Updated last year
- Web one-click mode full process platform, including train data upload, fine-tuning, model merge, model deploy, gpu monitor etc., no need …☆19Nov 28, 2023Updated 2 years ago
- Classical Chinese poetry generation based on pytorch_rnn☆11Oct 24, 2021Updated 4 years ago
- A zero-configuration (no registry.json required), shadcn add / open in v0 compatible registry builder. With amazing visual feedback like …☆26Updated this week
- ☆27Sep 10, 2025Updated 6 months ago
- Nano Bots for Obsidian: small, AI-powered bots that can be easily shared as a single file, designed to support multiple providers such as…☆15Jan 13, 2024Updated 2 years ago
- An RAG (retrieval augmented generation) app which iterates through a PDF document and can answer user's questions based on the document u…☆16Mar 23, 2025Updated 11 months ago
- ☆17Jul 16, 2024Updated last year
- ☆10Aug 28, 2018Updated 7 years ago
- ☆15Aug 3, 2024Updated last year
- Experiment code for the COLING '22 paper "Augmenting Legal Judgment Prediction with Contrastive Case Relations"☆11Apr 25, 2024Updated last year
- Graph QABot Demo | Knowledge graph question-answering example☆15Apr 11, 2023Updated 2 years ago
- Changes in this fork have been merged to upstream.☆16Jun 10, 2025Updated 9 months ago
- WebRTC-HTTP Ingestion Protocol (WHIP) in Rust☆14Dec 17, 2025Updated 3 months ago
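- This project mainly studies how large models perform on a standalone legal dataset. It currently supports training, prediction, validation, and online deployment for belle- and chatglm-related models, and also adds crawler code, langchain, and database-backed prediction.☆12Jul 16, 2023Updated 2 years ago
- Crawls popular tourist attraction information from Qunar (去哪网), extracts triples, and builds a Chinese knowledge graph☆12Apr 27, 2021Updated 4 years ago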
- KL3M training data collection and preprocessing☆20Apr 14, 2025Updated 11 months ago
- An OpenAI GPT3-based QnA agent for documents and links☆12Jul 11, 2023Updated 2 years ago
- ☆49Jun 13, 2024Updated last year
- A simple NextJS app that streams Langserve (python) streamings on NextJS frontend, using a hook to make it clean on components, and api c…☆10Mar 12, 2024Updated 2 years ago
- Waste Segregation @HackBash2021 : ML based deployed waste segregation web app☆12Apr 8, 2021Updated 4 years ago