jehumtine / synthetic_data_generator

This script converts bodies of text into question-and-answer pairs in JSON format using the GPT-4 language model. It extracts text from PDF files, tokenizes the text, generates questions and answers, and saves the results to a JSON file.
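The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: every function name here is hypothetical, the text is assumed to be already extracted from the PDF, "tokenizing" is approximated by splitting on whitespace into fixed-size chunks, and the GPT-4 call is replaced by a placeholder so the example runs offline.

```python
import json
from typing import Callable, Dict, List


def chunk_text(text: str, max_words: int = 200) -> List[str]:
    """Split text into chunks of at most max_words whitespace-separated words.

    Assumption: a simple word count stands in for real model tokenization.
    """
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def placeholder_generate_qa(chunk: str) -> Dict[str, str]:
    """Hypothetical stand-in for the GPT-4 request.

    A real implementation would send the chunk to the model and parse its reply.
    """
    return {"question": "What does this passage say?", "answer": chunk}


def build_qa_dataset(
    text: str,
    generate_qa: Callable[[str], Dict[str, str]] = placeholder_generate_qa,
    max_words: int = 200,
) -> List[Dict[str, str]]:
    """Produce one question/answer record per chunk of the input text."""
    return [generate_qa(chunk) for chunk in chunk_text(text, max_words)]


def save_qa_dataset(pairs: List[Dict[str, str]], path: str) -> None:
    """Write the question/answer records to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(pairs, f, ensure_ascii=False, indent=2)
```

Passing the generator as a parameter keeps the chunking and saving logic testable without network access; a function that actually calls the model could be supplied in place of `placeholder_generate_qa`.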

Related projects

Alternatives and complementary repositories for synthetic_data_generator