zorse-project / COBOLEval
Evaluate LLM-generated COBOL
☆39Updated last year
Alternatives and similar repositories for COBOLEval
Users who are interested in COBOLEval are comparing it to the libraries listed below
- ReLM is a Regular Expression engine for Language Models☆106Updated 2 years ago
- Official Repo for CRMArena and CRMArena-Pro☆118Updated 3 months ago
- IBM development fork of https://github.com/huggingface/text-generation-inference☆61Updated 2 weeks ago
- Language Model for Mainframe Modernization☆58Updated last year
- XTR/WARP (SIGIR'25) is an extremely fast and accurate retrieval engine based on Stanford's ColBERTv2/PLAID and Google DeepMind's XTR.☆166Updated 5 months ago
- Public repository containing METR's DVC pipeline for eval data analysis☆115Updated 6 months ago
- Code and data for "StructLM: Towards Building Generalist Models for Structured Knowledge Grounding" (COLM 2024)☆75Updated 11 months ago
- Mixing Language Models with Self-Verification and Meta-Verification☆110Updated 9 months ago
- Experimental Code for StructuredRAG: JSON Response Formatting with Large Language Models☆111Updated 5 months ago
- Advanced Reasoning Benchmark Dataset for LLMs☆47Updated last year
- ReDel is a toolkit for researchers and developers to build, iterate on, and analyze recursive multi-agent systems. (EMNLP 2024 Demo)☆86Updated 3 weeks ago
- [FORGE 2025] Graph-based method for end-to-end code completion with context awareness on repository☆66Updated last year
- ☆61Updated 3 months ago
- ☆19Updated last month
- Pre-training code for CrystalCoder 7B LLM☆55Updated last year
- The Granite Guardian models are designed to detect risks in prompts and responses.☆118Updated 2 weeks ago
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna☆55Updated 8 months ago
- Pre-train Static Word Embeddings☆86Updated 3 weeks ago
- 🦄 Unitxt is a Python library for enterprise-grade evaluation of AI performance, offering the world's largest catalog of tools and data …☆210Updated this week
- Seamless interface for using PyTorch distributed with Jupyter notebooks☆50Updated 3 weeks ago
- ☆40Updated 3 months ago
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖☆75Updated 10 months ago
- Google DeepMind's PromptBreeder for automated prompt engineering, implemented in LangChain Expression Language.☆147Updated last year
- Query language for blending SQL and LLMs across structured + unstructured data, with type constraints.☆114Updated this week
- ☆50Updated last year
- A framework for high-fidelity retrieval augmented generation in industrial knowledge bases. Integrates jargon identification, context rec…☆35Updated last year
- ☆48Updated last year
- Automatic Prompt Optimization☆45Updated last year
- ☆43Updated last year
- Based on the tree of thoughts paper☆48Updated 2 years ago