wandb / aihackercup
A competition to get you started on the NeurIPS AI Hackercup
☆28 Updated 8 months ago
Alternatives and similar repositories for aihackercup
Users who are interested in aihackercup are comparing it to the libraries listed below.
- Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without custom rubric, reference answer, absolute… ☆49 Updated 10 months ago
- Fine-tune an LLM to perform batch inference and online serving. ☆111 Updated last week
- Repository containing awesome resources regarding Hugging Face tooling. ☆47 Updated last year
- Source code of "How to Correctly do Semantic Backpropagation on Language-based Agentic Systems" 🤖 ☆68 Updated 5 months ago
- Build Agentic workflows with function calling using open LLMs ☆26 Updated this week
- ☆23 Updated last year
- ☆49 Updated 6 months ago
- Source code for the collaborative reasoner research project at Meta FAIR. ☆87 Updated last month
- Matrix (Multi-Agent daTa geneRation Infra and eXperimentation framework) is a versatile engine for multi-agent conversational data genera… ☆60 Updated this week
- Optimizing Causal LMs through GRPO with weighted reward functions and automated hyperparameter tuning using Optuna ☆53 Updated 4 months ago
- ☆29 Updated 6 months ago
- ☆77 Updated last year
- a pipeline for using api calls to agnostically convert unstructured data into structured training data ☆30 Updated 8 months ago
- ☆19 Updated 7 months ago
- Code for our paper PAPILLON: PrivAcy Preservation from Internet-based and Local Language MOdel ENsembles ☆33 Updated 3 weeks ago
- QAlign is a new test-time alignment approach that improves language model performance by using Markov chain Monte Carlo methods. ☆24 Updated last month
- rl from zero pretrain, can it be done? we'll see. ☆24 Updated this week
- An introduction to LLM Sampling ☆78 Updated 5 months ago
- ☆64 Updated 7 months ago
- Hub for researchers exploring VLMs and Multimodal Learning:) ☆35 Updated this week
- Verbosity control for AI agents ☆63 Updated last year
- ☆19 Updated last week
- Synthetic data generation and benchmark implementation for "Episodic Memories Generation and Evaluation Benchmark for Large Language Mode… ☆45 Updated last month
- Streamlit app for recommending eval functions using prompt diffs ☆27 Updated last year
- Simple GRPO scripts and configurations. ☆58 Updated 4 months ago
- ☆50 Updated this week
- Collection of resources for RL and Reasoning ☆25 Updated 4 months ago
- [ACL 2024] Do Large Language Models Latently Perform Multi-Hop Reasoning? ☆67 Updated 2 months ago
- Arrakis is a library to conduct, track and visualize mechanistic interpretability experiments. ☆29 Updated last month
- Fine tune Gemma 3 on an object detection task ☆43 Updated this week