second-state / meetups
☆69 · Updated last year
Related projects:
- ☆123 · Updated 3 months ago
- ☆70 · Updated 9 months ago
- AI Accelerator Benchmark focuses on evaluating AI Accelerators from a practical production perspective, including the ease of use and ver… ☆188 · Updated 3 weeks ago
- ☆19 · Updated 9 months ago
- Export LLaMA to ONNX ☆91 · Updated 3 months ago
- ☆140 · Updated 4 months ago
- Compare different hardware platforms via the Roofline Model for LLM inference tasks. ☆71 · Updated 6 months ago
- Run ChatGLM2-6B on BM1684X ☆48 · Updated 6 months ago
- ☆56 · Updated last week
- Transformer related optimization, including BERT, GPT ☆17 · Updated last year
- LLM Inference benchmark ☆331 · Updated last month
- Transformer related optimization, including BERT, GPT ☆39 · Updated last year
- ☆133 · Updated 2 months ago
- ☆251 · Updated last week
- Inferflow is an efficient and highly configurable inference engine for large language models (LLMs). ☆232 · Updated 6 months ago
- Transformer related optimization, including BERT, GPT ☆58 · Updated last year
- Simplify large (>2GB) ONNX models ☆41 · Updated 6 months ago
- FlagGems is an operator library for large language models implemented in Triton Language. ☆246 · Updated last week
- Run generative AI models on Sophgo BM1684X ☆103 · Updated this week
- TePDist (TEnsor Program DISTributed) is an HLO-level automatic distributed system for DL models. ☆87 · Updated last year
- This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit… ☆227 · Updated this week
- FlagScale is a large model toolkit based on open-sourced projects. ☆129 · Updated last week
- ☆53 · Updated 4 years ago
- Models and examples built with OneFlow ☆94 · Updated last month
- Performance of the C++ interface of flash attention and flash attention v2 in large language model (LLM) inference scenarios. ☆20 · Updated 2 weeks ago
- ☆148 · Updated this week
- ☆97 · Updated 5 months ago
- OneFlow->ONNX ☆41 · Updated last year
- ☆110 · Updated 4 months ago
- ☆105 · Updated last week