Technology

CantoneseLLM

A specialized open-source model optimized for colloquial Cantonese syntax and traditional script accuracy.

CantoneseLLM uses the 100-million-token Honore dataset to learn the grammar of Yue (Cantonese). Built on Llama 3, the model handles sentence-final particles such as 'ge3' and 'maa1' with high precision. By focusing on regional code-switching and traditional characters, it outperforms general-purpose LLMs on Hong Kong-specific tasks: legal document parsing, medical advice, and natural chat. A 4-bit quantized version is available for efficient deployment on consumer-grade hardware.
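To give a rough sense of why a 4-bit quantized version suits consumer-grade hardware, the sketch below estimates the memory needed for model weights alone at different precisions. The parameter count is an assumption: Llama 3 was released in several sizes and the listing does not say which one CantoneseLLM uses, so 8 billion is used purely for illustration. The helper name `weight_memory_gib` is hypothetical, and the estimate ignores the KV cache, activations, and per-block quantization metadata, so real usage will be somewhat higher.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8  # 8 bits per byte
    return total_bytes / (1024 ** 3)


# Assumption: 8 billion parameters (not stated in the listing).
ASSUMED_PARAMS = 8e9

fp16_gib = weight_memory_gib(ASSUMED_PARAMS, 16)  # half-precision baseline
int4_gib = weight_memory_gib(ASSUMED_PARAMS, 4)   # 4-bit quantized weights

print(f"16-bit weights: {fp16_gib:.1f} GiB")
print(f"4-bit weights:  {int4_gib:.1f} GiB")
```

Under these assumptions the 4-bit weights take a quarter of the 16-bit footprint, which is the main reason quantized checkpoints can fit on a single consumer GPU or in laptop memory.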

https://huggingface.co/CantoneseLLM
1 project · 1 city
