GLM-ASR STT
A multimodal speech-to-text framework built on the General Language Model (GLM) architecture for high-accuracy, context-aware transcription.
GLM-ASR STT builds on the architecture of the GLM-4 family and processes audio signals through a unified tokenization strategy. Developed by the THUDM team at Tsinghua University, the system goes beyond traditional acoustic modeling by treating speech as a primary linguistic modality rather than as a signal to be decoded separately from language. It performs best in complex environments, such as noisy offices or multi-speaker dialogues, where context is critical for disambiguation. Its transformer-based backbone, with billions of parameters, is designed to deliver low-latency transcription while maintaining semantic coherence across long-form recordings.