GLM-ASR STT

A multimodal speech-to-text framework built on the General Language Model (GLM) architecture for high-accuracy, context-aware transcription.

GLM-ASR STT builds on the GLM-4 model family and processes audio through a unified tokenization strategy. Developed by the THUDM team at Tsinghua University, it treats speech as a primary linguistic modality rather than relying on traditional acoustic modeling alone. It is aimed at difficult conditions, such as noisy offices or multi-speaker dialogues, where context is critical for disambiguation. Its transformer-based backbone has billions of parameters, and the system is designed to deliver low-latency transcription that stays semantically coherent across long-form recordings.

https://github.com/THUDM/GLM-4

1 project · 1 city

Related technologies

Android (11) · GLM-ASR (1) · iOS (6) · Ministral-3 (1) · Pocket-TTS (1) · Rust (49) · Server (4) · TTS (3)

Recent Talks & Demos

UnaMentis: On-Device Voice Models

Portland · Mar 5

Pocket-TTS · GLM-ASR