Research
Speech-to-Text (STT)
We develop advanced Speech-to-Text systems that convert spoken language into accurate, structured text in real time. Our research focuses on improving recognition accuracy across diverse accents, noisy environments, and low-resource languages. A key goal of our work is building efficient STT models optimized for edge devices such as Android smartphones, enabling fast, private, and offline voice interaction without relying on cloud infrastructure.
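One small building block of an on-device STT pipeline is voice activity detection, which decides which audio frames are worth sending to the recognizer. A minimal, illustrative sketch using a fixed energy threshold (the function names and threshold value here are assumptions for illustration, not part of any particular system):

```python
import math

def frame_energy(samples):
    """Root-mean-square energy of one audio frame."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def detect_speech(frames, threshold=0.1):
    """Mark each frame as speech (True) or silence (False)
    by comparing its RMS energy to a fixed threshold."""
    return [frame_energy(f) > threshold for f in frames]

# Toy input: one loud frame between two near-silent ones.
frames = [
    [0.01, -0.02, 0.01, 0.0],   # silence
    [0.5, -0.6, 0.55, -0.4],    # speech-like burst
    [0.0, 0.01, -0.01, 0.02],   # silence
]
print(detect_speech(frames))  # → [False, True, False]
```

Real on-device systems use more robust detectors, but gating the recognizer this way is one reason edge STT can stay fast and power-efficient.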
Text-to-Speech (TTS)
Our Text-to-Speech research focuses on generating natural, expressive, and human-like speech from written text. We design models capable of producing clear pronunciation, realistic prosody, and multilingual voice synthesis. We give special attention to lightweight architectures that run efficiently on edge devices like Android smartphones, enabling real-time speech generation while maintaining low power and memory consumption.
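Clear pronunciation starts before any model runs: a TTS front end first normalizes the input text, for example expanding digits into their spoken forms. A toy sketch of that step (the digit map and function name are illustrative, not drawn from any specific TTS system):

```python
# Minimal text-normalization sketch: expand digits into spoken
# words before downstream phoneme conversion.
DIGITS = {
    "0": "zero", "1": "one", "2": "two", "3": "three", "4": "four",
    "5": "five", "6": "six", "7": "seven", "8": "eight", "9": "nine",
}

def normalize(text):
    """Replace each digit with its spoken form, leaving other text intact."""
    out = []
    for ch in text:
        out.append(" " + DIGITS[ch] + " " if ch in DIGITS else ch)
    # Collapse the extra spaces introduced around digits.
    return " ".join("".join(out).split())

print(normalize("Room 42 is open"))  # → "Room four two is open"
```

Production front ends handle far more (currencies, dates, abbreviations, language-specific rules), but the principle is the same.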
Sketch Recognition
Our Sketch Recognition research explores intelligent systems that can understand hand-drawn diagrams, sketches, and visual notes. Using computer vision and machine learning techniques, we train models to interpret shapes, symbols, and spatial relationships. These models are designed to operate directly on edge devices such as Android smartphones, enabling instant recognition and interaction without requiring cloud connectivity.
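Interpreting shapes from hand-drawn strokes often begins with simple geometric features. As an illustrative sketch (the heuristic and names are assumptions, not our deployed models), one classic cue is closure: a stroke whose endpoints nearly meet relative to its path length is likely a closed shape such as a circle or box, while a stroke with distant endpoints is likely a line:

```python
import math

def stroke_length(points):
    """Total path length of a polyline stroke."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

def classify_stroke(points, closure_ratio=0.2):
    """Crude shape cue: if the gap between the first and last points
    is small relative to the path length, call the stroke 'closed'
    (circle/box-like); otherwise 'open' (line-like)."""
    gap = math.dist(points[0], points[-1])
    return "closed" if gap < closure_ratio * stroke_length(points) else "open"

# A rough square (ends where it starts) vs. a straight line.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
line = [(0, 0), (1, 1), (2, 2)]
print(classify_stroke(square), classify_stroke(line))  # → closed open
```

Learned models replace such hand-crafted cues, but cheap geometric features like these remain useful on resource-constrained edge devices.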
Small Language Models (SLM)
We research Small Language Models designed specifically for edge computing. These models provide strong language understanding while remaining compact, efficient, and fast. By optimizing model architecture and inference techniques, we enable deployment on edge devices like Android smartphones, delivering intelligent language capabilities with low latency, reduced energy consumption, and improved privacy.
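One widely used technique for shrinking language models for edge deployment is weight quantization. A minimal sketch of symmetric per-tensor int8 quantization, in pure Python for clarity (real deployments use optimized runtime kernels, and the function names here are illustrative):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: map floats in
    [-max_abs, max_abs] onto integers in [-127, 127]."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [x * scale for x in q]

weights = [0.5, -1.27, 0.03, 1.0]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Rounding keeps the error within half a quantization step.
assert all(abs(a - b) <= scale / 2 for a, b in zip(weights, approx))
```

Storing weights as int8 instead of float32 cuts model size roughly 4x, which directly reduces memory footprint and energy use on devices like Android smartphones.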