Thanh V. T. Tran
I am Thanh Tran, an AI Research Resident at FPT Software – AI Center, working under the supervision of Dr. Van Nguyen and Professor Truong-Son Hy. I will start my Ph.D. at Nanyang Technological University (NTU) in Fall 2026, advised by Professor Woon-Seng Gan.
I’m always open to collaborations, discussions, and new opportunities. Feel free to reach out if you’re interested in my research or would like to discuss potential projects.
Research
My research focuses on three areas of artificial intelligence: multimodal AI, generative models, and AI for scientific discovery.
1. Multimodal AI and Audio-Visual Learning. I develop deep learning models for audio-visual understanding and generation, including video-to-audio synthesis, automated video dubbing, and speech reconstruction from silent videos.
2. Generative Models for Speech and Audio. I work on flow models for text-to-speech and audio generation, aiming to build efficient, low-latency systems for real-world deployment.
3. AI for Scientific Discovery. I optimize protein sequences with black-box optimization methods inspired by evolutionary algorithms, working in both discrete and latent spaces.
News
| Date | Update |
|---|---|
| May 01, 2026 | DiFlowDubber got accepted at CVPR Findings 2026. DiFlowDubber and Flowley also got accepted at the Sight and Sound Workshop, CVPR 2026. |
| Jan 10, 2026 | Honored to receive the Best Performance Award 2025, ranking in the top 3 out of 100+ AI engineers and researchers at FPT Software – AI Center. |
| May 20, 2025 | RESOUND got accepted at Interspeech 2025. |
| Dec 21, 2024 | ConxGNN got accepted at ICASSP 2025. |
| Nov 17, 2024 | GROOT got accepted at KDD 2025. |