Job Description
About The Team
Proto’s team is highly global, with 80% BIPOC (Black, Indigenous, and Persons of Colour) and 50% female management working across 15+ countries. We’re a remote team of self-starting and entrepreneurial SaaS engineers, operations and growth professionals. We follow established processes for cross-cultural and cross-timezone collaboration, with periodic opportunities for in-person work.
Job Summary
We are seeking a highly skilled and motivated ASR (Automatic Speech Recognition) and TTS (Text-to-Speech) Engineer to join our innovative team. The successful candidate will be responsible for developing, optimizing, and deploying state-of-the-art ASR and TTS systems, contributing to our cutting-edge speech technology solutions. This role requires a deep understanding of speech processing, machine learning, and software development.
Responsibilities
- Design, develop, and optimize ASR and TTS models and algorithms.
- Implement and improve speech recognition and synthesis systems using deep learning frameworks.
- Collaborate with cross-functional teams to integrate ASR and TTS solutions into various applications and platforms.
- Conduct research and stay updated on the latest advancements in speech-processing technologies.
- Evaluate and benchmark the performance of ASR and TTS systems, ensuring high accuracy and efficiency.
- Troubleshoot and resolve issues related to speech processing systems and improve their robustness.
- Maintain and enhance existing speech processing models and pipelines.
- Contribute to the development of documentation, guidelines, and best practices for ASR and TTS systems.
Requirements
- Bachelor’s, Master’s, or Ph.D. degree in Computer Science, Electrical Engineering, or a related field with a focus on speech processing or machine learning.
- Proven experience in developing and deploying ASR and TTS systems.
- Strong knowledge of machine learning frameworks such as TensorFlow, PyTorch, or similar.
- Proficiency in the Python programming language.
- Experience with signal processing, feature extraction, and speech recognition techniques.
- Familiarity with deep learning architectures such as RNNs, CNNs, and Transformers.
- Ability to work effectively in a collaborative and fast-paced environment.
- Excellent problem-solving skills and attention to detail.
- Strong written and verbal communication skills.
Nice-to-Haves
- Experience with large-scale speech data collection and annotation.
- Knowledge of cloud-based services and tools for ASR and TTS deployment.
- Publications or contributions to the speech processing research community.
- Experience with end-to-end speech processing pipelines and real-time processing systems.
Benefits
- 20 vacation days. Take them in addition to local holidays.
- Full remote. Work anywhere in the world with stable internet.
- Cowork. Access any coworking office on Earth (and drink free coffee).
- High & equal salaries. Earn above-average pay, equal to that of your global colleagues.
- Laptop incentive. Let us reimburse you for work devices and tech upgrades (or provide them directly if you are in the R&D department).
- Visa support. Request support with immigration to countries in our corporate group.
- Stock options. Qualify for employee stock options in leadership positions.
- Refugee friendly. Proto prioritises candidates who are displaced or relocating due to conflict.
Proto is proud to be an equal-opportunity workplace and affirmative-action employer. We are committed to equal employment opportunities regardless of race, colour, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.