PRE-TRAINED MACHINE LEARNING MODELS FOR REAL-TIME SPEECH FORM CONVERSION

Techniques are described for generating parallel data for real-time speech form conversion. In an embodiment, a speech machine learning (ML) model generates parallel speech data based at least in part on input speech data of an original form. The parallel speech data includes the input speech data of the original form and temporally aligned output speech data of a target form that differs from the original form. Each frame of the input speech data corresponds in time to a frame of the output speech data in the target form and contains the same portion of the particular content. The techniques further include training a teacher machine learning model, which runs offline and is substantially larger than the student machine learning model used for converting speech form, and transferring "knowledge" from the trained teacher model to train the production student model, which performs the speech form conversion on an end-user computing device.
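The teacher-to-student transfer on frame-aligned data can be illustrated with a toy sketch. This is not the patented method. It assumes the teacher's outputs serve directly as training targets for the student, and the names `teacher`, `make_parallel_data` and `train_student` are hypothetical. Both models here are trivial stand-ins for the large offline teacher and the small on-device student.

```python
import random

def teacher(frame):
    """Hypothetical stand-in for the large offline teacher model.

    Maps one input frame (original form) to one output frame (target form).
    The fixed transform below is an assumption made for illustration only.
    """
    return [0.5 * x + 0.1 for x in frame]

def make_parallel_data(input_frames):
    """Pair each input frame with the teacher's output frame at the same index,
    so the two sequences stay temporally aligned frame by frame."""
    return [(frame, teacher(frame)) for frame in input_frames]

def train_student(pairs, epochs=500, lr=0.2):
    """Fit a tiny student (y = w * x + b) to the teacher's outputs with
    batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = sum(len(src) for src, _ in pairs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for src, tgt in pairs:
            for x, y in zip(src, tgt):
                err = (w * x + b) - y
                grad_w += 2 * err * x / n
                grad_b += 2 * err / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

random.seed(0)
# 50 synthetic "frames" of 4 samples each stand in for real speech features.
frames = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(50)]
pairs = make_parallel_data(frames)
w, b = train_student(pairs)
```

After training, `w` and `b` approach the teacher's coefficients, which is the sense in which the student has absorbed the teacher's behavior on this toy task. A real system would use neural networks on acoustic features, with a student small enough to run in real time on the end-user device.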

The challenge in speech conversion is not only to convert speech audio from one form to another (e.g., accented speech to native speech) but also to do so in real time. The quest for efficient, high-quality accent conversion has long been an area of interest within speech and audio processing, and accent conversion has significant potential for various applications, notably real-time audio communication. The challenge is to design an audio processing algorithm that transforms a given input audio signal in L2 (the original speech form, e.g., accented English speech) into L1 (the desired speech form, e.g., native English speech), preserving the textual content and synchronicity while changing the accent characteristics. Although accent conversion for the English language is used as the example speech conversion task, the techniques described herein are equally applicable to other speech conversion tasks such as voice conversion, emotion conversion, bandwidth expansion, etc.

Application Number: US2024/038305
Patent Number: 2025/024193
Applicant: TECHNOLOGIES, INC
Current Status: Granted
Country:
Industry: Automotive and Transportation
Patent Type: Single Patent
Available For: Sale
