Google Assistant previously worked only with voice commands, but thanks to a new feature called "Look and Talk," you can now activate it simply by looking at your device.
Today, Google Assistant is available in more than 95 countries and over 29 languages, and until now it could only be activated with two voice commands: "OK Google" and "Hey Google." Once triggered, it listens for and carries out the user's requests.
The company has been exploring new ways for people and machines to interact, and shared its progress with the presentation of "Look to Speak" at the end of 2020.
The Mountain View company said that application is intended to help people with mobility and speech impairments communicate with their devices using their eyes, letting them select pre-written phrases to be spoken aloud rather than only look at the screen.
Then, at the Google I/O 2022 developer conference, the company took this a step further with "Look and Talk." The technology analyzes audio, video, and text to determine whether a user is addressing the Google Nest Hub Max directly.
Google has now published an update on its Artificial Intelligence (AI) blog, explaining in more detail how this recognition system works.
First, Google explained that "Look and Talk" relies on an algorithm based on eight machine learning models. This lets the system distinguish an intentional interaction from a passing glance at distances of up to 5 feet, in order to determine whether the user is actually trying to engage the device.
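To make that idea concrete, the gating logic could look something like the following Python sketch. The model signals, thresholds, and function names here are illustrative assumptions, not details Google has published.

```python
from dataclasses import dataclass

# Hypothetical per-frame signals; in the real system these would come
# from several on-device machine learning models.
@dataclass
class FrameSignals:
    face_detected: bool
    distance_ft: float       # estimated distance to the user's face
    engagement_score: float  # 0..1, how likely the gaze is intentional

MAX_DISTANCE_FT = 5.0        # the article's ~5-foot interaction range
ENGAGEMENT_THRESHOLD = 0.8   # assumed cutoff, purely illustrative

def is_intentional_look(signals: FrameSignals) -> bool:
    """Return True only when a nearby user appears to be deliberately
    looking at the device, not just glancing past it."""
    if not signals.face_detected:
        return False
    if signals.distance_ft > MAX_DISTANCE_FT:
        return False
    return signals.engagement_score >= ENGAGEMENT_THRESHOLD

# Example: a user 3 feet away with a sustained gaze passes the gate.
print(is_intentional_look(FrameSignals(True, 3.0, 0.9)))  # True
```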
Google developed this algorithm by testing it against different variables and characteristics, some demographic in nature, such as age and skin tone, as well as varying acoustic conditions and camera perspectives.
In real time, the technology also has to cope with an unusual camera angle, since these smart displays are typically placed at relatively low heights in fixed spots within the home.
The process behind "Look and Talk" consists of three phases. In the first, the Assistant uses face detection technology to identify the presence of a person and locate them.
Thanks to Face Match technology, the system then determines whether that person is registered to communicate with the device, a method also used by other assistants such as Alexa.
During this first recognition phase, the Assistant also relies on other visual cues, such as the angle of the user's gaze, to determine whether the user is visually engaging with the device.
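A rough sketch of how those first-phase visual checks might be chained together is shown below; the function, parameters, and gaze tolerance are invented for illustration and do not reflect Google's actual implementation.

```python
# Hypothetical phase-one check: a face is present, it matches an enrolled
# user (Face Match), and the gaze angle is close enough to the camera to
# count as "looking at the device".
GAZE_ANGLE_LIMIT_DEG = 15.0  # assumed tolerance, not a published value

def phase_one_engaged(face_box, enrolled_ids, matched_id, gaze_angle_deg):
    if face_box is None:                 # no face detected at all
        return False
    if matched_id not in enrolled_ids:   # Face Match: unknown user
        return False
    # A gaze angle of 0 degrees means looking straight at the display.
    return abs(gaze_angle_deg) <= GAZE_ANGLE_LIMIT_DEG

print(phase_one_engaged((10, 20, 80, 80), {"user_a"}, "user_a", 8.0))  # True
print(phase_one_engaged((10, 20, 80, 80), {"user_a"}, "guest", 8.0))   # False
```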
Then the second phase begins, in which the Assistant listens to the user's query and weighs additional signals to determine whether the speech is directed at it.
To do this, it relies on technologies such as Voice Match, which validates and complements the results previously produced by Face Match. "Look and Talk" then runs an automatic speech recognition model to transcribe the speaker's words and commands.
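As a hedged sketch of that second phase, the flow could be organized as follows. The helper names, the confidence threshold, and the stand-in components are assumptions made for illustration.

```python
# Hypothetical phase-two flow: Voice Match confirms the speaker is the same
# enrolled user identified by Face Match, then an ASR model transcribes the
# utterance for the intent-analysis step that follows.
def phase_two(audio, face_match_id, voice_match, asr_model):
    speaker_id, confidence = voice_match(audio)  # assumed (id, score) result
    if speaker_id != face_match_id or confidence < 0.7:
        return None  # speech likely not from the engaged, enrolled user
    return asr_model(audio)

# Minimal usage with stand-in components:
fake_voice_match = lambda audio: ("user_a", 0.92)
fake_asr = lambda audio: "turn on the kitchen lights"
print(phase_two(b"pcm-bytes", "user_a", fake_voice_match, fake_asr))
```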
Next, the Assistant analyzes this transcription together with non-lexical properties of the speech, such as pitch, speaking rate, or sounds that may signal hesitation, and it also estimates the likelihood that the interaction is aimed at the Assistant based on contextual visual cues.
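Purely as an illustration, fusing those lexical, prosodic, and visual cues into a single "intended for the Assistant" score might look like this; the weights and feature scores are assumptions, not values Google has disclosed.

```python
# Hypothetical late fusion of the cues the article mentions: the transcript,
# non-lexical speech properties (pitch, speaking rate, hesitation), and
# contextual visual signals. Weights are made up for illustration.
WEIGHTS = {"lexical": 0.5, "prosody": 0.3, "visual": 0.2}

def intent_likelihood(lexical_score, prosody_score, visual_score):
    """Each input is a 0..1 score from a separate model; the output is a
    weighted combination used to judge whether the utterance targets the Assistant."""
    return (WEIGHTS["lexical"] * lexical_score
            + WEIGHTS["prosody"] * prosody_score
            + WEIGHTS["visual"] * visual_score)

# A clear, fluent request made while looking at the display scores high.
print(round(intent_likelihood(0.9, 0.8, 0.95), 2))  # 0.88
```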

Finally, when this intent-understanding model determines that the user's utterance is meant for the Assistant, "Look and Talk" moves on to the phase of processing the query and attempting to answer it.
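Putting the three phases together, a toy end-to-end flow could be arranged as below. Every function here is a self-contained stand-in with an assumed decision threshold; the real system runs several specialized on-device models at each step.

```python
# Toy end-to-end flow for the three phases described in the article.
INTENT_THRESHOLD = 0.75  # assumed cutoff for answering, not a published value

def detect_engaged_user(frame):                   # phase 1: face detection + Face Match + gaze
    return {"user_id": "user_a", "engaged": True}

def transcribe_if_same_speaker(audio, user_id):   # phase 2: Voice Match + ASR
    return "what's the weather today"

def score_intent(transcript, audio, frame):       # intent-understanding model
    return 0.9

def look_and_talk(frame, audio):
    person = detect_engaged_user(frame)
    if not person["engaged"]:
        return None
    transcript = transcribe_if_same_speaker(audio, person["user_id"])
    if transcript is None:
        return None
    if score_intent(transcript, audio, frame) < INTENT_THRESHOLD:
        return None                                # speech probably not meant for the Assistant
    return f"Answering: {transcript}"              # phase 3: process the query and respond

print(look_and_talk(frame=None, audio=None))
```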
The company adds that each model supporting this system has been individually evaluated and improved, and then tested under diverse environmental conditions, allowing customized parameters to be introduced for its use.