Speech Recognition

By Speech Recognition Blog

Speech recognition converts our voice into plain text. When we speak a word, the speech recognition API returns a list of related candidate words, and we can display those words in a TextView.
This tutorial gives a brief overview of the Android speech recognition API, which is commonly used for automation and machine learning features.

We can do this using the SpeechRecognizer class, which reports everything it hears through a RecognitionListener.

RecognitionListener contains the following callback methods, each described below.

  1. onReadyForSpeech
  2. onBeginningOfSpeech
  3. onRmsChanged
  4. onBufferReceived
  5. onEndOfSpeech
  6. onError
  7. onResults
  8. onPartialResults
  9. onEvent

1) onReadyForSpeech

The service calls this method when the endpointer is ready for the user to start speaking.

2) onBeginningOfSpeech

The service calls this method when the user has started to speak.

3) onRmsChanged

The service should call this method when the sound level in the audio stream has changed. There is no guarantee that this method will be called.
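A common use of this callback is to give the user simple visual feedback about the microphone level. The sketch below is only an illustration, assuming it lives inside the RecognitionListener shown later and uses the ivMic ImageView from this tutorial's layout:

            @Override
            public void onRmsChanged(float rmsdB) {
                // rmsdB is the current sound level in dB (roughly -2 to 10 in practice).
                // Clamp it to 0..1 and gently scale the mic icon so it pulses while the user speaks.
                float level = Math.max(0f, Math.min(rmsdB, 10f)) / 10f;
                ivMic.setScaleX(1f + 0.2f * level);
                ivMic.setScaleY(1f + 0.2f * level);
            }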

4) onBufferReceived

The service calls this method when sound has been received. Its purpose is to allow giving feedback to the user regarding the captured audio.

5) onEndOfSpeech

The service calls this method after the user stops speaking.

6) onError

The service calls this method when a network or recognition error has occurred. The error codes that onError can receive are listed below; a helper that turns them into readable messages is sketched after the list.

  • ERROR_AUDIO
  • ERROR_CLIENT
  • ERROR_INSUFFICIENT_PERMISSIONS
  • ERROR_NETWORK
  • ERROR_NETWORK_TIMEOUT
  • ERROR_NO_MATCH
  • ERROR_RECOGNIZER_BUSY
  • ERROR_SERVER
  • ERROR_SPEECH_TIMEOUT
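
To make these codes easier to debug, a small helper can translate them into readable messages. The method name getErrorText below is our own choice for this sketch, not part of the API:

    // Hypothetical helper that maps SpeechRecognizer error codes to readable messages.
    private String getErrorText(int error) {
        switch (error) {
            case SpeechRecognizer.ERROR_AUDIO:
                return "Audio recording error";
            case SpeechRecognizer.ERROR_CLIENT:
                return "Client side error";
            case SpeechRecognizer.ERROR_INSUFFICIENT_PERMISSIONS:
                return "Insufficient permissions";
            case SpeechRecognizer.ERROR_NETWORK:
                return "Network error";
            case SpeechRecognizer.ERROR_NETWORK_TIMEOUT:
                return "Network operation timed out";
            case SpeechRecognizer.ERROR_NO_MATCH:
                return "No recognition result matched";
            case SpeechRecognizer.ERROR_RECOGNIZER_BUSY:
                return "RecognitionService is busy";
            case SpeechRecognizer.ERROR_SERVER:
                return "Error from server";
            case SpeechRecognizer.ERROR_SPEECH_TIMEOUT:
                return "No speech input";
            default:
                return "Unknown error code: " + error;
        }
    }

Inside onError it can then be logged or shown to the user, for example Log.e(TAG, "onError: " + getErrorText(error));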

7) onResults

The service calls this method when recognition results are ready. The results are delivered in a Bundle as an ArrayList of strings, as shown in the sketch below.
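
For reference, the results Bundle stores the matches under SpeechRecognizer.RESULTS_RECOGNITION and, when the recognition service supplies them, confidence values under SpeechRecognizer.CONFIDENCE_SCORES. A small sketch of reading both inside the callback:

            @Override
            public void onResults(Bundle results) {
                ArrayList<String> matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
                float[] scores = results.getFloatArray(SpeechRecognizer.CONFIDENCE_SCORES); // may be null
                if (matches != null) {
                    for (int i = 0; i < matches.size(); i++) {
                        String confidence = (scores != null && i < scores.length)
                                ? String.valueOf(scores[i]) : "n/a";
                        Log.d(TAG, matches.get(i) + " (confidence: " + confidence + ")");
                    }
                }
            }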

8) onPartialResults

The service calls this method when partial recognition results are available.
It can be called at any time between onBeginningOfSpeech and onResults, whenever partial results are ready.
It may be called zero, one, or multiple times for each call to startListening,
and it returns the results as an ArrayList of strings, just like onResults.
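
Note that partial results are only delivered if they are requested when building the recognition intent, via RecognizerIntent.EXTRA_PARTIAL_RESULTS. A short sketch of both sides:

// When building the recognition intent (shown later), also request partial results:
mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_PARTIAL_RESULTS, true);

// ...and read them in the callback exactly like final results:
@Override
public void onPartialResults(Bundle partialResults) {
    ArrayList<String> partial =
            partialResults.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);
    if (partial != null && !partial.isEmpty()) {
        Log.d(TAG, "Partial: " + partial.get(0));
    }
}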

9) onEvent

onEvent is reserved for adding future events.

NOTE:

If you want more details about these callbacks, refer to the official Android documentation for RecognitionListener.

The full source code is given below.

First, create activity_main.xml for the layout. This layout defines the user interface that the user interacts with.

<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:background="@color/colorAccent"
    android:gravity="center"
    android:orientation="vertical"
    tools:context=".MainActivity">

    <TextView
        android:id="@+id/tvSpeechRecognizationTextTitle"
        android:layout_width="match_parent"
        android:layout_height="wrap_content"
        android:layout_marginTop="16dp"
        android:layout_marginBottom="8dp"
        android:gravity="center"
        android:text="@string/speech_recognition_title"
        android:textColor="#FFFFFF"
        android:textSize="26sp"
        android:textStyle="bold" />

    <ImageView
        android:id="@+id/ivMic"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:src="@mipmap/speech" />
    <TextView
        android:id="@+id/tvTapMic"
        android:layout_width="match_parent"
        android:visibility="visible"
        android:layout_height="wrap_content"
        android:layout_marginTop="16dp"
        android:gravity="center"
        android:text="@string/speech_recognition_tap_to_speak"
        android:textColor="#FFFFFF"
        android:textSize="20sp"
        android:textStyle="italic" />

    <TextView
        android:id="@+id/tvSpeechRecognizationText"
        android:layout_width="match_parent"
        android:visibility="invisible"
        android:layout_height="wrap_content"
        android:layout_marginTop="16dp"
        android:gravity="center"
        android:text="@string/speech_recognition_indicator_prepare"
        android:textColor="#FFFFFF"
        android:textSize="20sp"
        android:textStyle="italic" />
</LinearLayout>

The layout above uses several strings, which are defined in strings.xml. This file lives at res -> values -> strings.xml:

<resources>
    <string name="app_name">SpeechRecog</string>

    <string name="speech_recognition_title">Speech Recognition</string>
    <string name="speech_recognition_tap_to_speak">Tap on Mic and speak</string>
    <string name="speech_recognition_indicator_prepare">Please wait, we are preparing...</string>
    <string name="speech_recognition_indicator_speak">ok, Speak now...</string>
</resources>
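
Before wiring up the recognizer, keep in mind that SpeechRecognizer needs the RECORD_AUDIO permission, declared in AndroidManifest.xml and, on Android 6.0 and above, also requested at runtime. A minimal sketch (ensureAudioPermission and REQUEST_RECORD_AUDIO are names chosen just for this example):

import android.Manifest;
import android.content.pm.PackageManager;
import android.support.v4.app.ActivityCompat;
import android.support.v4.content.ContextCompat;

// AndroidManifest.xml:
// <uses-permission android:name="android.permission.RECORD_AUDIO" />

// In MainActivity, call this before speak() on Android 6.0+:
private static final int REQUEST_RECORD_AUDIO = 101; // arbitrary request code

private boolean ensureAudioPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
            != PackageManager.PERMISSION_GRANTED) {
        ActivityCompat.requestPermissions(this,
                new String[]{Manifest.permission.RECORD_AUDIO}, REQUEST_RECORD_AUDIO);
        return false;
    }
    return true;
}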

Now, we declare the speech recognizer and configure the recognition intent with some properties:

        mSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
        mAudioManager = (AudioManager) getSystemService(Context.AUDIO_SERVICE);

        mSpeechRecognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, this.getPackageName());

The code above creates the SpeechRecognizer object and builds the recognition intent with EXTRA_LANGUAGE_MODEL and EXTRA_CALLING_PACKAGE,
which specify the language model to use and the calling package, respectively.
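
If needed, a few more optional extras can be added to the same intent, for example to force a particular recognition language or to prefer on-device recognition (EXTRA_PREFER_OFFLINE requires API 23 or higher). A brief sketch:

        // Optional: request a specific recognition language (an IETF language tag such as "en-US").
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, "en-US");
        // Optional: prefer on-device recognition when it is available (API 23+).
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_PREFER_OFFLINE, true);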

Let’s see the full source code required for the MainActivity.java file:

import android.content.Context;
import android.content.Intent;
import android.media.AudioManager;
import android.os.Bundle;
import android.speech.RecognitionListener;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.support.v7.app.AppCompatActivity;
import android.util.Log;
import android.view.View;
import android.widget.ImageView;
import android.widget.TextView;

import java.util.ArrayList;

public class MainActivity extends AppCompatActivity {
    private String TAG = "MainActivity";

    private TextView tvSpeechRecognizationText;

    Intent mSpeechRecognizerIntent;
    private SpeechRecognizer mSpeechRecognizer;
    private AudioManager mAudioManager;
    private ImageView ivMic;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        tvSpeechRecognizationText = (TextView) findViewById(R.id.tvSpeechRecognizationText);
        ivMic = (ImageView)findViewById(R.id.ivMic);

        ivMic.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View v) {
                speak();
            }
        });

    }

    private void speak() {

        mSpeechRecognizer = SpeechRecognizer.createSpeechRecognizer(this);
        mAudioManager = (AudioManager) getSystemService(Context.AUDIO_SERVICE);

        mSpeechRecognizerIntent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_MAX_RESULTS, 1);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        mSpeechRecognizerIntent.putExtra(RecognizerIntent.EXTRA_CALLING_PACKAGE, this.getPackageName());

//        mAudioManager.setStreamMute(AudioManager.STREAM_MUSIC, true);
        // Register the listener before starting to listen so that no callbacks are missed.
        mSpeechRecognizer.setRecognitionListener(new RecognitionListener() {
            @Override
            public void onReadyForSpeech(Bundle params) {
                tvSpeechRecognizationText.setVisibility(View.VISIBLE);
                tvSpeechRecognizationText.setText("onReadyForSpeech");
                Log.e(TAG, "onReadyForSpeech");
            }

            @Override
            public void onBeginningOfSpeech() {
                tvSpeechRecognizationText.setText("onBeginningOfSpeech");
                Log.e(TAG, "onBeginningOfSpeech");
            }

            @Override
            public void onRmsChanged(float rmsdB) {
            }

            @Override
            public void onBufferReceived(byte[] buffer) {
                Log.e(TAG, "onBufferReceived1");
            }

            @Override
            public void onEndOfSpeech() {
                Log.e(TAG, "onEndOfSpeech");
            }

            @Override
            public void onError(int error) {
                Log.e(TAG, "onError1");
            }

            @Override
            public void onResults(Bundle results) {
                tvSpeechRecognizationText.setVisibility(View.VISIBLE);
                ArrayList<String> matches = results.getStringArrayList(SpeechRecognizer.RESULTS_RECOGNITION);

                String listString = "";
                for (String s : matches) {
                    listString += s + "\t";
                }
                tvSpeechRecognizationText.setText("Result is : " + listString);
            }

            @Override
            public void onPartialResults(Bundle results) {
            }

            @Override
            public void onEvent(int eventType, Bundle params) {

            }
        });

        // Start listening only after the listener has been registered.
        mSpeechRecognizer.startListening(mSpeechRecognizerIntent);
    }
}

When we tap the mic icon, the recognizer starts and onReadyForSpeech is called.
As soon as we begin speaking, onBeginningOfSpeech fires and the recognizer keeps listening to our voice;
when we stop, onEndOfSpeech is called and the recognized text is delivered to onResults.
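
One thing the code above does not show: the SpeechRecognizer keeps a connection to the recognition service, so it is good practice to release it when the activity goes away. A minimal sketch:

    @Override
    protected void onDestroy() {
        super.onDestroy();
        // Release the connection to the recognition service.
        if (mSpeechRecognizer != null) {
            mSpeechRecognizer.destroy();
            mSpeechRecognizer = null;
        }
    }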
