07/09/2018, 18:14

Text Recognition for Android using Google Mobile Vision

This is my first post on Viblo. Today I will create a simple Android app that uses the Google Mobile Vision API for optical character recognition (OCR).

First, you can visit here for the Text Recognition API Overview. The important part for you is the list of recognized languages: the Text API can recognize text in any Latin-based language. This includes, but is not limited to:

  • Catalan
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • German
  • Hungarian
  • Italian
  • Latin
  • Norwegian
  • Polish
  • Portuguese
  • Romanian
  • Spanish
  • Swedish
  • Tagalog
  • Turkish

Let’s go!

  1. Create a new project. I think everybody can do it very fast =)).
  2. Add the Mobile Vision dependency to your app-level build.gradle like this, very easy!
dependencies {
    ...
    compile 'com.google.android.gms:play-services-vision:11.0.4'
    ...
}
  3. VERY IMPORTANT => To use this library, you may need to update your installed version of Google Repository in the SDK tools. Make sure your version of Google Repository is up to date; it should be at least version 26.

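  4. Don’t forget to add the CAMERA permission and the OCR meta-data to the AndroidManifest.xml file like this. The uses-permission element goes directly under <manifest>, and the meta-data element goes inside <application>:

 <uses-permission android:name="android.permission.CAMERA"/>
 <meta-data android:name="com.google.android.gms.vision.DEPENDENCIES" android:value="ocr"/>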
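On Android 6.0 (API 23) and later the manifest entry alone is not enough: CAMERA also has to be granted at runtime, which is why startCameraSource() further down calls requestPermissions(). The answer arrives in the activity's onRequestPermissionsResult() callback, which this post does not show. Here is a minimal sketch of the check that callback would need, written as plain Java so it runs without a device. The class name, the method name and the value 101 are made up for illustration; in the app, requestPermissionID plays the role of the request code.

```java
public class CameraPermissionResult {
    // Same value as android.content.pm.PackageManager.PERMISSION_GRANTED (0).
    public static final int PERMISSION_GRANTED = 0;
    // Same value as android.content.pm.PackageManager.PERMISSION_DENIED (-1).
    public static final int PERMISSION_DENIED = -1;
    // Hypothetical request code, standing in for requestPermissionID.
    public static final int REQUEST_CAMERA = 101;

    /**
     * Returns true only if the callback belongs to our camera request
     * and the first (and only) requested permission was granted.
     * An empty grantResults array means the request was cancelled.
     */
    public static boolean cameraGranted(int requestCode, int[] grantResults) {
        return requestCode == REQUEST_CAMERA
                && grantResults.length > 0
                && grantResults[0] == PERMISSION_GRANTED;
    }

    public static void main(String[] args) {
        System.out.println(cameraGranted(REQUEST_CAMERA, new int[]{PERMISSION_GRANTED}));
        System.out.println(cameraGranted(REQUEST_CAMERA, new int[]{PERMISSION_DENIED}));
        System.out.println(cameraGranted(REQUEST_CAMERA, new int[]{}));
    }
}
```

In the activity you would override onRequestPermissionsResult(int requestCode, String[] permissions, int[] grantResults), run this check, and call mCameraSource.start(mCameraView.getHolder()) only when it returns true.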
  5. Next, create a simple layout for the activity that looks like this:
<?xml version="1.0" encoding="utf-8"?>
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    xmlns:tools="http://schemas.android.com/tools"
    android:layout_width="match_parent"
    android:layout_height="match_parent"
    android:orientation="vertical"
    android:weightSum="5"
    tools:context="com.tuts.prakash.simpleocr.MainActivity">
    <SurfaceView
        android:id="@+id/surfaceView"
        android:layout_width="match_parent"
        android:layout_height="0dp"
        android:layout_weight="4" />
    <TextView
        android:id="@+id/text_view"
        android:layout_width="match_parent"
        android:layout_height="0dp"
        android:layout_margin="8dp"
        android:layout_weight="1"
        android:gravity="center"
        android:textStyle="bold"
        android:text="@string/txt_message"
        android:textColor="@android:color/black"
        android:textSize="20sp" />
</LinearLayout>
  6. Understand text using OCR with the Mobile Vision Text API for Android. Inside the onCreate() method of MainActivity.java, we will call a startCameraSource() method.
private void startCameraSource() {

        //Create the TextRecognizer
        final TextRecognizer textRecognizer = new TextRecognizer.Builder(getApplicationContext()).build();

        if (!textRecognizer.isOperational()) {
            Log.w(TAG, "Detector dependencies not loaded yet");
        } else {

            //Initialize camerasource to use high resolution and set Autofocus on.
            mCameraSource = new CameraSource.Builder(getApplicationContext(), textRecognizer)
                    .setFacing(CameraSource.CAMERA_FACING_BACK)
                    .setRequestedPreviewSize(1280, 1024)
                    .setAutoFocusEnabled(true)
                    .setRequestedFps(2.0f)
                    .build();

            /**
             * Add call back to SurfaceView and check if camera permission is granted.
             * If permission is granted we can start our cameraSource and pass it to surfaceView
            */
            mCameraView.getHolder().addCallback(new SurfaceHolder.Callback() {
                @Override
                public void surfaceCreated(SurfaceHolder holder) {
                    try {

                        if (ActivityCompat.checkSelfPermission(getApplicationContext(),
                                Manifest.permission.CAMERA) != PackageManager.PERMISSION_GRANTED) {

                            ActivityCompat.requestPermissions(MainActivity.this,
                                    new String[]{Manifest.permission.CAMERA},
                                    requestPermissionID);
                            return;
                        }
                        mCameraSource.start(mCameraView.getHolder());
                    } catch (IOException e) {
                        e.printStackTrace();
                    }
                }

                @Override
                public void surfaceChanged(SurfaceHolder holder, int format, int width, int height) {
                }

                /**
                * Release resources for cameraSource
                */    
                @Override
                public void surfaceDestroyed(SurfaceHolder holder) {
                    mCameraSource.stop();
                }
            });

            //Set the TextRecognizer's Processor.
            textRecognizer.setProcessor(new Detector.Processor<TextBlock>() {
                @Override
                public void release() {
                }

                /**
                 * Detect all the text from camera using TextBlock and save the values into a stringBuilder
                 * which will then be set to the textView.
                 * */
                @Override
                public void receiveDetections(Detector.Detections<TextBlock> detections) {
                    final SparseArray<TextBlock> items = detections.getDetectedItems();
                    if (items.size() != 0 ){

                        mTextView.post(new Runnable() {
                            @Override
                            public void run() {
                                StringBuilder stringBuilder = new StringBuilder();
                                for(int i=0;i<items.size();i++){
                                    TextBlock item = items.valueAt(i);
                                    stringBuilder.append(item.getValue());
                                    stringBuilder.append("\n");
                                }
                                mTextView.setText(stringBuilder.toString());
                            }
                        });
                    }
                }
            });
        }
    }

So what’s going on in the above code? Here are the important parts.

TextRecognizer: This object processes images and determines what text appears within them. Once it’s initialized, it can be used to detect text in all types of images. Note that before we start using it to recognize text, we should check that it’s ready by calling the textRecognizer.isOperational() method. If it returns false, the detector dependencies are not loaded yet, and the code above only logs a warning.

CameraSource: This is a camera manager pre-configured for Vision processing. Here we set the requested preview size to 1280x1024 and turn auto-focus on, because it helps in recognizing smaller text much faster. We also tell the CameraSource to use the rear camera and request 2 frames per second.

mCameraSource = new CameraSource.Builder(getApplicationContext(), textRecognizer)
                    .setFacing(CameraSource.CAMERA_FACING_BACK)
                    .setRequestedPreviewSize(1280, 1024)
                    .setAutoFocusEnabled(true)
                    .setRequestedFps(2.0f)
                    .build();

Detector.Processor<TextBlock> : For TextRecognizer to read text straight from the camera, we have to implement a Detector Processor, which will handle detections as often as they become available.

textRecognizer.setProcessor(new Detector.Processor<TextBlock>() {
                @Override
                public void release() {
                }

                /**
                 * Detect all the text from camera using TextBlock and save the values into a stringBuilder
                 * which will then be set to the textView.
                 * */
                @Override
                public void receiveDetections(Detector.Detections<TextBlock> detections) {
                    final SparseArray<TextBlock> items = detections.getDetectedItems();
                    if (items.size() != 0 ){
                        mTextView.post(new Runnable() {
                            @Override
                            public void run() {
                                StringBuilder stringBuilder = new StringBuilder();
                                for(int i=0;i<items.size();i++){
                                    TextBlock item = items.valueAt(i);
                                    stringBuilder.append(item.getValue());
                                    stringBuilder.append("\n");
                                }
                                // set the stringBuilder value to textView
                                mTextView.setText(stringBuilder.toString());
                            }
                        });
                    }
                }
            });

There are two methods to be implemented here. The first, receiveDetections(), receives TextBlocks from the Detector as they become available. The second, release(), can be used to cleanly get rid of resources.
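The string-building part inside run() does not depend on Android at all, so it can be pulled out into a small helper and tried on a desktop JVM. This is only a sketch, assuming the values of the detected TextBlocks have already been copied into a List<String>. The class and method names are made up for illustration and are not part of Mobile Vision.

```java
import java.util.Arrays;
import java.util.List;

public class DetectedTextJoiner {
    /**
     * Appends every block value followed by a newline,
     * mirroring the loop in the run() body above.
     */
    public static String joinBlocks(List<String> blockValues) {
        StringBuilder stringBuilder = new StringBuilder();
        for (String value : blockValues) {
            stringBuilder.append(value);
            stringBuilder.append("\n");
        }
        return stringBuilder.toString();
    }

    public static void main(String[] args) {
        // Two pretend detections, as if read from two TextBlocks.
        System.out.print(joinBlocks(Arrays.asList("Hello", "World")));
    }
}
```

Inside receiveDetections() you could collect item.getValue() for each TextBlock into a list and pass the helper's result to mTextView.setText().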

  7. Final step

    We take the values of each TextBlock, collect them in a StringBuilder object, and set the result on the textView, which is updated every time there is text in the camera view.

  8. Fire up the app. Finally, you can fire up the app and see a live view of the text recognized from the camera.

And that’s it. We now have a simple OCR app ready with a few lines of code!

Till next time! Happy learning to all.
