
Text Recognition with ML Kit for Android: Getting Started


ML Kit is a mobile SDK from Google that uses machine learning to solve problems such as text recognition, text translation, object detection, face/pose detection, and much more!

The APIs can run on-device, enabling you to process real-time use cases without sending data to servers.

ML Kit provides two groups of APIs:

  • Vision APIs: These include barcode scanning, face detection, text recognition, object detection, and pose detection.
  • Natural Language APIs: You use them whenever you need to identify languages, translate text, and perform smart replies in text conversations.

This tutorial will focus on Text Recognition. With this API, you can extract text from images, documents, and camera input in real time.

In this tutorial, you'll learn:

  • What a text recognizer is and how it groups text elements.
  • The ML Kit Text Recognition features.
  • How to recognize and extract text from an image.

Getting Started

Throughout this tutorial, you'll work with Xtractor. This app lets you take a picture and extract the X usernames in it. You could use this app at a conference whenever a speaker shows their contact details and you'd like to look them up later.

Use the Download Materials button at the top or bottom of this tutorial to download the starter project.

Once downloaded, open the starter project in Android Studio Meerkat or newer. Build and run, and you'll see the following screen:

Clicking the plus button lets you choose a picture from your gallery. However, there won't be any text recognition yet.
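The starter project already implements this image picker for you. Purely for reference, here's a minimal sketch of how such a picker could be wired up in Compose using the Activity Result Photo Picker; the composable name and the onImagePicked callback are hypothetical, and the starter project's own code may differ:


import android.net.Uri
import androidx.activity.compose.rememberLauncherForActivityResult
import androidx.activity.result.PickVisualMediaRequest
import androidx.activity.result.contract.ActivityResultContracts
import androidx.compose.material.icons.Icons
import androidx.compose.material.icons.filled.Add
import androidx.compose.material3.FloatingActionButton
import androidx.compose.material3.Icon
import androidx.compose.runtime.Composable

// Hypothetical sketch: launches the system photo picker and reports the chosen Uri.
@Composable
fun PickImageButton(onImagePicked: (Uri) -> Unit) {
  val launcher = rememberLauncherForActivityResult(
    ActivityResultContracts.PickVisualMedia()
  ) { uri -> uri?.let(onImagePicked) }

  FloatingActionButton(onClick = {
    launcher.launch(
      PickVisualMediaRequest(ActivityResultContracts.PickVisualMedia.ImageOnly)
    )
  }) {
    Icon(Icons.Default.Add, contentDescription = "Pick an image")
  }
}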

Selected picture

Before adding text recognition functionality, you need to understand some concepts.

Using a Text Recognizer

A text recognizer can detect and interpret text from various sources, such as images, videos, or scanned documents. This process is known as OCR, which stands for Optical Character Recognition.

Some text recognition use cases might be:

  • Scanning receipts or books into digital text.
  • Translating signs from static images or the camera.
  • Automatic license plate recognition.
  • Digitizing handwritten forms.

Here's a breakdown of what a text recognizer typically does:

  • Detection: Finds where the text is located within an image, video, or document.
  • Recognition: Converts the detected characters or handwriting into machine-readable text.
  • Output: Returns the recognized text.

ML Kit Text Recognizer segments text into blocks, lines, elements, and symbols.

Here's a brief explanation of each one:

  • Block: Shown in red, a set of text lines, e.g. a paragraph or column.
  • Line: Shown in blue, a set of words.
  • Element: Shown in green, a set of alphanumeric characters, i.e. a word.
  • Symbol: A single alphanumeric character.

ML Kit Text Recognition Features

The API has the following features:

  • Recognizes text in various languages, including Chinese, Devanagari, Japanese, Korean, and Latin. These were included in the latest (V2) version. Check the supported languages here.
  • Can differentiate between a character, a word, a set of words, and a paragraph.
  • Identifies the language of the recognized text.
  • Returns bounding boxes, corner points, rotation information, and confidence scores for all detected blocks, lines, elements, and symbols, as shown in the sketch after this list.
  • Recognizes text in real time.
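For example, once you have a recognition result (you'll produce one later in this tutorial), every line exposes these values. Here's a minimal sketch, assuming a Text result and a logging helper name of my own choosing; confidence and angle are only populated by the V2 models:


import android.util.Log
import com.google.mlkit.vision.text.Text

// Sketch only: logs per-line metadata from a recognition result.
fun logLineDetails(result: Text) {
  for (block in result.textBlocks) {
    for (line in block.lines) {
      val language = line.recognizedLanguage // BCP-47 code of the detected language
      val frame = line.boundingBox           // Rect surrounding the line
      val corners = line.cornerPoints        // Four corner points, useful for rotated text
      val confidence = line.confidence       // Recognition confidence score
      val angle = line.angle                 // Rotation of the line, in degrees
      Log.d("MLKit", "$language conf=$confidence angle=$angle frame=$frame corners=${corners?.size}")
    }
  }
}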

Bundled vs. Unbundled

All ML Kit features use Google-trained machine learning models by default.

Notably, for text recognition, the models can be installed in one of two ways:

  • Unbundled: Models are downloaded and managed via Google Play services.
  • Bundled: Models are statically linked to your app at build time.

Using bundled models means that when users install the app, they also get all of the models, which are usable immediately. When the user uninstalls the app, all of the models are deleted. To update the models, the developer has to update them, publish the app, and the user has to update the app.

On the other hand, if you use unbundled models, they're stored in Google Play services, and the app has to download them before first use. When the user uninstalls the app, the models aren't necessarily deleted; they're only deleted once every app that depends on them has been uninstalled. Whenever a new version of the models is released, it becomes available to the app automatically.

Depending on your use case, you may choose one option or the other.

Use the unbundled option if you want a smaller app size and automatic model updates from Google Play services.

However, use the bundled option if you want your users to have full functionality right after installing the app.
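In Gradle terms, the choice comes down to which artifact you depend on. Here's a minimal sketch for the Latin-script recognizer; the version numbers are only examples and may have changed, so check the current release notes:


// Bundled: the model ships inside your app and works immediately after install.
implementation("com.google.mlkit:text-recognition:16.0.1")

// Unbundled: the model is downloaded and managed by Google Play services.
// implementation("com.google.android.gms:play-services-mlkit-text-recognition:19.0.1")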

Adding Text Recognition Capabilities

To use the ML Kit Text Recognizer, open the app's build.gradle file of the starter project and add the following dependencies:


implementation("com.google.mlkit:text-recognition:16.0.1")
implementation("org.jetbrains.kotlinx:kotlinx-coroutines-play-services:1.10.2")

Here, you're using the bundled version of text-recognition.

Now, sync your project.

Note: To get the latest version of text-recognition, please check here.
To get the latest version of kotlinx-coroutines-play-services, check here. And to support other languages, use the corresponding dependency. You can check them here.
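For instance, if you also wanted the other supported scripts with the bundled approach, the dependencies look like this; the version numbers may have changed since this was written:


implementation("com.google.mlkit:text-recognition-chinese:16.0.1")
implementation("com.google.mlkit:text-recognition-devanagari:16.0.1")
implementation("com.google.mlkit:text-recognition-japanese:16.0.1")
implementation("com.google.mlkit:text-recognition-korean:16.0.1")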

Now, replace the code of recognizeUsernames with the following:


val image = InputImage.fromBitmap(bitmap, 0)
val recognizer = TextRecognition.getClient(TextRecognizerOptions.DEFAULT_OPTIONS)
val result = recognizer.process(image).await()

return emptyList()

You first create an InputImage from a bitmap. Then, you get an instance of a TextRecognizer using the default options, which provide Latin script support. Finally, you process the image with the recognizer.

You'll need to import the following:


import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.latin.TextRecognizerOptions
import com.kodeco.xtractor.ui.theme.XtractorTheme
import kotlinx.coroutines.tasks.await

Note: To support other languages, pass the corresponding option. You can check them here.
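For example, a recognizer for another script is created the same way, just with a different options class (and its matching dependency). Here's a minimal sketch for Japanese:


import com.google.mlkit.vision.text.TextRecognition
import com.google.mlkit.vision.text.japanese.JapaneseTextRecognizerOptions

// Requires the com.google.mlkit:text-recognition-japanese dependency.
val japaneseRecognizer = TextRecognition.getClient(
  JapaneseTextRecognizerOptions.Builder().build()
)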

You can obtain blocks, lines, and elements like this:


// 1
val text = result.text

for (block in result.textBlocks) {
  // 2
  val blockText = block.text
  val blockCornerPoints = block.cornerPoints
  val blockFrame = block.boundingBox

  for (line in block.lines) {
    // 3
    val lineText = line.text
    val lineCornerPoints = line.cornerPoints
    val lineFrame = line.boundingBox

    for (element in line.elements) {
      // 4
      val elementText = element.text
      val elementCornerPoints = element.cornerPoints
      val elementFrame = element.boundingBox
    }
  }
}

Here's a brief explanation of the code above:

  1. First, you get the full recognized text.
  2. Then, for each block, you get the text, the corner points, and the frame.
  3. For each line in a block, you get the text, the corner points, and the frame.
  4. Finally, for each element in a line, you get the text, the corner points, and the frame.

However, you only need the elements that represent X usernames, so replace the emptyList() with the following code:


return result.textBlocks
  .flatMap { it.lines }
  .flatMap { it.elements }
  .filter { element -> element.text.isXUsername() }
  .mapNotNull { element ->
    element.boundingBox?.let { boundingBox ->
      UsernameBox(element.text, boundingBox)
    }
  }

You flatten the text blocks into lines and the lines into elements, then filter for the elements that are X usernames. Finally, you map them to UsernameBox, a class that contains the username and its bounding box.

The bounding box is used to draw rectangles over the usernames.
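isXUsername() is an extension function that ships with the starter project. If you're curious, here's a minimal sketch of how such a check could be implemented with a regular expression; the starter project's actual helper may differ:


// Hypothetical implementation: an X handle is "@" followed by letters, digits,
// or underscores. matches() requires the whole string to match the pattern.
private val X_USERNAME_REGEX = Regex("@\\w{1,15}")

fun String.isXUsername(): Boolean = X_USERNAME_REGEX.matches(trim())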

Now, run the app again, choose a picture from your gallery, and you'll see the X usernames recognized:

Username recognition

Congratulations! You've just learned how to use ML Kit Text Recognition.
