Developer Guide

Add Amazon Comprehend to Spring Boot Project

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. No machine learning experience is required. It is useful when you have scanned a document and need to extract data from it: once the text is available, the service automatically picks out entities such as organizations, people, and dates.

Use Cases

1. Bills

2. Medical Receipts

3. Forms

4. Images with written text

5. Feedback forms

6. Tables in the document

and more, using OCR-based scanning and NLP processing.

Steps for Integration

1. AWS Comprehend SDK

Add the dependency below to pom.xml to pull in the AWS Comprehend classes.

<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-comprehend -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-comprehend</artifactId>
    <version>1.11.759</version>
</dependency>

2. Java Service

Create a Java service and name it as you like, for example AwsComprehendService.java, and add the methods below. Alternatively, add them to one of your existing services.

3. Initialize Comprehend Client

// awsAccessKey, awsSecretKey and awsRegion are assumed to be fields of this service.
AmazonComprehend comprehendClient() {
    log.debug("Initialize Comprehend Client");
    BasicAWSCredentials awsCreds = new BasicAWSCredentials(awsAccessKey, awsSecretKey);
    AWSStaticCredentialsProvider awsStaticCredentialsProvider = new AWSStaticCredentialsProvider(awsCreds);
    return AmazonComprehendClientBuilder.standard()
            .withCredentials(awsStaticCredentialsProvider)
            .withRegion(awsRegion)
            .build();
}

 


4. Detect entities method

The method below returns the entities detected in a given text.

public List<Entity> detectEntitiesWithComprehend(String text) {
    log.debug("Method to Detect Entities With Amazon Comprehend {}", text);
    DetectEntitiesRequest detectEntitiesRequest = new DetectEntitiesRequest()
            .withText(text)
            .withLanguageCode("en");
    DetectEntitiesResult detectEntitiesResult = comprehendClient().detectEntities(detectEntitiesRequest);
    // Each Entity carries its type, text, confidence score and offsets.
    return detectEntitiesResult.getEntities();
}

Note: The text limit for this call is 5000 bytes, so longer input has to be trimmed first. The method below trims a string to a given number of UTF-8 bytes.

// Trim the text before sending it to Comprehend.
text = trimByBytes(text, 5000);

String trimByBytes(String str, int lengthOfBytes) {
    byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    if (lengthOfBytes < buffer.limit()) {
        buffer.limit(lengthOfBytes);
    }
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    // Drop a multi-byte character that was cut in half at the limit.
    decoder.onMalformedInput(CodingErrorAction.IGNORE);
    try {
        return decoder.decode(buffer).toString();
    } catch (CharacterCodingException e) {
        // We will never get here.
        throw new RuntimeException(e);
    }
}
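
Trimming discards everything past the limit. If the whole text needs to be analysed, one alternative is to split it into pieces that each fit within the byte limit and send them one at a time. The sketch below is a hypothetical helper: the names Utf8Chunker and splitByBytes are illustrative and not part of the AWS SDK. It only shows the splitting step, under the assumption that cutting between arbitrary characters is acceptable.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of the AWS SDK): splits text by UTF-8 size. */
final class Utf8Chunker {

    private Utf8Chunker() {}

    /** Number of bytes the given code point occupies when encoded as UTF-8. */
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    /**
     * Splits text into consecutive pieces whose UTF-8 encoding is at most
     * maxBytes long, without cutting a code point in half.
     */
    static List<String> splitByBytes(String text, int maxBytes) {
        if (maxBytes < 4) {
            // A single code point can need up to 4 bytes.
            throw new IllegalArgumentException("maxBytes must be at least 4");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            int len = utf8Length(cp);
            // Start a new chunk when the next code point would not fit.
            if (currentBytes + len > maxBytes) {
                chunks.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            current.appendCodePoint(cp);
            currentBytes += len;
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }
}
```

Each piece returned by splitByBytes(text, 5000) could then be passed to detectEntitiesWithComprehend. Offsets in the results would be relative to the piece, not the full text, and an entity that straddles a split point may be missed, so splitting on sentence or paragraph boundaries is usually preferable when the text allows it.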


5. Output Result

The method returns the entities as a List<Entity>, with one element per entity detected in the text we passed. Sample output is below.

[
    {
        "score": 0.4398592,
        "type": "ORGANIZATION",
        "text": "JSON",
        "beginOffset": 4930,
        "endOffset": 4934
    },
    {
        "score": 0.98848945,
        "type": "ORGANIZATION",
        "text": "Apple",
        "beginOffset": 4960,
        "endOffset": 4965
    }
]
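
In the sample above, "JSON" is labelled an ORGANIZATION with a score of only about 0.44, while "Apple" scores about 0.99. A common follow-up step is to drop low-confidence entities. The sketch below illustrates that with a hypothetical DetectedEntity class that stands in for the SDK's Entity and holds only the fields used here. The class name, the keepConfident method, and the 0.8 threshold are assumptions chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of the AWS SDK): filters entities by score. */
final class EntityFilter {

    private EntityFilter() {}

    /** Stand-in for the SDK's Entity, reduced to the fields used below. */
    static final class DetectedEntity {
        final String type;
        final String text;
        final float score;

        DetectedEntity(String type, String text, float score) {
            this.type = type;
            this.text = text;
            this.score = score;
        }
    }

    /** Keeps only the entities whose score is at least minScore. */
    static List<DetectedEntity> keepConfident(List<DetectedEntity> entities, float minScore) {
        return entities.stream()
                .filter(e -> e.score >= minScore)
                .collect(Collectors.toList());
    }
}
```

With the real List<Entity>, the same filter would read the score through the entity's getScore() getter. The right threshold depends on the data, so 0.8 is only a starting point.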

Below is the snippet with all methods and imports. Visit the link for the full code on GitHub.

Link to Code

We used the synchronous method for processing here. The asynchronous one will be covered next.


This post was last modified on August 29, 2022 11:53 pm

Balvinder Singh

Founder And Editor at Tekraze.com. Loves to write about technology, gaming, business, tips and tricks. Working as a Senior Software Engineer in Infosys India. Exploring different blockchains as well.

