Add Amazon Comprehend to Spring Boot Project

Posted on April 6, 2020 (updated August 29, 2022) by Balvinder Singh

Amazon Comprehend is a natural language processing (NLP) service that uses machine learning to find insights and relationships in text. No machine learning experience is required. It is useful when you have text taken from a document, such as a bill or a form, and need to pull structured data like names, organizations, and dates out of it. Note that Comprehend analyzes text: for images or scanned documents, the text usually has to be extracted first with an OCR step (for example, Amazon Textract) and then passed to Comprehend.

Table of Contents

  • Use Cases
  • Steps for Integration
      • 1. AWS Comprehend SDK
      • 2. Java Service
      • 3. Initialize Comprehend Client
      • 4. Detect entities method
      • 5. Output Result

Use Cases

1. Bills

2. Medical Receipts

3. Forms

4. Images with written text

5. Feedback forms

6. Tables in the document

and more, once the text has been extracted (for example, by OCR) and passed to Comprehend for NLP processing.

Steps for Integration

1. AWS Comprehend SDK

Add the dependency below to pom.xml to pull in the AWS Comprehend classes.

<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk-comprehend -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-comprehend</artifactId>
    <version>1.11.759</version>
</dependency>

2. Java Service

Create a Java service and name it as you like, for example AwsComprehendService.java, and add the methods below. You can also add them to an existing service of your own.

3. Initialize Comprehend Client

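import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.comprehend.AmazonComprehend;
import com.amazonaws.services.comprehend.AmazonComprehendClientBuilder;

// awsAccessKey, awsSecretKey and awsRegion are assumed to be fields of the service.
AmazonComprehend comprehendClient() {
    log.debug("Initialize Comprehend client");
    BasicAWSCredentials awsCreds = new BasicAWSCredentials(awsAccessKey, awsSecretKey);
    AWSStaticCredentialsProvider awsStaticCredentialsProvider = new AWSStaticCredentialsProvider(awsCreds);
    return AmazonComprehendClientBuilder.standard().withCredentials(awsStaticCredentialsProvider)
            .withRegion(awsRegion).build();
}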


4. Detect entities method

The method below returns the entities detected in a piece of text.

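import java.util.List;
import com.amazonaws.services.comprehend.model.DetectEntitiesRequest;
import com.amazonaws.services.comprehend.model.DetectEntitiesResult;
import com.amazonaws.services.comprehend.model.Entity;

public List<Entity> detectEntitiesWithComprehend(String text) {
    log.debug("Method to Detect Entities With Amazon Comprehend {}", text);
    DetectEntitiesRequest detectEntitiesRequest = new DetectEntitiesRequest().withText(text).withLanguageCode("en");
    DetectEntitiesResult detectEntitiesResult = comprehendClient().detectEntities(detectEntitiesRequest);
    // Return the result directly; the original assigned it to an undeclared variable.
    return detectEntitiesResult.getEntities();
}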

Note: Text passed this way is limited to 5000 bytes (UTF-8 encoded). If your text can be longer, trim it first with the method below.

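import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

/** Trims a string so that its UTF-8 encoding is at most lengthOfBytes bytes long. */
String trimByBytes(String str, int lengthOfBytes) {
    byte[] bytes = str.getBytes(StandardCharsets.UTF_8);
    ByteBuffer buffer = ByteBuffer.wrap(bytes);
    if (lengthOfBytes < buffer.limit()) {
        buffer.limit(lengthOfBytes);
    }
    CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
    // Silently drop a multi-byte character that was cut in half at the limit.
    decoder.onMalformedInput(CodingErrorAction.IGNORE);
    try {
        return decoder.decode(buffer).toString();
    } catch (CharacterCodingException e) {
        // Not expected, because malformed input is ignored above.
        throw new RuntimeException(e);
    }
}

// Usage: trim the text before sending it to Comprehend.
text = trimByBytes(text, 5000);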
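
Trimming discards everything past the limit. If you need the whole text analyzed, one alternative is to split it into pieces that each fit within the limit and send them one at a time. The sketch below is not part of the original code: ByteChunker, splitByBytes and utf8Length are hypothetical names, and it uses only the Java standard library. It splits on character boundaries, so an entity that straddles two chunks could be missed; splitting on sentence boundaries would be a gentler variant.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the AWS SDK.
class ByteChunker {

    /** Splits text into pieces whose UTF-8 encoding is at most maxBytes long each. */
    static List<String> splitByBytes(String text, int maxBytes) {
        if (maxBytes < 4) {
            // A single code point can need up to 4 bytes in UTF-8.
            throw new IllegalArgumentException("maxBytes must be at least 4");
        }
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int currentBytes = 0;
        int i = 0;
        while (i < text.length()) {
            int codePoint = text.codePointAt(i);
            int size = utf8Length(codePoint);
            // Start a new chunk if this character would exceed the limit.
            if (currentBytes + size > maxBytes) {
                chunks.add(current.toString());
                current.setLength(0);
                currentBytes = 0;
            }
            current.appendCodePoint(codePoint);
            currentBytes += size;
            i += Character.charCount(codePoint);
        }
        if (current.length() > 0) {
            chunks.add(current.toString());
        }
        return chunks;
    }

    /** Number of bytes a code point occupies when encoded as UTF-8. */
    static int utf8Length(int codePoint) {
        if (codePoint < 0x80) return 1;
        if (codePoint < 0x800) return 2;
        if (codePoint < 0x10000) return 3;
        return 4;
    }

    public static void main(String[] args) {
        // Tiny limit for demonstration; use 5000 for the limit mentioned above.
        for (String chunk : splitByBytes("h\u00e9llo w\u00f6rld", 4)) {
            int bytes = chunk.getBytes(StandardCharsets.UTF_8).length;
            System.out.println("[" + chunk + "] " + bytes + " bytes");
        }
    }
}
```

Each chunk could then be passed to detectEntitiesWithComprehend in turn. Keep in mind that the offsets in each result are relative to the chunk that was sent, not to the full text.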

5. Output Result

We now have the entities as a list. The List<Entity> holds the entities detected in the text we passed in. Sample output is below.

[
    {
        "score": 0.4398592,
        "type": "ORGANIZATION",
        "text": "JSON",
        "beginOffset": 4930,
        "endOffset": 4934
    },
    {
        "score": 0.98848945,
        "type": "ORGANIZATION",
        "text": "Apple",
        "beginOffset": 4960,
        "endOffset": 4965
    }
]
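
In the sample above, "JSON" is tagged as an ORGANIZATION with a score of only about 0.44, while "Apple" scores about 0.99. A common follow-up is to drop low-confidence entities before using the result. The sketch below illustrates that idea with the standard library only: DetectedEntity is a hypothetical stand-in for the SDK's Entity class, and the 0.8 threshold is an arbitrary value chosen for the example, not a recommendation from AWS.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of the AWS SDK.
class EntityFilter {

    /** Stand-in for the SDK's Entity class, holding only the fields used here. */
    static final class DetectedEntity {
        final String text;
        final String type;
        final float score;

        DetectedEntity(String text, String type, float score) {
            this.text = text;
            this.type = type;
            this.score = score;
        }
    }

    /** Keeps only the entities whose confidence score is at least minScore. */
    static List<DetectedEntity> keepConfident(List<DetectedEntity> entities, float minScore) {
        return entities.stream()
                .filter(e -> e.score >= minScore)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // The two entities from the sample output above.
        List<DetectedEntity> sample = Arrays.asList(
                new DetectedEntity("JSON", "ORGANIZATION", 0.4398592f),
                new DetectedEntity("Apple", "ORGANIZATION", 0.98848945f));

        for (DetectedEntity e : keepConfident(sample, 0.8f)) {
            System.out.println(e.type + ": " + e.text); // prints "ORGANIZATION: Apple"
        }
    }
}
```

With the real SDK classes, the same filter would read the score through the Entity getters (getScore(), getType(), getText()) instead of the fields used here.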

The full code, with all methods and imports, is available on GitHub at the link below.

Link to Code

We used the synchronous API here; the asynchronous one will be covered in a later post. Please share your views in the comments below.
