Drawing bounding boxes on images with labels detected by Amazon Rekognition

Hello, this is Hirono.

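Amazon Rekognition, the AWS image and video analysis service, has a feature called label detection. It analyzes which objects and concepts appear in an image or video, but the only thing it returns is label information in JSON format. I wished it would also return an image processed so that the detected items are visible at a glance, so I built that feature myself and will introduce it here.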

What is a bounding box?
References
Processing flow
1. DetectLabels step
2. DrawBoundingBoxes step
AWS CloudFormation template
Summary

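What is a bounding box?

A bounding box is a frame drawn around a label found by analyzing the original image, to show where in the image that label is located.

The images below should make this clear. I built a feature that actually performs this kind of processing.

Before analysis

After analysis

The detected objects are an apple, a knife, and scissors.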

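References

This time I built a feature that draws bounding boxes on images, not videos.

It uses DetectLabels, one of the Amazon Rekognition APIs. As the following document describes, it returns the analysis result as JSON data. Some of the detected labels carry coordinate information under the key BoundingBox (inside each entry of the label's Instances list), which indicates where in the image the label is located.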

Detecting labels in an image - Amazon Rekognition

You can use the DetectLabels operation to detect labels (objects and concepts) in an image and retrieve information abou...

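The official AWS documentation includes sample code showing how to draw this coordinate information on an image. The sample is for a different API, DetectFaces, but its JSON result is similar, so it was a very useful reference. The bounding box coordinates are also explained there.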

Displaying bounding boxes - Amazon Rekognition

Amazon Rekognition Image operations can return bounding boxes coordinates for items that are detected in images. For exa...
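As a rough illustration of the coordinate convention explained in that document: each BoundingBox value is a ratio of the overall image size, so it has to be multiplied by the image width or height to get pixels. The helper name `to_pixel_box` and the sample numbers below are made up for illustration.

```python
def to_pixel_box(box, img_width, img_height):
    """Convert a Rekognition BoundingBox (ratios of the image size) to pixels.

    Returns (left, top, right, bottom) in pixel units.
    """
    left = img_width * box["Left"]
    top = img_height * box["Top"]
    right = left + img_width * box["Width"]
    bottom = top + img_height * box["Height"]
    return (left, top, right, bottom)


# Hypothetical values for an 800 x 600 image.
sample_box = {"Left": 0.25, "Top": 0.5, "Width": 0.5, "Height": 0.25}
print(to_pixel_box(sample_box, 800, 600))  # (200.0, 300.0, 600.0, 450.0)
```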

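Processing flow

Based on the references above, I built an AWS Step Functions state machine that runs in the backend and works together with an app. An AWS Lambda function alone would be enough, but I have processing to run before and after it, so I chose a state machine. Those surrounding steps are outside the scope of this article and are omitted.

Run the state machine by passing it JSON data in the following format.

{
  "inputbucket": "name of the S3 bucket that stores the image to analyze",
  "inputkey": "key of the image to analyze",
  "outputbucket": "name of the S3 bucket in which to save the image with bounding boxes drawn",
  "outputkey": "key under which to save the image with bounding boxes drawn"
}

When processing completes, the image is saved to the Amazon S3 bucket and key specified by outputbucket and outputkey.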
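For reference, one way to start an execution from Python is sketched below. It assumes boto3 and valid AWS credentials are available. The function names, bucket names, and keys are placeholders for illustration; the article itself does not show how the app starts the state machine.

```python
import json


def build_execution_input(input_bucket, input_key, output_bucket, output_key):
    """Build the JSON string that the state machine expects as its input."""
    return json.dumps({
        "inputbucket": input_bucket,
        "inputkey": input_key,
        "outputbucket": output_bucket,
        "outputkey": output_key,
    })


def start_drawing(state_machine_arn, execution_input):
    """Start one execution of the state machine and return its execution ARN."""
    import boto3  # imported here so the input builder works without boto3

    sfn = boto3.client("stepfunctions")
    response = sfn.start_execution(
        stateMachineArn=state_machine_arn,
        input=execution_input,
    )
    return response["executionArn"]


# Placeholder bucket and key names.
payload = build_execution_input(
    "example-input-bucket", "images/kitchen.jpg",
    "example-output-bucket", "images/kitchen-boxed.png",
)
print(payload)
```

Note that the Lambda function in this article saves the file as /tmp/result.png before uploading, so the uploaded object contains PNG data whatever extension outputkey has.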

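DetectLabels step

This step simply passes inputbucket and inputkey to the DetectLabels API and has it analyze the image. In this article's configuration, only labels with a confidence of 90% or higher are returned. (Otherwise far too many labels are detected.)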
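The state machine calls DetectLabels through the Step Functions AWS SDK integration, so no code is needed for this step. Purely to show what the request looks like, here is a sketch of the equivalent call made from Python with boto3. The function names are hypothetical and this is not part of the article's implementation.

```python
def build_detect_labels_request(bucket, key, min_confidence=90):
    """Build keyword arguments for Rekognition DetectLabels on an S3 object."""
    return {
        "Image": {"S3Object": {"Bucket": bucket, "Name": key}},
        "MinConfidence": min_confidence,
    }


def detect_labels(bucket, key, min_confidence=90):
    """Call DetectLabels and return the list of detected labels."""
    import boto3  # imported lazily so the request builder has no dependency

    rekognition = boto3.client("rekognition")
    response = rekognition.detect_labels(
        **build_detect_labels_request(bucket, key, min_confidence)
    )
    return response["Labels"]


# Placeholder bucket and key names.
request = build_detect_labels_request("example-input-bucket", "images/kitchen.jpg")
print(request)
```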

There are not many parameters to specify. For details, see the following official documentation.

Detecting labels in an image - Amazon Rekognition

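DrawBoundingBoxes step

This step takes the result of the DetectLabels step, loads the input image file, draws the bounding boxes, and saves the image to S3. This is the main topic of this article.

It runs the following Lambda function (Python 3.9). An explanation follows the code.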

import boto3
import io
from PIL import Image, ImageDraw
def lambda_handler(event, context):
  try:
    print(event)
    # Load image from S3 bucket
    s3_connection = boto3.resource('s3')
    s3_object = s3_connection.Object(event['inputbucket'], event['inputkey'])
    s3_response = s3_object.get()
    stream = io.BytesIO(s3_response['Body'].read())
    image=Image.open(stream)
    imgWidth, imgHeight = image.size
    draw = ImageDraw.Draw(image)
    # Calculate and draw bounding boxes
    for label in event['result1']['Labels']:
      for instance in label['Instances']:
        box = instance['BoundingBox']
        left = imgWidth * box['Left']
        top = imgHeight * box['Top']
        width = imgWidth * box['Width']
        height = imgHeight * box['Height']
        points = (
          (left, top),
          (left + width, top),
          (left + width, top + height),
          (left, top + height),
          (left, top)
        )
        draw.line(points, fill='#00d400', width=2)
    # Save image in S3 bucket
    tmpfile = '/tmp/result.png'
    image.save(tmpfile)
    s3_connection.meta.client.upload_file(tmpfile, event['outputbucket'], event['outputkey'])
    # Return result
    return {
      "labelCount": len(event['result1']['Labels'])
    }
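  except Exception as e:
    # Log the error, then re-raise so the invocation is reported as failed
    # (an exception object cannot be serialized as a JSON return value)
    print(e)
    raise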

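1. Preparing the required modules

The code first imports the required modules. Image processing uses Pillow, a module that is not provided by default in AWS Lambda functions, so a module file has to be created in advance so that it can be loaded through a Lambda Layer.

Create pillow940.zip by referring to the following articles.
Note: Any name will do. The version was 9.4.0 at the time of writing, so I used that name in this article.

How to create a module file and Lambda Layer for Python 3.9

Preparing a pandas Lambda Layer that can be used with AWS Lambda (Python 3.12)

pandas is a library often used for data analysis and processing. This article introduces the steps to prepare a pandas Lambda Layer that runs on AWS Lambda (Python 3.12).

Installing Pillow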

Pillow

Pillow is the friendly PIL fork by Jeffrey A. Clark and contributors. PIL is the Python Imaging Library by Fredrik Lundh...
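The linked articles cover the actual procedure. As a minimal sketch of the packaging step only: a Python Lambda Layer zip needs its modules under a top-level python/ folder. The sketch assumes Pillow has already been installed into a local directory (for example with `pip install pillow -t <directory>`) in an environment compatible with the Lambda runtime, since Pillow contains compiled code. The function name `zip_layer` and the directory names are hypothetical.

```python
import os
import zipfile


def zip_layer(package_dir, zip_path):
    """Zip the contents of package_dir under a top-level python/ folder."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(package_dir):
            for name in files:
                full_path = os.path.join(root, name)
                rel_path = os.path.relpath(full_path, package_dir)
                # Lambda looks for Python layer content under python/
                arcname = "python/" + rel_path.replace(os.sep, "/")
                zf.write(full_path, arcname)
    return zip_path


# Example (hypothetical paths): zip_layer("pillow_build", "pillow940.zip")
```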

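Once the module file is ready, you can create the Lambda Layer and have the Lambda function load the module. In this article, the Lambda Layer and the Lambda function are provisioned by the AWS CloudFormation template shown at the bottom of the article.

2. Code that draws the bounding boxes

The rest simply follows the AWS documentation introduced in the References section at the beginning.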

Displaying bounding boxes - Amazon Rekognition

However, the result format differs between DetectFaces and DetectLabels, so I checked the difference and rewrote the code accordingly.
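The difference can be sketched as follows: DetectFaces puts a BoundingBox directly under each FaceDetails entry, while DetectLabels nests it under Labels and then Instances, and a label may have no instances at all. The two helper functions and the trimmed sample responses are made up for illustration and keep only the keys relevant here.

```python
def boxes_from_faces(response):
    """DetectFaces: one BoundingBox directly under each FaceDetails entry."""
    return [face["BoundingBox"] for face in response["FaceDetails"]]


def boxes_from_labels(response):
    """DetectLabels: boxes are nested under Labels -> Instances."""
    return [
        instance["BoundingBox"]
        for label in response["Labels"]
        for instance in label.get("Instances", [])
    ]


# Trimmed, made-up responses.
faces = {"FaceDetails": [
    {"BoundingBox": {"Left": 0.1, "Top": 0.1, "Width": 0.2, "Height": 0.3}},
]}
labels = {"Labels": [
    {"Name": "Apple", "Instances": [
        {"BoundingBox": {"Left": 0.4, "Top": 0.5, "Width": 0.2, "Height": 0.2}},
    ]},
    {"Name": "Food", "Instances": []},  # a label without a location
]}

print(len(boxes_from_faces(faces)), len(boxes_from_labels(labels)))  # 1 1
```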

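The sample code does not save the image to a file after drawing the bounding boxes with Pillow, so my code first saves it as a file in the Lambda function's /tmp directory and then uploads it to the Amazon S3 bucket (the location specified by outputbucket and outputkey).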
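As an alternative to going through /tmp, the image could probably be encoded into an in-memory buffer and uploaded from there. The sketch below is not what the article's function does. It assumes `image` is a Pillow Image and `s3_client` is a boto3 S3 client (for example `s3_connection.meta.client`), and the function name is hypothetical.

```python
import io


def upload_image_from_memory(image, s3_client, bucket, key, image_format="PNG"):
    """Encode the image into a memory buffer and upload it without using /tmp.

    image     : a Pillow Image (anything with save(fp, format=...))
    s3_client : a boto3 S3 client (anything with upload_fileobj)
    Returns the number of bytes that were encoded.
    """
    buffer = io.BytesIO()
    image.save(buffer, format=image_format)
    size = buffer.tell()  # position after writing equals the encoded size
    buffer.seek(0)        # rewind so the upload starts from the first byte
    s3_client.upload_fileobj(buffer, bucket, key)
    return size
```

Inside the Lambda function this would replace the `image.save(tmpfile)` and `upload_file` lines, at the cost of holding the encoded image in memory.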

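Finally, as in the sample code, the Lambda function returns the number of detected labels.

AWS CloudFormation template

For the detailed parameters, see the template below. If you have an AWS account, creating a stack and looking at the resulting resources should help you understand it faster.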

AWSTemplateFormatVersion: 2010-09-09
Description: The CloudFormation template that creates an AWS Step Functions state machine with Amazon Rekognition DetectLabels API.

# ------------------------------------------------------------#
# Input Parameters
# ------------------------------------------------------------#
Parameters:
  SubName:
    Type: String
    Description: System sub name of EXAMPLE. (e.g. prod or test)
    Default: test
    MaxLength: 10
    MinLength: 1

  S3BucketNameSdk:
    Type: String
    Description: S3 bucket name in which you uploaded sdks for Lambda Layers.
    Default: example-bucket-name-sdks
    MaxLength: 50
    MinLength: 1

  S3KeyPillowSdk:
    Type: String
    Description: S3 key of pillow.zip. Fill the exact key name if you renamed. (e.g. sdk/Python3.9/pillow940.zip)
    Default: sdk/Python3.9/pillow940.zip
    MaxLength: 50
    MinLength: 1

Resources:
# ------------------------------------------------------------#
# State Machine
# ------------------------------------------------------------#
  StateMachineEXAMPLEdlb:
    Type: AWS::StepFunctions::StateMachine
    Properties:
      StateMachineName: !Sub EXAMPLE-dlb-${SubName}
      StateMachineType: STANDARD
      DefinitionSubstitutions:
        DSSubName: !Sub ${SubName}
        DSDrawBoundingboxesLambdaArn: !GetAtt LambdaDrawBoundingboxesDLB.Arn
      DefinitionString: |-
        {
          "Comment": "State machine to detect labels in an image for EXAMPLE-${DSSubName}",
          "StartAt": "DetectLabels",
          "States": {
            "DetectLabels": {
              "Type": "Task",
              "Parameters": {
                "Image": {
                  "S3Object": {
                    "Bucket.$": "$.inputbucket",
                    "Name.$": "$.inputkey"
                  }
                },
                "MinConfidence": 90
              },
              "Resource": "arn:aws:states:::aws-sdk:rekognition:detectLabels",
              "Comment": "Detect Labels",
              "ResultPath": "$.result1",
              "TimeoutSeconds": 300,
              "Next": "DrawBoundingBoxes"
            },
            "DrawBoundingBoxes": {
              "Type": "Task",
              "Resource": "arn:aws:states:::lambda:invoke",
              "Parameters": {
                "Payload.$": "$",
                "FunctionName": "${DSDrawBoundingboxesLambdaArn}"
              },
              "Retry": [
                {
                  "ErrorEquals": [
                    "Lambda.ServiceException",
                    "Lambda.AWSLambdaException",
                    "Lambda.SdkClientException",
                    "Lambda.TooManyRequestsException"
                  ],
                  "IntervalSeconds": 2,
                  "MaxAttempts": 6,
                  "BackoffRate": 2
                }
              ],
              "ResultPath": "$.result2",
              "End": true
            }
          },
          "TimeoutSeconds": 1200
        }
      LoggingConfiguration:
        Destinations:
          - CloudWatchLogsLogGroup:
              LogGroupArn: !GetAtt LogGroupStateMachineEXAMPLEdlb.Arn
        IncludeExecutionData: true
        Level: ERROR
      RoleArn: !GetAtt StateMachineExecutionRoleDLB.Arn
      TracingConfiguration:
        Enabled: false
      Tags:
        - Key: Cost
          Value: !Sub EXAMPLE-${SubName}
    DependsOn:
      - LogGroupStateMachineEXAMPLEdlb
      - StateMachineExecutionRoleDLB

# ------------------------------------------------------------#
# Lambda
# ------------------------------------------------------------#
  LambdaDrawBoundingboxesDLB:
    Type: AWS::Lambda::Function
    Properties:
      FunctionName: !Sub EXAMPLE-DrawBoundingboxesDLB-${SubName}
      Description: !Sub Lambda Function to draw bounding boxes in the image for EXAMPLE-${SubName}
      Runtime: python3.9
      Timeout: 600
      MemorySize: 128
      Role: !GetAtt LambdaS3InvocationRoleDLB.Arn
      Handler: index.lambda_handler
      Layers:
        - !Ref LambdaLayerPillow
      Tags:
        - Key: Cost
          Value: !Sub EXAMPLE-${SubName}
      Code:
        ZipFile: !Sub |
          import boto3
          import io
          from PIL import Image, ImageDraw
          def lambda_handler(event, context):
            try:
              print(event)
              # Load image from S3 bucket
              s3_connection = boto3.resource('s3')
              s3_object = s3_connection.Object(event['inputbucket'], event['inputkey'])
              s3_response = s3_object.get()
              stream = io.BytesIO(s3_response['Body'].read())
              image=Image.open(stream)
              imgWidth, imgHeight = image.size
              draw = ImageDraw.Draw(image)
              # Calculate and draw bounding boxes
              for label in event['result1']['Labels']:
                for instance in label['Instances']:
                  box = instance['BoundingBox']
                  left = imgWidth * box['Left']
                  top = imgHeight * box['Top']
                  width = imgWidth * box['Width']
                  height = imgHeight * box['Height']
                  points = (
                    (left, top),
                    (left + width, top),
                    (left + width, top + height),
                    (left, top + height),
                    (left, top)
                  )
                  draw.line(points, fill='#00d400', width=2)
              # Save image in S3 bucket
              tmpfile = '/tmp/result.png'
              image.save(tmpfile)
              s3_connection.meta.client.upload_file(tmpfile, event['outputbucket'], event['outputkey'])
              # Return result
              return {
                "labelCount": len(event['result1']['Labels'])
              }
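            except Exception as e:
              # Log the error, then re-raise so the invocation is reported as failed
              # (an exception object cannot be serialized as a JSON return value)
              print(e)
              raise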
    DependsOn:
      - LambdaS3InvocationRoleDLB
      - LambdaLayerPillow

# ------------------------------------------------------------#
# State Machine LogGroup (CloudWatch Logs)
# ------------------------------------------------------------#
  LogGroupStateMachineEXAMPLEdlb:
    Type: AWS::Logs::LogGroup
    Properties:
      LogGroupName: !Sub /aws/vendedlogs/states/EXAMPLE-dlb-${SubName}
      RetentionInDays: 365
      Tags:
        - Key: Cost
          Value: !Sub EXAMPLE-${SubName}

# ------------------------------------------------------------#
# State Machine Execution Role (IAM)
# ------------------------------------------------------------#
  StateMachineExecutionRoleDLB:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub EXAMPLE-StateMachineExecutionRoleDLB-${SubName}
      Description: This role allows State Machines to call specified AWS resources.
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
                states.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /service-role/
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaRole
        - arn:aws:iam::aws:policy/AmazonRekognitionFullAccess
        - arn:aws:iam::aws:policy/CloudWatchLogsFullAccess
        - arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess

# ------------------------------------------------------------#
# Lambda Layer
# ------------------------------------------------------------#
  LambdaLayerPillow:
    Type: AWS::Lambda::LayerVersion
    Properties:
      LayerName: !Sub EXAMPLE-${SubName}-pillow
      Description: !Sub Pillow 9.4.0 for Python to load in EXAMPLE-${SubName}
      CompatibleRuntimes:
        - python3.9
      Content:
        S3Bucket: !Sub ${S3BucketNameSdk}
        S3Key: !Sub ${S3KeyPillowSdk}
      LicenseInfo: HPND

# ------------------------------------------------------------#
# Lambda S3 Invocation Role (IAM)
# ------------------------------------------------------------#
  LambdaS3InvocationRoleDLB:
    Type: AWS::IAM::Role
    Properties:
      RoleName: !Sub EXAMPLE-LambdaS3InvocationRoleDLB-${SubName}
      Description: This role allows Lambda functions to invoke S3 for DLB.
      AssumeRolePolicyDocument:
        Version: 2012-10-17
        Statement:
          - Effect: Allow
            Principal:
              Service:
              - lambda.amazonaws.com
            Action:
              - sts:AssumeRole
      Path: /
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole
        - arn:aws:iam::aws:policy/AWSXRayDaemonWriteAccess
        - arn:aws:iam::aws:policy/AmazonS3FullAccess

Note: Rewrite the IAM role permissions as appropriate so that they are narrowed down to only what is required.
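To create a stack from the template above with Python, something like the following sketch should work, assuming boto3 and credentials with sufficient permissions. The function names, stack name, and file path are placeholders. Because the template sets RoleName on its IAM roles, the request has to acknowledge the CAPABILITY_NAMED_IAM capability.

```python
def build_create_stack_request(stack_name, template_body, sdk_bucket, sub_name="test"):
    """Build keyword arguments for CloudFormation CreateStack."""
    return {
        "StackName": stack_name,
        "TemplateBody": template_body,
        "Parameters": [
            {"ParameterKey": "SubName", "ParameterValue": sub_name},
            {"ParameterKey": "S3BucketNameSdk", "ParameterValue": sdk_bucket},
        ],
        # Required because the template creates IAM roles with custom names
        "Capabilities": ["CAPABILITY_NAMED_IAM"],
    }


def create_stack(stack_name, template_path, sdk_bucket, sub_name="test"):
    """Read the template file, create the stack, and return the stack ID."""
    import boto3  # imported lazily so the request builder has no dependency

    with open(template_path, encoding="utf-8") as f:
        template_body = f.read()
    cfn = boto3.client("cloudformation")
    response = cfn.create_stack(
        **build_create_stack_request(stack_name, template_body, sdk_bucket, sub_name)
    )
    return response["StackId"]


# Example (placeholder names):
# create_stack("example-dlb-test", "template.yml", "my-sdk-bucket")
```

S3KeyPillowSdk is left at its template default here. Pass it as an additional parameter if the zip file was uploaded under a different key.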