AWS Lambda

Provision an AWS Lambda instance via the AWS Management Console and deploy the main end-to-end POB data pipeline Spring Boot application to it.

Last Updated: 26 May 2022 • Page Author: Jillur Quddus

Overview

AWS Lambda is the native serverless, event-driven compute service offered by the AWS cloud computing platform, enabling applications and backend services to be run without provisioning or managing any servers. This page provides instructions on how to provision an AWS Lambda instance and then deploy the main end-to-end POB data pipeline Spring Boot application to it. The main POB data pipeline instantiates and executes all the enabled procurement framework parser classes and publisher classes that have been registered in application.yml.

For further information regarding AWS Lambda, please visit https://aws.amazon.com/lambda/.

It is recommended that you integrate the steps described on this page into a CI/CD pipeline to automate build, testing and deployment.

Main Data Pipeline

POB provides an out-of-the-box AWS Lambda Spring Boot application that wraps around the main POB data pipeline, enabling it to be deployed to an AWS Lambda instance and triggered via an Amazon EventBridge (CloudWatch Events) CRON schedule. The AWS Lambda Spring Boot application for the main POB data pipeline may be found in the $POB_BASE/pob-apps/pob-apps-aws/pob-aws-lambda-app-data-pipelines-scheduler Maven module.

Setup

Build from Source

In order to compile and build the main POB data pipeline AWS Lambda Spring Boot application in preparation for deployment to an AWS Lambda instance, please follow the instructions detailed in Build from Source.

AWS CLI

We shall use the AWS Command Line Interface (CLI) to deploy the main POB data pipeline AWS Lambda Spring Boot JAR file created in the Build from Source stage above to an AWS Lambda instance. To install the AWS CLI, please follow the instructions below:

The instructions below are for Ubuntu 20.04. Installation instructions for other Linux distributions and other operating systems such as Windows may be found at https://aws.amazon.com/cli.

# Install the required dependencies (glibc ships with Ubuntu as the preinstalled libc6 package)
$ sudo apt-get update
$ sudo apt-get install -y curl unzip groff less

# Install the AWS CLI from a ZIP file
$ curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
$ unzip awscliv2.zip
$ sudo ./aws/install

Assuming that the AWS CLI has installed successfully, we can configure it with the Access Key ID and Secret Access Key of an IAM user with privileges to programmatically manage AWS Lambda instances (such as an IAM user provisioned with the AWSLambda_FullAccess AWS managed policy, or similar). The same IAM user also needs write access to the Amazon S3 bucket used to upload the JAR file in the Deployment section below. Configure the AWS CLI as follows:

# Configure the AWS CLI
$ aws configure --profile pob

	AWS Access Key ID [None]: AKIA123456789
	AWS Secret Access Key [None]: abcdefg987654321hijklmnop
	Default region name [None]: eu-west-2
	Default output format [None]: json
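
Before continuing, it may be worth confirming that the AWS CLI is installed and that the pob profile resolves to the intended IAM user. The following is a minimal sketch rather than part of the POB tooling: the function name verify_aws_profile is hypothetical, and it assumes the profile configured above.

```shell
# Profile name as configured with "aws configure --profile pob" above.
PROFILE="pob"

# Hypothetical helper: print the installed CLI version, then the account
# and IAM identity that the given profile authenticates as.
verify_aws_profile() {
  aws --version &&
  aws sts get-caller-identity --profile "$1"
}
```

Running verify_aws_profile "$PROFILE" should print the CLI version followed by a JSON document describing the account and IAM user; an authentication error at this point usually indicates a mistyped Access Key ID or Secret Access Key.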

AWS Lambda

We shall use the AWS Management Console to provision an AWS Lambda instance. To do so, navigate to the AWS Lambda service via the AWS Management Console, select "Create function" and follow the instructions below:

  1. Function Name - enter a custom function name that describes the purpose of the function, for example pob-main-data-pipeline.

  2. Runtime - the main POB data pipeline is a Java Spring Boot application. Thus please select "Java 11 (Corretto)" as the runtime environment.

Once configured, select "Create function" to create the new AWS Lambda instance.

EventBridge Trigger

When deployed to an AWS Lambda instance, the main POB data pipeline is executed periodically by an Amazon EventBridge (CloudWatch Events) trigger configured with a CRON or rate schedule. To add this trigger, open the AWS Lambda instance via the AWS Management Console and select "Add Trigger". From the list of trigger types, select "EventBridge (CloudWatch Events)" and check "Create a new rule". Next, enter a rule name and rule description, and select "Schedule expression" as the rule type. In the schedule expression text box, enter a CRON or rate expression. For example, to run the main POB data pipeline AWS Lambda instance every 30 minutes using a rate expression, enter rate(30 minutes). Finally, press "Add" to add the new trigger.

For further information regarding Amazon CloudWatch Events schedule expressions, please refer to the Amazon CloudWatch Scheduled Events documentation.
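
For automated environments, the same trigger could be created with the AWS CLI instead of the console. The sketch below is one possible approach, not the documented POB procedure: the rule name and the function name schedule_pob_pipeline are hypothetical, it assumes the function is called pob-main-data-pipeline, and it assumes the pob profile is permitted to manage EventBridge rules. Unlike the console, the CLI does not grant EventBridge permission to invoke the function automatically, so that step is explicit here.

```shell
FUNCTION_NAME="pob-main-data-pipeline"
RULE_NAME="pob-main-data-pipeline-schedule"   # hypothetical rule name
PROFILE="pob"

# Hypothetical helper: create a scheduled rule and attach the function to it.
# $1 is a schedule expression such as "rate(30 minutes)".
schedule_pob_pipeline() {
  # Look up the ARN of the existing function.
  function_arn="$(aws lambda get-function --function-name "$FUNCTION_NAME" \
    --query 'Configuration.FunctionArn' --output text --profile "$PROFILE")" || return 1

  # Create (or update) the scheduled rule and capture its ARN.
  rule_arn="$(aws events put-rule --name "$RULE_NAME" \
    --schedule-expression "$1" \
    --query 'RuleArn' --output text --profile "$PROFILE")" || return 1

  # Allow EventBridge to invoke the function on behalf of this rule.
  aws lambda add-permission --function-name "$FUNCTION_NAME" \
    --statement-id "${RULE_NAME}-invoke" \
    --action lambda:InvokeFunction \
    --principal events.amazonaws.com \
    --source-arn "$rule_arn" --profile "$PROFILE" || return 1

  # Point the rule at the function.
  aws events put-targets --rule "$RULE_NAME" \
    --targets "Id=1,Arn=${function_arn}" --profile "$PROFILE"
}
```

For example, schedule_pob_pipeline "rate(30 minutes)" mirrors the console example above, while schedule_pob_pipeline "cron(0/30 * * * ? *)" would instead fire at minutes 0 and 30 of every hour. Re-running the helper fails at the add-permission step if the statement ID already exists.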

Deployment

Please follow the instructions below to deploy the main POB data pipeline Spring Boot application to an AWS Lambda instance:

1) Create a new (empty) AWS Lambda instance configured with the Java 11 (Corretto) runtime via the AWS Management Console as detailed in the Setup section above. We shall call this AWS Lambda instance pob-main-data-pipeline for the purposes of these instructions. Once created, open this new AWS Lambda instance via the AWS Management Console, navigate to Configuration > General configuration, set its memory to 1024 MB and set its timeout to 5 min 0 sec.

2) Since we are deploying a Java Spring Boot application that utilises the Spring Cloud Function project, we need to configure the AWS Lambda instance with details of the main Java class to invoke as well as the name of the Java function that will be executed. To do this, open the AWS Lambda instance via the AWS Management Console, navigate to Configuration > Environment variables and set the following environment variables:

Environment Variable Name          Value
MAIN_CLASS                         ai.hyperlearning.pob.apps.aws.lambda.data.pipelines.scheduler.MainPipelineSchedulerAwsLambdaApp
spring_cloud_function_definition   mainPipelineFunction

3) Next we need to configure the AWS Lambda instance with the fully qualified class name and method of the function handler. To do this, open the AWS Lambda instance via the AWS Management Console, navigate to Code and select the "Edit" button belonging to the "Runtime settings" section. In the "Handler" box enter org.springframework.cloud.function.adapter.aws.FunctionInvoker::handleRequest, and then press "Save".

4) Assuming that you are integrating POB with AWS Secrets Manager as detailed here, we need to grant the AWS Lambda instance permission to read secrets managed by AWS Secrets Manager. To do this, open the AWS Lambda instance via the AWS Management Console, navigate to Configuration > Permissions and select the execution role name (for example pob-main-data-pipeline-role-abc123). This will take you to the IAM Management Console for this role. Select Add permissions > Attach policies and attach the SecretsManagerReadWrite AWS managed policy to this role (or, preferably, a custom policy that grants only the read access required, such as the secretsmanager:GetSecretValue action). Now when the AWS Lambda instance is invoked, externalised sensitive properties defined in the POB application configuration will be automatically loaded from AWS Secrets Manager.

5) We are now ready to deploy the packaged Java Spring Boot application artifact to the AWS Lambda instance. Assuming that you have built the application and configured the AWS CLI as detailed in the Setup section above, execute the following commands via your command line:

# Navigate to the relevant project folder
$ cd $POB_BASE/pob-apps/pob-apps-aws/pob-aws-lambda-app-data-pipelines-scheduler/target

# Upload the packaged JAR file to an Amazon S3 bucket
$ aws s3 cp pob-aws-lambda-app-data-pipelines-scheduler-2.0.0-aws.jar s3://pob-apps --profile pob

# Deploy the function code from Amazon S3 to the relevant AWS Lambda instance
$ aws lambda update-function-code --function-name pob-main-data-pipeline --s3-bucket pob-apps --s3-key pob-aws-lambda-app-data-pipelines-scheduler-2.0.0-aws.jar --profile pob
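
Once the code has been uploaded, it can be useful to confirm that the update has been applied and to trigger a single run without waiting for the schedule. The following is a sketch under stated assumptions: the helper name verify_pob_deployment is hypothetical, the function is assumed to accept an invocation with no payload, the log group is assumed to use the default /aws/lambda/<function name> naming, and the pob profile is assumed to be allowed to read CloudWatch Logs.

```shell
FUNCTION_NAME="pob-main-data-pipeline"
PROFILE="pob"

# Hypothetical helper: wait for the update, run the pipeline once, show logs.
verify_pob_deployment() {
  # Block until the code update has finished being applied.
  aws lambda wait function-updated \
    --function-name "$FUNCTION_NAME" --profile "$PROFILE" || return 1

  # Trigger one asynchronous run. This executes the real pipeline.
  aws lambda invoke --function-name "$FUNCTION_NAME" \
    --invocation-type Event --profile "$PROFILE" /dev/null || return 1

  # Show the last 10 minutes of the function's CloudWatch logs.
  aws logs tail "/aws/lambda/${FUNCTION_NAME}" --since 10m --profile "$PROFILE"
}
```

Because the invocation is asynchronous, log output may take a short while to appear; re-running the aws logs tail command a minute later usually shows the run.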

The main POB data pipeline will now be periodically triggered based on the schedule expression defined in the EventBridge (CloudWatch Events) trigger.
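
If the deployment is automated in a CI/CD pipeline, the console configuration in steps 1 to 4 above could also be applied from the command line. The sketch below is illustrative only: the helper names are hypothetical, the role name is the example value from step 4, and the IAM call assumes a profile with permission to modify IAM roles, which the AWSLambda_FullAccess policy alone may not provide.

```shell
FUNCTION_NAME="pob-main-data-pipeline"
ROLE_NAME="pob-main-data-pipeline-role-abc123"   # example role name from step 4
PROFILE="pob"

# Steps 1 to 3: memory, timeout (300 seconds = 5 min 0 sec), handler and
# environment variables applied in a single configuration update.
configure_pob_lambda() {
  aws lambda update-function-configuration \
    --function-name "$FUNCTION_NAME" \
    --memory-size 1024 \
    --timeout 300 \
    --handler "org.springframework.cloud.function.adapter.aws.FunctionInvoker::handleRequest" \
    --environment "Variables={MAIN_CLASS=ai.hyperlearning.pob.apps.aws.lambda.data.pipelines.scheduler.MainPipelineSchedulerAwsLambdaApp,spring_cloud_function_definition=mainPipelineFunction}" \
    --profile "$PROFILE"
}

# Step 4: attach the AWS managed policy to the function's execution role.
grant_pob_secrets_access() {
  aws iam attach-role-policy \
    --role-name "$ROLE_NAME" \
    --policy-arn "arn:aws:iam::aws:policy/SecretsManagerReadWrite" \
    --profile "$PROFILE"
}
```

Note that the --environment option replaces the function's entire set of environment variables, so any variables set previously would need to be included in the same call.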
