ML code vs AWS Lambda limits

I was recently involved in a project where we had to solve the problem of incorporating ML code, so we decided to try the more modern approach of a 'serverless architecture'.

With a serverless approach, all you need to do is write a function and then specify whether it will be triggered by an event or an API request.

Everything else needed for a scalable, highly available infrastructure is the service provider's problem to solve, and AWS has a specific service for exactly this, called 'Lambda'.
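For example, a minimal Python handler might look like the sketch below; the function name and the trigger are whatever you configure on the AWS side:

```
# A minimal AWS Lambda handler: the service calls this function
# once for every triggering event or API request.
def handler(event, context):
    return {"statusCode": 200, "body": "Hello from Lambda"}
```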

On the one hand, AWS Lambda makes it easy to simply write function code; on the other hand, it also imposes limits on resource usage.

It's difficult to imagine production code without any third-party dependencies, and ML dependencies are usually very heavy, so AWS Lambda's 250MB package size limit (including layers) may not be enough to accommodate everything the ML code requires.

We hit exactly this issue and were stuck with the deployment error: "An error occurred: Function code combined with layers exceeds the maximum allowed size of 262144000 bytes."

We couldn't abandon these dependencies in favor of lightweight alternatives, since the lightweight alternatives aren't nearly as powerful.

To get around the package size limit, we decided on a workaround that exploits how AWS Lambda tries to reuse the same container across function invocations: a launched container, with its objects already loaded into memory, is kept alive for several minutes before a new container is started.

AWS Lambda provides about 500MB of storage in the /tmp directory, so we decided to install the heavy dependencies at runtime whenever they were absent. Installing them from scratch via pip, however, can take a long time, since every installation requires compiling native modules.

Instead, we created a zip archive of preinstalled and precompiled dependencies, which the Lambda function downloads and unzips whenever necessary.

Now, let’s see how this can be done using the NLP library ‘spacy’.

First of all, you need to build the archive in the same environment that AWS Lambda uses, because native modules must be compiled on the same OS where they will be executed.

Docker images replicating the AWS Lambda environment can be found at 'lambci/lambda'.

```
# launch aws lambda docker
$ docker run --rm -v /tmp/spacy:/spacy -w /spacy -it lambci/lambda:build-python3.7 /bin/bash

# use virtual environment
bash-4.2# virtualenv .venv
bash-4.2# . .venv/bin/activate

# install spacy & english model
(.venv) bash-4.2# pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.0.0/en_core_web_sm-2.0.0.tar.gz

# remove python-compiled modules
(.venv) bash-4.2# find . -name "*.pyc" -exec rm -f {} \;
(.venv) bash-4.2# find . -name __pycache__ -exec rm -rf {} \;

# get total size of spacy dependencies
(.venv) bash-4.2# du -h -d0 .venv/lib/python3.7/site-packages/
348M    .venv/lib/python3.7/site-packages/

# pack dependencies to zip archive
(.venv) bash-4.2# cd .venv/lib/python3.7/site-packages/
(.venv) bash-4.2# zip -9r /spacy/deps.zip .

# get total size of dependencies archive
(.venv) bash-4.2# ls -sh /spacy/deps.zip
97M /spacy/deps.zip
```

Now the archive should be uploaded to S3, from where it can be downloaded into Lambda.
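For a quick manual upload, the AWS CLI will do (the bucket name here is hypothetical):

```
$ aws s3 cp /tmp/spacy/deps.zip s3://my-ml-deps/spacy/deps.zip
```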

I would recommend automating this part with a ‘terraform’ or ‘serverless’ toolkit.

Now let's see what the usage code may look like, along with the spaCy activation.
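Here is a minimal sketch of the idea; the bucket name 'my-ml-deps', the key 'spacy/deps.zip', and the '/tmp/deps' path are hypothetical, and boto3 is used because it ships with the Lambda Python runtime:

```
import os
import sys
import zipfile

import boto3  # available in the Lambda Python runtime by default

# Hypothetical locations -- adjust to your own setup.
DEPS_BUCKET = "my-ml-deps"   # S3 bucket holding the packed archive
DEPS_KEY = "spacy/deps.zip"  # key of the deps.zip built in the container
DEPS_DIR = "/tmp/deps"       # where the unpacked site-packages will live


def load_dependencies():
    """Download and unpack the precompiled dependencies into /tmp.

    The expensive part runs once per container: a warm container still
    has /tmp populated, so the S3 round-trip and the unzip are skipped.
    """
    sys.path.insert(0, DEPS_DIR)  # always extend sys.path on import
    if os.path.isdir(DEPS_DIR):
        return  # warm start: dependencies already unpacked
    archive = "/tmp/deps.zip"
    boto3.client("s3").download_file(DEPS_BUCKET, DEPS_KEY, archive)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(DEPS_DIR)
    os.remove(archive)  # free /tmp: the zip and its contents barely fit together
```

And the spaCy activation, sketched under the same assumptions:

```
load_dependencies()
import spacy  # resolved from /tmp/deps thanks to the sys.path entry

nlp = spacy.load("en_core_web_sm")  # loaded once, reused on warm invocations


def handler(event, context):
    # Hypothetical payload shape: {"text": "..."}
    doc = nlp(event.get("text", ""))
    return {"tokens": [token.text for token in doc]}
```

Note the os.remove() call after unpacking: since /tmp has to hold the archive and the unpacked files at the same time during extraction, freeing the zip immediately leaves room for heavier dependency sets.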

It should also be noted that Lambda performance depends on the RAM setting: 128MB by default, but it can be increased up to 3008MB. The CPU allocated to a function is proportional to the memory configured.
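For reference, the memory size can be changed with the AWS CLI (the function name here is hypothetical):

```
$ aws lambda update-function-configuration --function-name spacy-fn --memory-size 2048
```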

After several experiments with RAM values, we found that 2GB does the trick; increasing the value further barely affects the dependency installation time.

Currently the installation time is less than 5 seconds.

Let's take a look at the Lambda duration chart: the peaks indicate a container switch and the cold start of a new one.

If invocations occur frequently, Lambda reuses the same container for as long as possible.

Only after being idle for approximately 5 minutes does it start a new container, so this solution seems reasonable for cases where Lambda function calls occur frequently.

This method still has size limitations for extra-heavy libraries like 'tensorflow', which weighs in at nearly 500MB, because the 500MB of /tmp has to hold both the zipped archive and the unzipped files.

Many kudos for text review & comments to David Lorbiecke.
