Asked by: 小点点

MemoryError when reading a zipped shapefile from Amazon S3 with the read() method


I'm new to AWS and shapefiles, so please bear with me.

I'm currently trying to read a shapefile from Amazon S3 into an RDS Postgres database. I'm stuck on reading the file, because it raises a memory error. Here is what I have tried so far:

from io import StringIO, BytesIO
from zipfile import ZipFile
from urllib.request import urlopen

import shapefile
import geopandas as gpd
from shapely.geometry import shape

import pandas as pd
import requests
import boto3


def lambda_handler(event, context):
    """Access the S3 bucket using the boto3 client."""
    s3_client = boto3.client('s3')
    s3_bucket_name = 'myshapeawsbucket'
    s3 = boto3.resource('s3', aws_access_key_id="my_id", aws_secret_access_key="my_key")

    # List every object in the bucket.
    my_bucket = s3.Bucket(s3_bucket_name)
    bucket_list = []
    for file in my_bucket.objects.filter():
        print(file.key)
        bucket_list.append(file.key)

    # Read each object into memory -- this is where the MemoryError is raised.
    for file in bucket_list:
        obj = s3.Object(s3_bucket_name, file)
        data = obj.get()['Body'].read()

    return {
        'message': "Success!"
    }

As soon as the code tries to execute obj.get()['Body'].read(), I receive the following error:

Response
{
  "errorMessage": "",
  "errorType": "MemoryError",
  "stackTrace": [
    "  File \"/var/task/lambda_function.py\", line 27, in lambda_handler\n    data=obj.get()['Body'].read()\n",
    "  File \"/var/runtime/botocore/response.py\", line 82, in read\n    chunk = self._raw_stream.read(amt)\n",
    "  File \"/opt/python/urllib3/response.py\", line 518, in read\n    data = self._fp.read() if not fp_closed else b\"\"\n",
    "  File \"/var/lang/lib/python3.8/http/client.py\", line 472, in read\n    s = self._safe_read(self.length)\n",
    "  File \"/var/lang/lib/python3.8/http/client.py\", line 613, in _safe_read\n    data = self.fp.read(amt)\n"
  ]
}

Function Logs
START RequestId: 7b70c331-ad7a-4964-b841-91da345b5174 Version: $LATEST
Roads_shp.zip
[ERROR] MemoryError
Traceback (most recent call last):
  File "/var/task/lambda_function.py", line 27, in lambda_handler
    data=obj.get()['Body'].read()
  File "/var/runtime/botocore/response.py", line 82, in read
    chunk = self._raw_stream.read(amt)
  File "/opt/python/urllib3/response.py", line 518, in read
    data = self._fp.read() if not fp_closed else b""
  File "/var/lang/lib/python3.8/http/client.py", line 472, in read
    s = self._safe_read(self.length)
  File "/var/lang/lib/python3.8/http/client.py", line 613, in _safe_read
    data = self.fp.read(amt)
END RequestId: 7b70c331-ad7a-4964-b841-91da345b5174
REPORT RequestId: 7b70c331-ad7a-4964-b841-91da345b5174  Duration: 3980.11 ms    Billed Duration: 3981 ms    Memory Size: 128 MB Max Memory Used: 128 MB Init Duration: 2334.01 ms

Request ID
7b70c331-ad7a-4964-b841-91da345b5174

I was following this tutorial: Reading a shapefile from a URL into GeoPandas.
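For context, the tutorial's approach boils down to something like the sketch below: download the zip into memory and hand the bytes to GeoPandas. This is only a sketch of what I was adapting; it assumes a recent GeoPandas/Fiona that can open zipped shapefile bytes directly, and the bucket and key names come from my setup above.

from io import BytesIO

import boto3
import geopandas as gpd

s3 = boto3.resource('s3')
obj = s3.Object('myshapeawsbucket', 'Roads_shp.zip')  # key taken from the function logs

# Load the zipped shapefile into memory and let GeoPandas open it.
# Note this still calls read() on the whole object, so it runs into
# the same memory ceiling my handler does.
zip_bytes = BytesIO(obj.get()['Body'].read())
gdf = gpd.read_file(zip_bytes)
print(gdf.head())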

I have looked into these questions but could not find an answer specific to shapefiles.

Links I have looked at:

  1. MemoryError when reading a large JSON file from Amazon S3 with the read() method
  2. Importing a large zipped JSON file from Amazon S3 into AWS RDS-PostgreSQL with Python
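As far as I can tell, the pattern those questions recommend is to stream the body in chunks (for example, down to Lambda's /tmp) instead of calling read() once. A minimal sketch of that pattern, with an arbitrary 1 MB chunk size:

import boto3

s3 = boto3.resource('s3')
obj = s3.Object('myshapeawsbucket', 'Roads_shp.zip')
body = obj.get()['Body']  # botocore StreamingBody

# Copy the object to /tmp in 1 MB chunks so the whole file is never
# held in memory at once.
with open('/tmp/Roads_shp.zip', 'wb') as f:
    for chunk in iter(lambda: body.read(1024 * 1024), b''):
        f.write(chunk)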

1 Answer

Anonymous user

I solved this myself. The problem had nothing to do with the logic; it was the RAM limit AWS places on Lambda functions: https://aws.amazon.com/lambda/pricing/?icmpid=docs_console_unmapped. Increase the memory from the default 128 MB to whatever you need.
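The REPORT line in the logs (Max Memory Used: 128 MB against a 128 MB Memory Size) is consistent with this. You can raise the limit in the Lambda console or programmatically; a minimal boto3 sketch, where the function name my-shapefile-loader is a placeholder:

import boto3

lambda_client = boto3.client('lambda')

# Raise the function's memory limit from the 128 MB default to 512 MB.
lambda_client.update_function_configuration(
    FunctionName='my-shapefile-loader',  # placeholder name
    MemorySize=512,
)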