Why do my boto3 / Python AWS Glue Jobs Time Out?
Luna Ricci
Last Update 2 years ago
A connection timeout occurs when running an AWS Glue job:
ConnectTimeoutError: Connect timeout on endpoint URL: https://glue.region.amazonaws.com
This error may occur when a Python ETL job written with the boto3 library creates a Glue client and uses it to call the Glue API, e.g.
import boto3
session = boto3.Session(region_name='eu-west-2')
glue = session.client('glue')
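To tell a network-level failure apart from a boto3-specific one, it can help to test whether the regional Glue endpoint is reachable at all from where the script runs. The following is a minimal sketch using only the standard library. The helper names `glue_endpoint_host` and `can_connect` are hypothetical, and the hostname pattern is simply the one shown in the error message above.

```python
import socket


def glue_endpoint_host(region_name):
    """Build the regional Glue hostname, following the pattern in the error message."""
    return f"glue.{region_name}.amazonaws.com"


def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False
```

Calling `can_connect(glue_endpoint_host('eu-west-2'))` near the top of the ETL script and printing the result should show up in the job log. A `False` result suggests the problem lies with the network path (subnet, routing, security groups) rather than with the script itself.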
Solution
First, examine the job's output log and error log in CloudWatch Logs to rule out any other reason why the ETL job may have failed. Then work through the checks below.
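The logs can also be pulled with boto3 rather than through the console. The sketch below assumes the job writes to the `/aws-glue/jobs/error` log group and that the log stream name starts with the job run ID; both are assumptions about your setup, so adjust them if your job is configured differently. The function names are hypothetical. Run it from a machine that can reach AWS, not from the failing job.

```python
def find_timeout_messages(events, needle="Connect timeout on endpoint URL"):
    """Return the messages from CloudWatch log events that mention the timeout."""
    return [e["message"] for e in events if needle in e.get("message", "")]


def fetch_error_events(job_run_id, region_name="eu-west-2",
                       log_group="/aws-glue/jobs/error", limit=100):
    """Fetch events for one job run from the (assumed) Glue error log group."""
    import boto3  # imported here so the helper above works without boto3

    logs = boto3.Session(region_name=region_name).client("logs")
    response = logs.filter_log_events(
        logGroupName=log_group,
        logStreamNamePrefix=job_run_id,  # assumption: stream is named after the run ID
        limit=limit,
    )
    return response["events"]
```

If `find_timeout_messages(fetch_error_events(run_id))` comes back empty, the job probably failed for a different reason and the remaining checks may not apply.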
1. Check that the IAM role configured on the job has the appropriate permissions to access AWS Glue services.
2. Check that the subnet ID of the ETL job corresponds to a public subnet. Communication with the Glue APIs often happens over the public internet (especially when using a Glue Development Endpoint), so the job needs a route to the internet if it calls public services or APIs. If the job cannot run in a public subnet, route its outbound traffic through a NAT gateway or NAT instance instead. If it must remain in a private subnet with no internet access, ensure that every service it calls is reachable from within that subnet of the VPC, for example through VPC endpoints.
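The subnet check in step 2 can be sketched in code by looking at the default route (`0.0.0.0/0`) in the subnet's route table. This is a rough illustration, not a complete network audit: the function names are hypothetical, and the classification only looks at routes, not at security groups or network ACLs.

```python
def _default_routes(route_tables):
    """Yield every 0.0.0.0/0 route from a list of EC2 route table descriptions."""
    for table in route_tables:
        for route in table.get("Routes", []):
            if route.get("DestinationCidrBlock") == "0.0.0.0/0":
                yield route


def classify_subnet(route_tables):
    """Classify a subnet as 'public', 'nat' or 'isolated' from its route tables."""
    kinds = set()
    for route in _default_routes(route_tables):
        if route.get("GatewayId", "").startswith("igw-"):
            kinds.add("public")  # default route goes to an internet gateway
        elif "NatGatewayId" in route or "InstanceId" in route:
            kinds.add("nat")     # default route goes to a NAT gateway or instance
    if "public" in kinds:
        return "public"
    return "nat" if "nat" in kinds else "isolated"


def fetch_route_tables(subnet_id, region_name="eu-west-2"):
    """Look up the route tables explicitly associated with a subnet."""
    import boto3  # imported here so classify_subnet works without boto3

    ec2 = boto3.Session(region_name=region_name).client("ec2")
    response = ec2.describe_route_tables(
        Filters=[{"Name": "association.subnet-id", "Values": [subnet_id]}]
    )
    return response["RouteTables"]
```

A result of `'isolated'` from `classify_subnet(fetch_route_tables(subnet_id))` means the subnet has no default route to the internet, which would be consistent with the timeout. Note that a subnet with no explicit association uses the VPC's main route table, in which case this filter returns an empty list and the main route table needs to be inspected instead.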
This solution applies to Python boto3 ETL scripts that run inside AWS Glue.