cloud

What is a Glue Job Bookmark?

An AWS Glue feature that tracks which data has already been processed so subsequent job runs only process new or changed data, enabling efficient incremental loads.

Detailed Explanation

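Job bookmarks work by persisting state about the data processed in the last successful run. For Amazon S3 sources, Glue tracks the last-modified timestamps of objects; for JDBC sources, it tracks the values of one or more bookmark key columns (by default the primary key, if it is sequentially increasing or decreasing). On the next run, Glue reads only S3 objects modified after the bookmark position, or JDBC rows with new bookmark key values. Bookmarks can be reset via the AWS CLI or console when you need to reprocess all data.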

Code Example

```bash
aws glue reset-job-bookmark --job-name my-etl-job
```
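
Resetting is only half of the workflow: a bookmark has to be enabled for a run before there is any state to reset. The sketch below wraps two related AWS CLI calls in small helper functions. It assumes the same placeholder job name `my-etl-job` used above; the function names `start_incremental_run` and `show_bookmark` are hypothetical and exist only for illustration. The `--job-bookmark-option` job argument accepts `job-bookmark-enable`, `job-bookmark-disable`, and `job-bookmark-pause`.

```bash
# Placeholder job name, matching the reset example above.
JOB_NAME="my-etl-job"

# Start a run with bookmarks enabled, so only data not yet
# processed by a previous successful run is picked up.
start_incremental_run() {
  aws glue start-job-run \
    --job-name "$1" \
    --arguments '{"--job-bookmark-option":"job-bookmark-enable"}'
}

# Print the bookmark state currently stored for the job.
show_bookmark() {
  aws glue get-job-bookmark --job-name "$1"
}
```

Calling `start_incremental_run "$JOB_NAME"` followed by `show_bookmark "$JOB_NAME"` should start a bookmarked run and then display its stored state, assuming valid AWS credentials and an existing job. Note that the job script itself also has to cooperate (in a typical PySpark Glue script, by calling `job.init` and `job.commit` and passing a `transformation_ctx` to its sources), otherwise the bookmark will not advance.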
AWS Glue, incremental, ETL, S3