What is a Glue Job Bookmark?
An AWS Glue feature that tracks which data has already been processed so subsequent job runs only process new or changed data, enabling efficient incremental loads.
Detailed Explanation
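Job bookmarks persist state from the last successful run. For Amazon S3 sources, Glue checks each object's last-modified timestamp; for JDBC sources, it tracks one or more bookmark key columns (by default the primary key, provided its values increase or decrease sequentially). On the next run, Glue reads only S3 objects modified after the bookmark position, or JDBC rows beyond the last recorded key value. Bookmarks must be enabled on the job through the --job-bookmark-option job argument, and the job script must call job.init() at the start and job.commit() at the end for the state to be saved. Bookmarks can be reset via the AWS CLI or console when you need to reprocess all data.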
Code Example
Example (bash)
aws glue reset-job-bookmark --job-name my-etl-job
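Alongside the reset command, it can help to see how a run is started with bookmarks enabled and how the stored bookmark state is inspected. The following is a minimal sketch, not a definitive workflow: the job name my-etl-job is the placeholder from the example above, and the DRY_RUN switch and the run, start_incremental_run, and show_bookmark helpers are hypothetical names introduced here for illustration. By default the script only prints the commands; set DRY_RUN=0 to call AWS with valid credentials.

```shell
#!/usr/bin/env bash
# Sketch: start a Glue job run with bookmarks enabled, then inspect the bookmark.
set -euo pipefail

# Placeholder job name (assumption); override via the environment.
JOB_NAME="${JOB_NAME:-my-etl-job}"
# Hypothetical safety switch: 1 prints the commands, 0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode (shell quoting is not preserved in the
# printed form); otherwise execute it unchanged.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Start a run with bookmarks enabled.
# Other accepted values: job-bookmark-disable, job-bookmark-pause.
start_incremental_run() {
  run aws glue start-job-run --job-name "$JOB_NAME" \
    --arguments '{"--job-bookmark-option":"job-bookmark-enable"}'
}

# Show the bookmark state Glue currently stores for the job.
show_bookmark() {
  run aws glue get-job-bookmark --job-name "$JOB_NAME"
}

start_incremental_run
show_bookmark
```

Passing job-bookmark-pause instead processes data incrementally without advancing the stored bookmark, which can be useful when testing a job before committing to a new position.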
AWS Glue, incremental, ETL, S3