DataFlow Hub
AWS Glue Interview Questions
15 curated questions with detailed answers, covering beginner to advanced concepts.
4 Beginner, 8 Intermediate, 3 Advanced
Q1.
What are the cost optimization strategies for AWS Glue?
HOT
advanced
Q2.
What is the AWS Glue Data Catalog and why is it important?
HOT
beginner
Q3.
What is a Glue Job Bookmark and when would you use it?
HOT
intermediate
Q4.
What is the difference between a DynamicFrame and a Spark DataFrame in Glue? When would you use each?
HOT
intermediate
Q5.
How do you optimize an AWS Glue job that is running slowly?
HOT
intermediate
Q6.
Design a CDC pipeline using AWS Glue that replicates changes from RDS PostgreSQL to S3 in near real-time.
HOT
advanced
Q7.
How do you handle small file problems in AWS Glue and S3?
HOT
advanced
Q8.
What is AWS Glue and how is it different from traditional ETL tools like Informatica or SSIS?
HOT
beginner
Q9.
How do you connect AWS Glue to an RDS or on-premise database?
intermediate
Q10.
What is a Glue Crawler and what does it do?
beginner
Q11.
What is a DPU in AWS Glue and how do you choose the right number?
beginner
Q12.
What are Glue Workflows and how are they different from Glue Triggers?
intermediate
Q13.
What is Glue Studio and when would you use it over writing PySpark scripts manually?
intermediate
Q14.
How does AWS Glue handle schema evolution? What happens when a new column is added to the source?
intermediate
Q15.
How do you pass parameters to a Glue job and access them in the script?
intermediate