Amazon

Data Engineer

  • Internship
  • Hyderabad
  • IT development

Job description

DESCRIPTION

We’re looking for Data Engineers to help us grow our Data Lake, which is built using a serverless architecture, with 100% native AWS components including Redshift Spectrum, Athena, S3, Lambda, Glue, EMR, Kinesis, SNS, CloudWatch, and more!
Our Data Engineers build the ETL and analytics solutions for our internal customers to answer questions with data and drive critical improvements for the business. Our Data Engineers use best practices in software engineering, data management, data storage, data compute, and distributed systems. We are passionate about solving business problems with data!
Our team is part of the AWS Infrastructure organization, which is responsible for planning, building, and operating all of our data centers around the world. This includes the global supply chain for physical servers and components, networking gear, power equipment, etc. Lots of big and fascinating data if you’re interested in the world’s largest cloud computing infrastructure.
JOB DUTIES
· Develop and maintain automated ETL pipelines for big data using scripting languages such as Python, Spark, and SQL, and AWS services such as S3, Glue, Lambda, SNS, SQS, and KMS. Example: ETL jobs that process a continuous flow of JSON source files and output the data in a business-friendly Parquet format that can be efficiently queried via Redshift Spectrum using SQL to answer business questions.
· Develop and maintain automated ETL monitoring and alarming solutions using Python, Spark, SQL, and AWS services such as CloudWatch and Lambda.
· Implement and support reporting and analytics infrastructure for internal business customers using AWS services such as Athena, Redshift Spectrum, EMR, and QuickSight.
· Develop and maintain data security and permissions solutions for enterprise-scale data warehouse and data lake implementations, including data encryption, database user access controls, and logging.
· Develop data objects for business analytics using data modeling techniques.
· Develop and optimize data warehouse and data lake tables using best practices for DDL, physical and logical tables, data partitioning, compression, and parallelization.
· Develop and maintain data warehouse and data lake metadata, data catalog, and user documentation for internal business customers.
· Help internal business customers develop, troubleshoot, and optimize complex SQL and ETL solutions to solve reporting, metrics, and analytics problems.
· Work with internal business customers and software development teams to gather and document requirements for data publishing and data consumption via data warehouse, data lake, and analytics solutions.
· Develop, test, and deploy code using internal software development toolsets. This includes the code for deploying infrastructure and solutions for secure data storage, ETL pipelines, data catalog, and data query.
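The first duty above (ingest JSON, publish a partitioned, query-friendly layout) can be sketched in outline. This is an illustrative stand-alone example, not Amazon's actual pipeline: a production job would run on Glue or Spark and write Parquet to S3, whereas here stdlib Python simply groups JSON-lines records into date partitions, and the `event_time` field is an assumed record schema.

```python
import json
import io
from collections import defaultdict

def partition_json_records(json_lines):
    """Group raw JSON-lines records into date partitions,
    mimicking an ETL step that writes one output file per
    table/dt=YYYY-MM-DD/ prefix for partition-pruned queries."""
    partitions = defaultdict(list)
    for line in json_lines:
        record = json.loads(line)
        # Partition key derived from the record's event timestamp.
        dt = record["event_time"][:10]  # "YYYY-MM-DD"
        partitions[f"dt={dt}"].append(record)
    return dict(partitions)

# Simulated continuous flow of JSON source records.
source = io.StringIO(
    '{"event_time": "2024-05-01T10:00:00Z", "bytes": 512}\n'
    '{"event_time": "2024-05-01T11:30:00Z", "bytes": 128}\n'
    '{"event_time": "2024-05-02T09:15:00Z", "bytes": 256}\n'
)
parts = partition_json_records(source)
```

Partitioning by date like this is what lets engines such as Redshift Spectrum or Athena scan only the partitions a query actually touches.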

PREFERRED QUALIFICATIONS

· Experience building enterprise-scale data warehouse and data lake solutions end-to-end.
· Knowledgeable about a variety of strategies for ingesting, modeling, processing, and persisting data.
· Experience with native AWS technologies for data and analytics such as Redshift Spectrum, Athena, S3, Lambda, Glue, EMR, Kinesis, SNS, CloudWatch, etc.
· Write secure, stable, testable, maintainable code with minimal defects.
· Meets/exceeds Amazon’s functional/technical depth and complexity for this role.
· Meets/exceeds Amazon’s leadership principles requirements for this role.

Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer, and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, disability, age, or other legally protected status.

Desired profile

BASIC QUALIFICATIONS

· Bachelor's degree in Computer Science, Info Systems, Business, or related field.
· 2+ years of experience with distributed-systems concepts from a data storage and compute perspective (e.g., data lake architectures).
· 2+ years of experience with one or more query languages (e.g. SQL), schema definition languages (e.g. DDL), and scripting languages (e.g. Python) to build data solutions.
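The three skill areas above (a query language, a schema definition language, and a scripting language) compose naturally in even a small data solution. As a hedged illustration using only Python's standard-library sqlite3 rather than Redshift or Athena, and an invented `server_metrics` table, a minimal DDL-plus-query round trip looks like:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the table schema.
conn.execute("""
    CREATE TABLE server_metrics (
        region TEXT,
        dt     TEXT,
        watts  REAL
    )
""")

# Scripting: load rows from Python.
rows = [
    ("us-east-1", "2024-05-01", 410.5),
    ("us-east-1", "2024-05-02", 398.0),
    ("eu-west-1", "2024-05-01", 275.25),
]
conn.executemany("INSERT INTO server_metrics VALUES (?, ?, ?)", rows)

# SQL: aggregate power draw by region.
result = dict(conn.execute(
    "SELECT region, SUM(watts) FROM server_metrics GROUP BY region"
))
```

The same pattern, scaled up, is what the job duties describe: DDL defines warehouse tables, Python scripts move the data, and SQL answers the business question.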
