Enlighten HR Consulting

Lead Data Engineer - Python/Azure Databricks

Job Location

Pune, India

Job Description

If you are passionate about building state-of-the-art data platforms and powering the next generation of compute and AI applications, we'd love to hear from you. This is an exciting opportunity to leverage your expertise in distributed computing frameworks to make a significant impact as we push the boundaries of supply chain planning and, eventually, the adoption of Gen AI.

JOB BRIEF:

We are seeking an experienced Data Engineer to join our team and lead the development of a cutting-edge data platform. The platform will leverage distributed computing frameworks such as Apache Spark, Databricks, and Snowflake to enable near-real-time supply chain planning, eventually leading to advanced analytics and data insights through the adoption of Generative AI (GenAI) technologies across our product base.

KEY RESPONSIBILITIES:

As a Lead Data Engineer, the candidate will:

- Design and build a highly scalable, fault-tolerant data platform optimized for distributed computing and large-scale data processing
- Implement data pipelines and ETL/ELT processes using distributed computing frameworks to efficiently ingest, transform, and load massive datasets from various sources
- Leverage cloud data platforms to enable seamless data sharing, near-zero maintenance, and fast analytics on structured and semi-structured data
- Collaborate with data scientists, machine learning engineers, and software developers to understand data requirements and build solutions to power GenAI applications
- Optimize distributed computing jobs and queries for maximum performance and cost efficiency
- Implement data governance, security, and compliance best practices
- Provide guidance on distributed computing architecture and mentor junior data engineers

QUALIFICATIONS:

- 5 years of experience as a Data Engineer working with big data technologies.
- Strong proficiency in SQL, Python programming, and data modeling techniques.
- Deep expertise in distributed computing principles and frameworks (e.g., Apache Spark), including SQL, streaming, and optimizing jobs for scale and efficiency.
- Hands-on experience developing and deploying distributed computing applications on cloud-based platforms (e.g., AWS EMR, Azure HDInsight, or equivalent).
- Strong understanding of cloud data platform architectures and best practices for ELT/ETL, data sharing, and query optimization (e.g., AWS Athena, AWS Glue, Azure Synapse Analytics, or equivalent).
- Experience enabling application engineers to build applications on the data platform through APIs and abstractions.
- Experience with orchestration frameworks such as Apache Airflow and data streaming technologies such as Kafka.
- Experience building and optimizing data pipelines for machine learning applications.
- Knowledge of data modeling, data warehousing, and schema design.
- Familiarity with public cloud platforms such as AWS, Azure, or GCP.
- Excellent problem-solving and communication skills.
- Bachelor's or Master's degree in Computer Science (preferred), Engineering, or a related field.

EXPERIENCE: 5 Years

(ref:hirist.tech)

Posted Date: 10/9/2024

Contact Information

Contact Human Resources
Enlighten HR Consulting

Posted

October 9, 2024
UID: 4884800116