Hi,
I hope you are doing well.
I have an urgent requirement with one of my clients. Please find the job details below and forward your updated resume along with your contact details to ujjwal.k@noviainfotech.com, or call me at 972-903-8535.
Position – Lead PySpark Engineer
Duration – 6 months
Location – Owings Mills, MD - onsite
Mode of Interview (MOI) – Virtual round
Job Description:
- We are seeking a highly experienced Lead PySpark Engineer to design, develop, and optimize large-scale distributed data processing systems.
- The ideal candidate will lead data engineering initiatives, mentor team members, and drive best practices for building scalable, high-performance data pipelines using PySpark and modern big data technologies.
- This role requires deep expertise in PySpark, distributed computing, and enterprise data architecture.
Key Responsibilities:
- Lead the design and development of scalable data pipelines using PySpark
- Architect and implement large-scale distributed data processing solutions
- Optimize Spark jobs for performance, scalability, and cost-efficiency
- Collaborate with data scientists, analysts, and business stakeholders
- Drive data modeling, transformation, and ETL/ELT best practices
- Provide technical leadership and mentor junior engineers
- Ensure data quality, governance, and security compliance
- Conduct code reviews and enforce engineering standards
- Troubleshoot and resolve complex production issues
- Contribute to CI/CD and DevOps practices for data engineering
Essential Skills (Must Have):
- 10+ years of overall IT experience
- 5+ years of hands-on experience with PySpark
- Strong expertise in Spark Core, Spark SQL, and Spark DataFrames
- Deep understanding of distributed computing concepts
- Experience with big data ecosystems (Hadoop, Hive, HDFS)
- Proficiency in Python programming
- Strong SQL skills
- Experience with data warehousing concepts and ETL processes
- Cloud platform experience (AWS / Azure / GCP)
- Experience working with large datasets (TB to PB scale)
- Strong debugging and performance tuning skills
Desirable Skills (Good to Have):
- Experience with Delta Lake / Iceberg
- Knowledge of Kafka or real-time streaming frameworks
- Experience with Databricks
- Exposure to Airflow or other orchestration tools
- CI/CD and DevOps pipeline experience
- Experience in data lake architecture
- Knowledge of containerization (Docker, Kubernetes)
- Understanding of data governance and compliance standards
Soft Skills:
- Strong leadership and mentoring capability
- Excellent problem-solving skills
- Ability to work in fast-paced environments
- Strong communication and stakeholder management skills
- Ownership mindset with attention to detail