Thursday, February 26, 2026

Lead PySpark Engineer for Owings Mills, MD (onsite)

Hi,

 

Hope you are doing well!

 

I have an urgent requirement with one of my clients. Please find the job details below and forward your updated resume along with your contact details to ujjwal.k@noviainfotech.com, or call me at 972-903-8535.

 

Position – Lead PySpark Engineer

Duration – 6 months

Location – Owings Mills, MD (onsite)

MOI (Mode of Interview) – Virtual Round

 

Job Description:

 

  • We are seeking a highly experienced Lead PySpark Engineer to design, develop, and optimize large-scale distributed data processing systems.
  • The ideal candidate will lead data engineering initiatives, mentor team members, and drive best practices for building scalable, high-performance data pipelines using PySpark and modern big data technologies.
  • This role requires deep expertise in PySpark, distributed computing, and enterprise data architecture.

 

Key Responsibilities:

 

  • Lead the design and development of scalable data pipelines using PySpark
  • Architect and implement large-scale distributed data processing solutions
  • Optimize Spark jobs for performance, scalability, and cost-efficiency
  • Collaborate with data scientists, analysts, and business stakeholders
  • Drive data modeling, transformation, and ETL/ELT best practices
  • Provide technical leadership and mentor junior engineers
  • Ensure data quality, governance, and security compliance
  • Conduct code reviews and enforce engineering standards
  • Troubleshoot and resolve complex production issues
  • Contribute to CI/CD and DevOps practices for data engineering

 

Essential Skills (Must Have):

 

  • 10+ years of overall IT experience
  • 5+ years of hands-on experience with PySpark
  • Strong expertise in Spark Core, Spark SQL, and Spark DataFrames
  • Deep understanding of distributed computing concepts
  • Experience with big data ecosystems (Hadoop, Hive, HDFS)
  • Proficiency in Python programming
  • Strong SQL skills
  • Experience with data warehousing concepts and ETL processes
  • Cloud platform experience (AWS / Azure / GCP)
  • Experience working with large datasets (TB to PB scale)
  • Strong debugging and performance tuning skills

 

Desirable Skills (Good to Have):

 

  • Experience with Delta Lake / Iceberg
  • Knowledge of Kafka or real-time streaming frameworks
  • Experience with Databricks
  • Exposure to Airflow or other orchestration tools
  • CI/CD and DevOps pipeline experience
  • Experience in data lake architecture
  • Knowledge of containerization (Docker, Kubernetes)
  • Understanding of data governance and compliance standards

 

Soft Skills:

 

  • Strong leadership and mentoring capability
  • Excellent problem-solving skills
  • Ability to work in fast-paced environments
  • Strong communication and stakeholder management skills
  • Ownership mindset with attention to detail

--
You received this message because you are subscribed to the Google Groups "NoviaJobs" group.
To unsubscribe from this group and stop receiving emails from it, send an email to noviajobs+unsubscribe@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/noviajobs/CAOVi2ru8tA6_JynJjHg5aENkckYBKYdmB4EMj%3DpRZS-gun3P_Q%40mail.gmail.com.
