Senior Data Engineer
$5,000 - $8,000 per month
As we transition to a modern big data infrastructure, PySpark plays a critical role in powering high-performance data processing. We are seeking a Data Engineer / PySpark expert to optimize data pipelines, enhance processing efficiency, and drive cost-effective cloud operations. This role will have a direct impact on scalability, performance, and real-time data processing, ensuring the company remains competitive in data-driven markets.
You’ll be working closely with a Data Platform Architect and a newly formed team of four Data Engineers based in India (GMT+5:30) and one Data Engineer in Uzbekistan (GMT+5). Additionally, we’re planning to hire two more Senior Data Engineers in Georgia later this year. In this role, you’ll report to the CTO, who is based in the GMT-8 time zone, and to the VP of Engineering (EDT/EST).
Position Details
- Role: Senior Data Engineer
- Location: Remote (We’re looking for candidates based in Georgia, Romania, and the Czech Republic only)
- Employment: Service Agreement (B2B contract; you’ll need a legal entity to sign)
- Start Date: ASAP
- Salary: $5,500 - $8,000 USD per month GROSS (fixed income, paid via SWIFT)
- Working Hours: 11 AM to 7 PM local time. No night or weekend work is expected.
- Time Overlaps: Sync-ups with R&D (Pune, India) in GMT+5:30 and developers in GMT-5, plus occasional meetings with the VP of Engineering in EST/EDT and the CTO in GMT-8.
- Equipment: The company will provide a laptop.
What You’ll Be Doing
- Optimize Data Processing Pipelines: Fine-tune PySpark jobs for maximum performance, scalability, and cost efficiency, enabling smooth real-time and batch data processing.
- Modernize Legacy Systems: Drive the migration from traditional .NET, C#, and relational database systems to a modern big data tech stack.
- Build Scalable ETL Pipelines: Design and maintain robust ETL/ELT workflows capable of handling large volumes of data within our Bronze/Silver/Gold data lake architecture.
- Enhance Apache Spark Workloads: Apply best practices such as memory tuning, efficient partitioning, and caching to optimize Spark jobs (see the illustrative sketch after this list).
- Leverage Cloud Platforms: Use AWS EMR, Databricks, and other cloud services to support scalable, low-maintenance, high-performance analytics environments.
- Balance Cost & Performance: Continuously monitor resource usage, optimize Spark cluster configurations, and manage cloud spend without compromising availability.
- Support Real-Time Data Streaming: Contribute to event-driven architectures by developing and maintaining real-time streaming data pipelines.
- Collaborate Across Teams: Partner closely with data scientists, ML engineers, integration specialists, and developers to prepare and optimize data assets.
- Enforce Best Practices: Implement strong data governance, security, and compliance policies to ensure data integrity and protection.
- Drive Innovation: Participate in global initiatives to advance supply chain technology and real-time decision-making capabilities.
- Mentor Junior Engineers: Share your knowledge of PySpark, distributed systems, and scalable architectures to help develop the team’s capabilities.
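For illustration only, here is a minimal PySpark sketch of the kind of partitioning and caching tuning mentioned above. All paths, column names, and partition counts are hypothetical placeholders, not Blue Ridge Global's actual stack.

```python
# Illustrative sketch only: hypothetical paths, columns, and partition counts.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tuning-sketch")
    # Match shuffle parallelism to the cluster size instead of the 200 default.
    .config("spark.sql.shuffle.partitions", "400")
    .getOrCreate()
)

# Hypothetical Bronze-layer inputs.
orders = spark.read.parquet("s3://example-bucket/bronze/orders/")
events = spark.read.parquet("s3://example-bucket/bronze/events/")

# Repartition on the join key so the shuffle is spread evenly across executors.
orders = orders.repartition(400, "customer_id")

# Cache the joined DataFrame because several downstream aggregations reuse it.
enriched = orders.join(events, "customer_id", "left").cache()

daily_counts = enriched.groupBy("order_date").count()
daily_counts.write.mode("overwrite").parquet("s3://example-bucket/silver/daily_counts/")
```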
Experience & Expertise:
- 5+ years as a Data Engineer, with solid experience in big data ecosystems.
- 7+ years of hands-on AWS experience is a must, including deep familiarity with EMR, IAM, VPC, EKS, ALB, and Lambda.
- Cloud experience beyond AWS (GCP or Azure) is a strong plus.
- Proficiency with Python (including data structures and algorithms), SQL, and data modeling.
- Strong expertise in distributed computing frameworks, particularly Apache Spark and Airflow.
- Experience with streaming technologies such as Kafka.
- Proven track record optimizing Spark jobs for scalability, reliability, and performance.
- Familiarity with cloud-native ETL/ELT workflows, data sharing techniques, and query optimization (e.g., AWS Athena, Glue, Databricks).
- Experience with complex business logic implementation and enabling application engineers through APIs and abstractions.
- Solid understanding of data modeling, warehousing, and schema design.
Soft Skills:
- Strong problem-solving skills and proactive communication.
- Fluent English - B2 or higher (both written and verbal).
Preferred Skills & Certifications:
- Familiarity with .NET applications structure and deployment.
- Relevant cloud certifications (AWS Solutions Architect, Developer, Big Data Specialty).
- Certifications or proven experience in Databricks, Apache Spark, Apache Airflow, and data modeling are a plus.
Recruitment Process
- #1 Initial Interview: Up to 1 hour with HR and/or a self-assessment form. If you prefer, you can skip the call and discuss all questions and details in writing instead. Just let us know!
- #2 Managerial Interview (Optional): 30-60 minutes. You will meet with the CTO to learn more about the company, the position, and future plans directly from the source.
- #3 Test Assignment: Up to 113 minutes on the iMocha platform (data structures - graphs, array and string manipulation - all in Python, plus a few multiple-choice questions on Spark).
- #4 Technical Interview: Up to 1 hour with a Platform/Application Architect, covering the technical interview format and key domains.
- # 5 Offer & Paperwork: Up to 30 minutes with the CTO to finalize conditions and complete necessary paperwork.
- # 6 Onboarding: Get ready to join the team and start your journey!
Published on: 6/6/2025

Blue Ridge Global
An American SaaS company that has been helping large retailers, distributors, and manufacturers forecast demand, manage inventory, and improve operational processes for over 10 years.
- 220+ clients worldwide (primarily in the US and Norway)
- 20%+ annual growth
- A leader in G2 rankings in categories such as Demand Planning, Supply Chain Planning, and others