
Data Engineer - Remote

IT · Full-time · Permanent

About Us

Halo Labs is a future-focused, end-to-end data solutions firm - transforming tomorrow, today. Our intelligent, secure technology systems and data-driven solutions deliver meaningful outcomes and unlock lasting value.

Why work at Halo Labs?

Leading Innovation: We don’t just solve problems; we drive a continuous stream of innovation.

Exceptional Perks: Enjoy outstanding perks like a dedicated learning budget, performance bonuses, and comprehensive wellbeing support.

Remote-First Organisation: Experience the perks of remote work while having the flexibility to travel to client locations throughout Australia.

Inclusive and Engaging: Celebrate diversity in a welcoming space that thrives on new ideas and open conversations, all in a respectful environment.

Inspiring Origins: With a compelling founder story, we are a customer-focused, culture-first organisation.

About the role

• Design, develop, and maintain reusable in-house PySpark frameworks to enforce standardised data engineering patterns across the SaaS platform

• Architect and implement scalable, production-grade ETL/ELT pipelines across Azure and AWS environments

• Build distributed data processing solutions using Python and PySpark on Databricks

• Develop batch and near real-time ingestion pipelines integrating third-party clinical systems, healthcare APIs, and external enterprise platforms

• Design secure data integration patterns (REST APIs, SFTP, event-driven ingestion, webhooks) ensuring compliance and data integrity

• Work closely with Software Engineers to embed data services directly into the SaaS product architecture

• Contribute to backend system design discussions to ensure data layer scalability, observability, and performance

• Implement CI/CD pipelines using Azure DevOps, Git, and Azure Pipelines for automated deployment and testing of data workloads

• Apply infrastructure-as-code and environment management best practices across Azure and AWS

• Optimise Spark jobs, cluster configurations, and storage strategies for performance and cost efficiency

• Design and maintain robust data models, including dimensional models and SaaS-oriented data schemas

• Implement data validation, monitoring, and alerting to ensure pipeline reliability and production stability

• Provide technical mentorship and enforce engineering standards across the analytics and data engineering team

About you

• Strong hands-on experience with Azure services including Databricks, Azure Data Factory, Azure SQL, Azure Storage, and Azure DevOps

• Practical experience with AWS services relevant to modern data platforms (S3, Lambda, RDS, Glue, IAM, etc.)

• Advanced proficiency in Python, SQL, and PySpark for large-scale distributed data processing

• Deep experience configuring and managing Databricks clusters for scalable big data workloads

• Experience building production-ready data pipelines in a SaaS or product-led engineering environment

• Strong understanding of cloud-native data architecture, including data lakes, lakehouse architecture, and modular pipeline design

• Experience integrating with third-party systems via APIs and secure data exchange mechanisms

• Exposure to healthcare or regulated data environments, including handling sensitive data securely

• Strong knowledge of data modelling, metadata management, and data governance principles

• Experience implementing automated testing frameworks for data pipelines

• Solid understanding of DevOps practices including Git workflows, branching strategies, and CI/CD automation

• Degree in Computer Science, Engineering, Data Science, or related technical field


Apply for this job
Join our team

Do you have what it takes to change the world?

We are always looking for talented individuals to join our team and contribute to our vision. If you’re passionate about driving innovation, creating exceptional digital experiences, and making a real impact, we’d love to hear from you.