- Creating complex data processing pipelines as part of diverse, high-energy teams designing scalable implementations of the models developed by our Data Scientists.
- Hands-on programming based on TDD, usually in a pair programming environment.
- Deploying data pipelines to production following Continuous Delivery practices.
- Creating and maintaining clear documentation of data models/schemas as well as transformation/validation rules.
- Troubleshooting and remediating data quality issues raised by pipeline alerts or downstream consumers.
- Engaging with stakeholders to gather requirements and deliver data solutions.
- Advising clients on the use of different distributed storage and computing technologies from the many options available in the ecosystem.