Data Engineer in Numlabs - Data Science Services

Posted more than 30 days ago

Numlabs - Data Science Services

Full-time work

Experience: Medior (mid-level), with 3+ years of relevant experience

Responsibilities:

Pipeline Development:

Design, build, and optimize data pipelines for model training and inference systems.
Collaborate with data scientists and machine learning engineers to ensure efficient data preparation and feature engineering.

Scaled System Development:

Architect and implement scalable systems for model inference, data retrieval, and acquisition to support high-performance AI applications.
Optimize performance and reliability to handle large-scale data processing.

Observability:

Develop robust logging and monitoring solutions for AI systems.
Ensure traceability, debugging, and performance monitoring across the AI platform.

Data Management:

Work seamlessly with both structured and unstructured data sources to support diverse AI initiatives.
Ensure all data engineering practices comply with company policies and industry regulations for data security and privacy.

Innovation:

Stay abreast of the latest advancements in data engineering and AI technologies to continuously improve our systems and processes.

Requirements:

Bachelor’s or Master’s degree in Computer Science, Data Science, or a related field.
Advanced Python knowledge for data processing and scripting.
Hands-on experience with one or more cloud computing platforms (Azure, AWS, GCP).
Hands-on experience with big data technologies and distributed computing frameworks.
Proficiency in RDBMS/NoSQL data stores and appropriate use cases.
Experience with Data as Code: version control, small and regular commits, unit tests, CI/CD, and packaging. Familiarity with containerization tools such as Docker and Kubernetes is a plus.
Understanding of AI/ML principles and practices, including model training, inference, and deployment.
Experience with Infrastructure as Code is a plus.
Strong problem-solving skills and attention to detail.
Good communication skills, fluent English.

Expected start date: September 1st, 2024

Remote vs Onsite: Remote, with possible occasional in-person team sessions, workshops, or gatherings (i.e. once per quarter), likely to take place in Prague

HackerRank challenge: Yes

Working overlap needed: 9-6 or 10-7 CET; flexibility for a wider overlap is appreciated

Salary: 100-150/h B2B
