Recruit Alliance
About the Company
An Israel-headquartered technology company with R&D in Kyiv, Ukraine.
The company ensures automated, real-time discovery, mapping, and tracking of sensitive personal data flow.
The company has designed an AI-based sustainable data discovery and management platform, called Inventa,
to ensure the privacy, security, and governance of data. Our target market is large, distributed, hybrid customers that hold
petabytes of information in different structures and forms in different locations — on-prem and cloud.
About the product
The product delivers a highly accurate master catalog of sensitive data usage
so that businesses can manage data security and compliance alongside their infrastructure-based security and compliance programs.
It is a fully automated solution that covers data in any format: structured or unstructured, data-in-motion or data-at-rest,
known or unknown. It covers all aspects of data processing in one place and aggregates them into a master catalog
containing all customers’ and employees’ information.
Description:
Our Data Trainer / Data Scientist generates and maintains high-quality datasets across different business domains and ensures that
all samples are well written, technically sound, and useful for end users. By researching and creating targeted content for AI/software developers,
QA teams, and field engineers, they will play an essential part in improving our data-driven solutions and expanding our business offering.
Responsibilities
• Create datasets that represent customers’ data for training of modules and for use by the QA team, developers, and field engineers
• Develop extraction of sensitive data elements for the product, as well as custom extractions per customer needs
Requirements:
• Extensive experience in developing complex ETL pipelines for data analytics and dataset preparation for machine learning development and QA testing,
ideally containing natural language text and patterns
• Experience in creating, maintaining, and serving well-organised datasets containing both tabular data and unstructured documents
• Proficiency in Python and industry-standard NLP and analytics tools such as pandas, NumPy, Gensim, spaCy, and NLTK, as well as SQL and NoSQL databases
• Meticulous attention to data quality, a keen understanding of business needs across different industry domains, and a drive for continuous improvement of data-driven solutions
• Ability to write high-quality modular code in a collaborative environment and to contribute to code reviews within the team
• Experience in working with software developers, product managers, and business stakeholders on the integration of data solutions and the refinement of business requirements
• Effective communication and the ability to write clear, well-organised software and data documentation
Nice to have
• Machine learning and artificial intelligence expertise, including the integration of machine learning models into the company’s product line
• Understanding of the software development process, including version control, testing, and deployment through CI/CD (Continuous Integration/Continuous Deployment) pipelines
• Experience in developing, implementing, and optimizing machine learning models, as well as the infrastructure for their training and deployment
Benefits
• Friendly and highly professional atmosphere, laptop or workstation.
• Benefits package including competitive salary and option plan
• Unlimited access to the Well-being platform “Rozumiu” for the employee and thei
• 14 paid sick leave days, 20 paid vacation days, and national holidays