This job has expired and no longer accepts applications.

ML / AI Engineer (Video Generation)

Kazakhstan · On-site

We are building Cursor for Video Generation — an AI-native product that reimagines how videos are created, edited, and produced using natural language and intelligent models.

We’re looking for an ML/AI Engineer who is not only technically excellent, but also creative, collaborative, and deeply curious about how video production actually works. This role sits at the intersection of machine learning, creative workflows, and video production.

You will work closely with video editors, content creators, prompt engineers, and product teams to translate creative intent and post-production workflows into scalable, intelligent systems.

What you will work on

Product & Model Development

Design, train, and iterate on ML models for video generation, editing, and transformation, including:
- Video synthesis and generation
- Scene detection, cuts, transitions
- Style, pacing, and narrative coherence
- Text-to-video and prompt-based editing workflows
Own the end-to-end ML lifecycle: research → prototyping → training → evaluation → deployment.
Collaborate with product and engineering to turn models into fast, reliable, user-facing features.

Creative & Cross-Functional Collaboration

Work hands-on with creative teams (video editors, motion designers, producers) to:
- Understand real-world video production and post-production workflows
- Learn how editors think about timing, rhythm, storytelling, and visual language
Translate creative feedback and artistic intent into technical requirements, model improvements, and system design.
Act as a bridge between creativity and engineering, ensuring the AI enhances — not replaces — creative control.

Research & Innovation

Stay current with advances in:
- Video diffusion models
- Multimodal models (text, video, audio)
- Temporal modeling and long-range coherence
Experiment rapidly and push the boundaries of what AI-assisted video creation can feel like.

Your must-haves

(You don’t need to meet every single requirement to be a strong candidate. If you’re excellent in a few of these areas and eager to grow, we’d still love to hear from you.)

Technical Skills

Proven track record of training and deploying ML models into production. Experience with large-scale vision (especially diffusion models), NLP, or multimodal systems is a big plus.
Strong skills in model training, optimization, and evaluation. Hands-on experience with distributed training and multi-GPU systems is a big plus.
Proficiency with Python and ML frameworks (PyTorch preferred).
Experience deploying ML systems into production environments.
Solid understanding of model evaluation, tradeoffs, and performance constraints.

Creative & Domain Understanding

Genuine interest in video creation, editing, and storytelling.
Familiarity with video concepts such as:
- Cuts, transitions, pacing, timelines
- Post-production workflows
- Social vs long-form video formats
Ability to reason about subjective qualities like “feel,” “flow,” and “style,” and translate them into technical approaches.

Soft Skills

Excellent communication and collaboration skills.
Comfortable working with non-technical creative stakeholders.
Ability to listen deeply, ask the right questions, and synthesize ambiguous input.
High creativity and product intuition — you care about how the tool feels to use, not just accuracy metrics.
Ownership mindset: you take ideas from zero to shipped.