Research Scientist - Image to Video

San Francisco
Permanent
150000 - 230000 per annum
Technology

About the Company

This company develops generative video models that let users create animated pictures with ease, driven either by their own existing audio or by text-to-speech models. Having raised more than $10m and drawn attention with its first two foundation model releases, the company is expanding its team in San Francisco.

About the Role

Generative AI Model Research and Development: Lead cutting-edge research in the development of generative AI models, particularly diffusion models, for applications in image, video, text, and audio synthesis. The role emphasizes creating models that deliver high fidelity, realism, and controllability in multi-modal outputs, such as dynamic avatars and visual effects.

Advanced Model Conditioning and Optimization: Develop and refine advanced conditioning techniques, leveraging methods such as prompt-tuning, latent space manipulation, and attention-based control mechanisms to enhance model precision, controllability, and adaptability for specific user-defined outputs.

High-Performance Distributed Training: Architect and manage large-scale training pipelines across distributed systems, using frameworks such as Ray, PyTorch Distributed, and Horovod. Handle multi-node, multi-GPU environments and optimize training to scale to petabyte-scale datasets, making efficient use of cloud and on-premises compute resources.

Core Requirements:

Generative Model Expertise:

Deep expertise in training diffusion models, GANs, VAEs, or similar architectures, with a focus on video and image synthesis at high resolution and fidelity.
Experience working with transformer-based models and autoregressive architectures for generative tasks.
Knowledge of techniques for improving model convergence, stability, and generalization in large-scale environments.
Demonstrated ability to develop models that integrate user-driven customization, leveraging tools such as CLIP, T5, or similar language-to-vision models.

Technical Skills:

Mastery of deep learning libraries and frameworks such as PyTorch, TensorFlow, JAX, or MXNet.
Strong background in advanced optimization techniques such as distributed stochastic gradient descent (SGD), mixed-precision training, and model parallelism.
Experience with GPU acceleration and high-performance computing, including CUDA, TensorRT, and NCCL for optimizing large-scale model training.
Knowledge of computer vision techniques like super-resolution, motion estimation, and image segmentation, relevant to generative AI tasks.

Research and Innovation:

Experience publishing in top AI conferences (NeurIPS, CVPR, ICML) or contributing to open-source projects in the generative AI space.
A track record of applying state-of-the-art generative models in real-world applications, such as 3D rendering, visual content creation, or interactive avatars.

If you are working in generative AI and want to join an early-stage company with significant equity upside, send your resume today.

This is an on-site position in San Francisco.
