PhD, FHEA, RAEng Research Fellow

I am a Royal Academy of Engineering Research Fellow in the Centre for Vision, Speech and Signal Processing, University of Surrey, working on 4D Vision for perceptive machines. The emergence of machines that interact with their environment has led to an increasing demand for automatic visual understanding of real-world scenes. My research exploits Artificial Intelligence (AI) to better understand complex scenes so that machines can efficiently model and interpret the real world for a range of socially beneficial applications, including autonomous systems, augmented reality and healthcare.

I am currently a Lecturer in Computer Vision and AI at CVSSP, University of Surrey, where I completed my PhD on 'General dynamic scene reconstruction from multi-view videos' in 2016, supervised by Prof. Adrian Hilton. I previously worked in Computer Vision at Samsung Research Institute, Bangalore, India for 3 years (2010 - 2013). In 2010 I received my M.Tech. degree in Human Computer Interaction from the Indian Institute of Technology (IIT), Kanpur, India, supervised by Prof. K.S. Venkatesh. Please refer to my CV for more information.

Research interests: 3D/4D Computer Vision, Scene Understanding, Generative AI, LLMs, Multi-view Performance Capture and Immersive Technologies.



Research


Projects

CoSTAR National Lab for R&D in Creative Technology
Arts and Humanities Research Council (AHRC)
Oct 2023 - Sep 2033

AI4ME: AI for Personalised Media Experiences
Prosperity Partnership with BBC Research and Development
Oct 2021 - Sep 2026
Website

4D Vision for Perceptive Machines
Royal Academy of Engineering funded project
Aug 2018 - Jul 2023

ALIVE: Live Action Light Fields for Immersive VR Experiences
Innovate UK project in collaboration with Figment Productions and Foundry
October 2016 - December 2017
Figment Productions Foundry Video

IMPART: Intelligent Management Platform for Advanced Real-Time media processes
EU FP7 project in collaboration with Filmlight, Double Negative, AUTH, UPF and BUT
July 2013 - October 2015
Website Video



Publications

CAD - Contextual Multi-modal Alignment for Dynamic AVQA
Asmar Nadeem, Adrian Hilton, Robert Dawes, Graham Thomas, and Armin Mustafa
at IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024
Website

DTF-AT: Decoupled Time-Frequency Audio Transformer for Event Classification
Tony Alex, Sara Ahmed, Armin Mustafa, Muhammad Rana and Philip Jackson
at The 38th Annual AAAI Conference on Artificial Intelligence 2024

UPGPT: Universal Diffusion Model for Person Image Generation, Editing and Pose Transfer
Soon Cheong, Armin Mustafa, and Andrew Gilbert
at International Conference on Computer Vision Workshops (ICCVW) 2023
Website Video

PAT: Position-Aware Transformer for Dense Multi-Label Action Detection
Faegheh Sardari, Armin Mustafa, Philip JB Jackson, and Adrian Hilton
at International Conference on Computer Vision Workshops (ICCVW) 2023
Website

SEM-POS: Grammatically and Semantically Correct Video Captioning
Asmar Nadeem, Adrian Hilton, Robert Dawes, Graham Thomas, and Armin Mustafa
at IEEE Conference on Computer Vision & Pattern Recognition Workshops (CVPRW) 2023
Website

KPE: Keypoint Pose Encoding for Transformer-based Image Generation
Soon Cheong, Armin Mustafa, and Andrew Gilbert
at The 33rd British Machine Vision Conference (BMVC) 2022
Website Video

4D Temporally Coherent Multi-Person Semantic Reconstruction and Segmentation
Armin Mustafa, Chris Russell, and Adrian Hilton
in International Journal of Computer Vision (IJCV) 2022
Website

Multi-person Implicit Reconstruction from a Single Image
Armin Mustafa, Akin Caliskan, Lourdes Agapito, and Adrian Hilton
at IEEE Conference on Computer Vision & Pattern Recognition (CVPR) 2021
Website Video

SILT: Self-supervised Lighting Transfer Using Implicit Image Decomposition
Nikolina Kubiak, Armin Mustafa, Graeme Phillipson, Stephen Jolly, and Simon Hadfield
at The 32nd British Machine Vision Conference (BMVC) 2021
Website

Temporal Consistency Loss for High Resolution Textured and Clothed 3D Human Reconstruction From Monocular Video
Akin Caliskan, Armin Mustafa, and Adrian Hilton
at IEEE Conference on Computer Vision & Pattern Recognition Workshops (CVPRW) 2021
Website

Temporally Coherent General Dynamic Scene Reconstruction
Armin Mustafa, Marco Volino, Hansung Kim, Jean-Yves Guillemaut, and Adrian Hilton
in International Journal of Computer Vision (IJCV) 2021
Website Video

Multi-View Consistency Loss for Improved Single-Image 3D Reconstruction of Clothed People
Akin Caliskan, Armin Mustafa, Evren Imre, and Adrian Hilton
at The Asian Conference on Computer Vision (ACCV), 2020
Website

A* 3D Dataset: Towards Autonomous Driving in Challenging Environments
Quang-Hieu Pham, Pierre Sevestre, Ramanpreet Singh Pahwa, Huijing Zhan, Chun Ho Pang, Yuda Chen, Armin Mustafa, Vijay Chandrasekhar, and Jie Lin
at IEEE International Conference on Robotics and Automation (ICRA) 2020
Website

Light Field Video for Immersive Content Production (Book Chapter)
Marco Volino, Armin Mustafa, Jean-Yves Guillemaut and Adrian Hilton
in Real VR - Digital Immersive Reality 2020
Website

U4D: Unsupervised 4D Dynamic Scene Understanding
Armin Mustafa, Chris Russell, and Adrian Hilton
at International Conference on Computer Vision (ICCV) 2019
Website Video

Semantically Coherent 4D Scene Flow of Dynamic Scenes
Armin Mustafa and Adrian Hilton
in International Journal of Computer Vision (IJCV) 2019
Website

Learning Dense Wide Baseline Stereo Matching for People
Akin Caliskan, Armin Mustafa, Evren Imre, and Adrian Hilton
at International Conference on Computer Vision Workshop (ICCVW) 2019
Website

Light Field Compression using Eigen Textures
Marco Volino, Armin Mustafa, Jean-Yves Guillemaut and Adrian Hilton
at 3D Vision (3DV) 2019
Website

MSFD: Multi-scale segmentation based feature detection for wide-baseline scene reconstruction
Armin Mustafa, Hansung Kim, and Adrian Hilton
in IEEE Transactions on Image Processing (TIP) 2018
Website

4D Temporally Coherent Dynamic Light-field Video (Spotlight)
Armin Mustafa, Marco Volino, Jean-Yves Guillemaut and Adrian Hilton
at 3D Vision (3DV) 2017
Website Video

Semantically Coherent Co-segmentation and Reconstruction of Dynamic Scenes
Armin Mustafa and Adrian Hilton
at IEEE Conference on Computer Vision & Pattern Recognition (CVPR) 2017
Website Video

Temporally Coherent 4D Reconstruction of Complex Dynamic Scenes (Oral)
Armin Mustafa, Hansung Kim, Jean-Yves Guillemaut and Adrian Hilton
at IEEE Conference on Computer Vision & Pattern Recognition (CVPR) 2016
Website Video

4D Match Trees for Non-rigid Surface Alignment
Armin Mustafa, Hansung Kim and Adrian Hilton
at The 14th European Conference on Computer Vision (ECCV) 2016
Website Video

General Dynamic Scene Reconstruction from Multiple View Video
Armin Mustafa, Hansung Kim, Jean-Yves Guillemaut and Adrian Hilton
at International Conference on Computer Vision (ICCV) 2015
Website Video

Segmentation based Features for Wide-baseline Multi-view Reconstruction (Oral)
Armin Mustafa, Hansung Kim, Evren Imre and Adrian Hilton
at 3D Vision (3DV) 2015
Website Video

Multi Finger Gesture Recognition and Classification in Dynamic Environment under Varying Illumination upon Arbitrary Background (Book chapter)
Armin Mustafa and K.S. Venkatesh
in Speech, Image, and Language Processing for Human Computer Interaction: Multi-Modal Advancements 2012
Website

Background Reflectance Modeling for Robust Finger Gesture Detection in Highly Dynamic Illumination
Armin Mustafa and K.S. Venkatesh
at International Conference on Hybrid Information Technology, volume 6935 of Lecture Notes in Computer Science, 2011
Website

Patents

Method and apparatus for converting 2D video to 3D video
U.S. and Korean Patent 2013
Website

A Method and System for Human Gesture based on Visual Contour Analysis
Armin Mustafa and K.S. Venkatesh
Indian Patent 2010
Website

Service

British Machine Vision Conference (BMVC 2023), Aberdeen, UK
Workshop Chair

The 20th European Conference on Visual Media Production (CVMP 2023), London, UK
Conference Co-Chair

DynaVis - The Fourth International Workshop on Dynamic Scene Reconstruction @ CVPR 2023, Vancouver, Canada
Organiser
Organisers: Armin Mustafa, Marco Volino, Dan Casas, Christian Richardt and Adrian Hilton

British Machine Vision Conference (BMVC 2022), London, UK
Programme and Workshop Chair

The 19th European Conference on Visual Media Production (CVMP 2022), London, UK
Full Papers Chair

International Conference on 3D Vision (3DV 2021), Online
Publication Chair

The 18th European Conference on Visual Media Production (CVMP 2021), London, UK
Short Papers Chair

DynaVis - The Third International Workshop on Dynamic Scene Reconstruction @ CVPR 2021, Online
Organiser
Organisers: Armin Mustafa, Marco Volino, Michael Zollhofer, Dan Casas, Christian Richardt and Adrian Hilton

International Conference on 3D Vision (3DV 2020), Online
Area Chair

The 17th European Conference on Visual Media Production (CVMP 2020), Online
Public Relations Chair

DynaVis - The Second International Workshop on Dynamic Scene Reconstruction @ CVPR 2020, Seattle, Washington
Organiser
Organisers: Armin Mustafa, Marco Volino, Michael Zollhofer, Dan Casas, Christian Richardt and Adrian Hilton

DynaVis - The First International Workshop on Dynamic Scene Reconstruction @ CVPR 2019, Long Beach, California
Organiser
Organisers: Armin Mustafa, Marco Volino, Michael Zollhofer, Dan Casas and Adrian Hilton

BMVA Symposium on Dynamic Scene Reconstruction, London
Organiser
Organisers: Jean-Yves Guillemaut, Armin Mustafa and Marco Volino

Reviewer
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
IEEE/CVF International Conference on Computer Vision (ICCV)
European Conference on Computer Vision (ECCV)
British Machine Vision Conference (BMVC)
International Conference on 3D Vision (3DV)
EUROGRAPHICS
IEEE Transactions on Image Processing (IEEE TIP)
IEEE Transactions on Multimedia
European Conference on Visual Media Production (CVMP)
Computer Vision and Image Understanding (CVIU)
IET Computer Vision
3D Reconstruction in the Wild (3DRW)

Talks

  4D Vision for Scene Understanding, 28 Feb 2020
  at University of Birmingham, UK

  Unsupervised 4D Dynamic Scene Understanding, 23 Jan 2020
  at Queen Mary University of London, UK

  Unsupervised 4D Dynamic Scene Understanding, 28 Oct 2019
  at Moving Cameras Workshop, ICCV, Seoul, South Korea

  Semantically Coherent 4D Scene Flow of Dynamic Scenes, 19 Jul 2019
  at BMVA Meeting on Geometry and Deep Learning, London, UK

  Understanding real-world scenes for human-like machine perception, 03 Jul 2019
  at EPSRC Network+ Human-Like Computing Machine Intelligence Workshop (MI21-HLC), Windsor, UK

  4D Vision for Dynamic Scene Understanding, 24 May 2019
  at Agency for Science, Technology and Research (A*STAR), Singapore

  4D Vision for Dynamic Scene Understanding, 30 November 2018
  at Sixth ASLLA Symposium on Human Understanding through AI: Days of Future Present, by KIST, South Korea

  General Dynamic Scene Understanding, 12 November 2018
  at University of Portsmouth, UK

  4D Vision for Creative Industries, 07 November 2018
  at How can the creative industries use AI?, by Machine Intelligence Garage, Digital Catapult, UK

  Semantic Reconstruction of Dynamic Scenes, 26 February 2018
  at C Design Lab, Purdue University, Indiana, US

  General Dynamic Scene Reconstruction, 04 October 2017
  at Microsoft Research, Cambridge, UK

  General Dynamic Scene Reconstruction, 22 August 2017
  at Adobe Research, San Jose, US

  Semantic Scene Reconstruction, 21 June 2017
  at BMVA Meeting on Dynamic Scene Reconstruction, London, UK

  General 4D Scene Reconstruction, 25 April 2016
  at Symposium on Computer Vision and Video Effects Generation, by Rank Prize Funds, UK


CV

Experience

  Royal Academy of Engineering Research Fellow: Aug 2018 - Present
  Centre for Vision, Speech and Signal Processing, University of Surrey, UK

  Lecturer in Computer Vision and Artificial Intelligence: Oct 2021 - Present
  Centre for Vision, Speech and Signal Processing, University of Surrey, UK

  Visiting Researcher: Jan 2020 - Nov 2020, Host: Prof. Lourdes Agapito
  University College London (UCL), UK

  Visiting Researcher: May 2019 - Oct 2019, Host: Dr. Vijay Chandrasekhar
  Agency for Science, Technology and Research (A*STAR), Singapore

  Senior Research Fellow: Aug 2018 - Sep 2021
  Centre for Vision, Speech and Signal Processing, University of Surrey, UK

  Research Fellow: Oct 2016 - Jul 2018
  Centre for Vision, Speech and Signal Processing, University of Surrey, UK

  Lead Engineer: Apr 2012 - May 2013
  Samsung Research Institute India, Bangalore, India

  Senior Software Engineer: Aug 2010 - Mar 2012
  Samsung Research Institute India, Bangalore, India

Education and Certificates

  Graduate Certificate of Learning and Teaching: Jan 2018 - Sep 2019
  Higher Education Academy, University of Surrey, UK

  Oxford Artificial Intelligence Programme: May 2019 - Jul 2019
  Saïd Business School, University of Oxford, UK

  PhD: Jul 2013 - Dec 2016
  Centre for Vision, Speech and Signal Processing, University of Surrey, UK
  Thesis: General Dynamic 4D Scene Reconstruction from Multi-view Videos
  Advisor: Prof. Adrian Hilton

  M.Tech: Aug 2008 - Jun 2010
  Computer Vision Lab, Indian Institute of Technology, Kanpur, India
  Thesis: Finger Gesture Recognition in Dynamic Environment under Varying Illumination
  Advisor: Prof. K.S. Venkatesh

  B.E.: Aug 2004 - May 2008
  Institute of Engineering and Technology, Indore, India


For a detailed CV, please contact me.