Deep Learning-Based Mistake Detection in Assembly Tasks



M.Sc. Zeyun Zhong


Available

Possible start:

immediately


Despite rapid technological progress, human operators still play a significant role in today's industrial workplaces and strongly influence the quality of assembly and maintenance tasks. These tasks are inherently complex and error-prone; mistakes affect the quality of the final product and can lead to safety issues. For instance, a tired operator is more likely to assemble parts incorrectly. Early detection of such mistakes is crucial: errors need to be identified before they escalate and lead to chaotic or dangerous situations in the assembly or maintenance process.


This thesis explores the application of predictive analytics in manufacturing, focusing on machine learning techniques that predict upcoming assembly actions and monitor their execution in real time. By learning from historical data to predict the correct next actions and comparing these predictions against the actions actually performed, the thesis aims to develop a proactive fault detection mechanism that raises an alert as soon as a deviation occurs, so that errors can be corrected promptly, thereby strengthening quality control in the manufacturing industry.
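
To make the idea of predicting the next correct action from historical data concrete, here is a minimal sketch. It uses a simple first-order transition count as a stand-in for the deep learning model the thesis would actually develop; the action names and the helper functions `fit_transitions` and `predict_next` are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(sequences):
    """Count how often each action follows another in correct recordings."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return counts


def predict_next(counts, prev):
    """Return the most frequent successor of `prev`, or None if unseen."""
    successors = counts.get(prev)
    if not successors:
        return None
    return successors.most_common(1)[0][0]


# Hypothetical recordings of correctly executed assemblies.
history = [
    ["pick_base", "attach_wheel", "tighten_screw", "attach_cabin"],
    ["pick_base", "attach_wheel", "tighten_screw", "attach_cabin"],
    ["pick_base", "attach_cabin", "attach_wheel", "tighten_screw"],
]

model = fit_transitions(history)
next_after_base = predict_next(model, "pick_base")
next_after_unknown = predict_next(model, "polish_part")
```

In a real system, a sequence model trained on video features would likely replace the transition counts, but the interface (past actions in, expected next action out) would stay similar.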



  • Conduct a comprehensive review of existing research on mistake detection.
  • Design and train machine learning models to predict the sequence of correct assembly actions based on historical data.
  • Implement discrepancy detection algorithms that identify deviations from predicted correct actions and flag these as errors.
  • Validate the model's performance using standard metrics such as accuracy, precision, recall, and F1-score.
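
The discrepancy detection and validation steps above could be sketched as follows. This is an illustrative example under stated assumptions, not the method prescribed by the thesis: each step is assumed to come with a set of acceptable actions (for example, produced by a prediction model), and the function names `flag_mistakes` and `evaluate`, the action names, and the ground-truth labels are all hypothetical.

```python
def flag_mistakes(observed, allowed):
    """Flag step i as a mistake if the observed action is not in allowed[i]."""
    return [action not in ok for action, ok in zip(observed, allowed)]


def evaluate(predicted, actual):
    """Compute accuracy, precision, recall, and F1 for boolean mistake flags."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    accuracy = sum(p == a for p, a in pairs) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical observed actions and the actions considered correct per step.
observed = ["pick_base", "tighten_screw", "attach_wheel", "attach_cabin"]
allowed = [{"pick_base"}, {"attach_wheel"},
           {"tighten_screw", "attach_wheel"}, {"attach_cabin"}]

flags = flag_mistakes(observed, allowed)

# Hypothetical ground-truth annotation: steps 2 and 3 are mistakes.
ground_truth = [False, True, True, False]
metrics = evaluate(flags, ground_truth)
```

Because mistakes are typically much rarer than correct steps, precision, recall, and F1 on the mistake class tend to be more informative than accuracy alone.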



[1] Sener, F., Chatterjee, D., Shelepov, D., He, K., Singhania, D., Wang, R., & Yao, A. (2022). Assembly101: A large-scale multi-view video dataset for understanding procedural activities. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 21096-21106).

[2] Zheng, H., Lee, R., & Lu, Y. (2024). HA-ViD: A Human Assembly Video Dataset for Comprehensive Assembly Knowledge Understanding. Advances in Neural Information Processing Systems, 36.



  • Field of study: computer science, mathematics, electrical engineering, applied physics, or mechatronics, with good programming skills
  • Willingness to familiarize yourself with new topics and enthusiasm for contributing your own ideas
  • Good English or German speaking and writing skills, ability to work independently, and strong analytical skills
  • Good understanding of deep learning basics and experience with DL projects


We offer

  • Intensive support and a pleasant working atmosphere in a creative team of motivated scientists
  • Possibility of a subsequent job as a research assistant in order to further deepen the knowledge acquired
  • Development of joint publications



Please send an email with your transcript of records.