BMEKOALM702_EN: Subject Datasheet

	Budapest University of Technology and Economics
	Faculty of Transportation Engineering and Vehicle Engineering

1. Subject name	Machine vision
2. Subject name in Hungarian	Gépi látás
3. Code	BMEKOALM702	4. Evaluation type	mid-term grade	5. Credits	4
6. Weekly contact hours	2 (28) Lecture	0 (0) Practice	2 (28) Lab
7. Curriculum	Autonomous Vehicle Control Engineering MSc (A)	8. Role	Mandatory (mc) at Autonomous Vehicle Control Engineering MSc (A)
9. Working hours for fulfilling the requirements of the subject					120
Contact hours	56	Preparation for seminars	16	Homework	20
Reading written materials	18	Midterm preparation	10	Exam preparation	0
10. Department	Department of Material Handling and Logistics Systems
11. Responsible lecturer	Dr. Szirányi Tamás
12. Lecturers	Dr. Szirányi Tamás, Rózsa Zoltán
13. Prerequisites
14. Description of lectures
Machine vision is one of the most important tools of intelligent road transport. It makes it possible to track complex movements and traffic participants, and to continuously analyze situations and locations. Processing and semantic evaluation of the camera video stream provides basic information for autonomous driving. The subject covers capturing, analyzing and interpreting visual information: extracting high-level image descriptors from lower-level visual characteristics.
- Machine vision in the society of autonomous robots (e.g. autonomous driving): technology, devices, system requirements, software tools and environment; overview of the main tasks and the related mathematical and algorithmic background; summary of the basic image processing methods used in the later topics.
- Shape representation and description (regions, active contours, shape description, region decomposition, superpixels); definitions of shapes in 2D, in 3D and in 3D point clouds.
- Scale-space axioms of image understanding (Lindeberg's edge/ridge definition: multiscale segmentation and skeletonization, SIFT and similar feature detectors, anisotropic diffusion, RANSAC fitting).
- Energy-optimization-based image analysis (Markov Random Field, simulated annealing, region segmentation) for remote sensing and change detection; MRF as preprocessing in motion segmentation and as an active layer in deep convolutional neural networks.
- Deconvolution: Wiener filter, iteration-based deconvolution, Bayesian-based Lucy-Richardson blind deconvolution, super-resolution.
- Video processing and analysis; background/foreground/shadow segmentation (mixture of Gaussians models, shadow models, foreground fitting); motion analysis (optical flow, interest point detection and tracking, video tracking).
- Pattern recognition in 2D and 3D (statistical, neural and syntactic pattern recognition, graph-based comparison); Principal Component Analysis; kernel methods.
- Biometric personal identification for human-computer interaction: face, hand, finger and gesture recognition; camera-based eye tracking and saliency definitions; a brief look at attention detection.
- Image and video features; generating and using annotated data sets: training, test and validation sets. Content-based image and video analysis, indexing and retrieval; the curse of dimensionality.
- Reconstruction of the scanned environment from monocular and multiple-view vision; image-based Simultaneous Localization and Mapping (I-SLAM) for localization in automated driving.
- Multimodal/multiview fusion: fusion of sensors and cameras of different positions and spectra (optical, infrared and depth cameras). Motion tracking in multiple views; traffic surveillance and control from street cameras and on-board moving devices.
- Hidden Markov Models: speech- and motion-based recognition; pedestrian and vehicle detection and tracking; event detection: behaviour of the surrounding pedestrians and vehicles.
- Deep learning structures for image-based driving assistance: recurrent neural networks; ways to make neural networks generalize better; combining multiple neural networks to improve generalization; learning issues.
- Novel pattern recognition structures: Convolutional Neural Networks, Hopfield nets, Boltzmann machines, Deep Neural Networks with generative pre-training. Modeling hierarchical structures with neural nets. Examples: pedestrian detection and vehicle analysis.
- Demonstration of the participants' project development during the semester.
15. Description of practices

16. Description of laboratory practices
Computer exercises; MATLAB programming
17. Learning outcomes
A. Knowledge
- Knows advanced image processing algorithms.
- Knows three-dimensional shape recognition methods.
- Is familiar with environment reconstruction technologies.
- Is familiar with modern, neural-network-based approaches to image processing.
B. Skills
- Is able to design image object and shape recognition algorithms.
- Understands the architectural issues of a machine vision system.
- Is able to select a suitable tool and algorithm for a given task.
C. Attitudes
- Is open to learning about modern vision systems.
- Is open to the automated use of machine vision in vehicle control.
D. Autonomy and Responsibility
- Can participate in image processing projects independently or in a team.
- Is able to design a vision system that meets the given task and safety requirements.
18. Requirements, way to determine a grade (obtain a signature)
Requirements: continuous completion of the lab tasks, two successful midterm tests and an accepted individual homework. The final grade is the average of the two midterm test grades.
19. Opportunity for repeat/retake and delayed completion
One midterm test can be retaken; the homework can be submitted late.
20. Learning materials
Lecture Notes
Effective date	10 October 2019	This Subject Datasheet is valid for		Inactive courses