In Deep Learning in Action: Image and Video Processing for Practical Use, the journey begins with a comprehensive exploration of artificial intelligence and its evolution, laying the groundwork with fundamental concepts in deep learning. The power of deep learning unfolds through an in-depth examination of neural networks, particularly convolutional neural networks (CNNs), which are pivotal for tasks such as object detection and classification. Autoencoders and their role in computer vision applications, alongside the integration of deep learning into embedded systems, underscore its transformative potential in real-world scenarios. Understanding the nuances of image and video data, from formats to processing methods, sets the stage for practical applications across diverse industries, bridging theoretical concepts with pragmatic solutions. Each chapter unfolds with a strategic focus: from enhancing safety measures through AI-powered video analysis, as in fire detection and social-distancing enforcement, to leveraging deep learning for tasks as nuanced as fingerprint image restoration and medical image analysis.

Convolutional neural network.
A CNN is a type of DL model designed for image processing and recognition tasks. At its core, a CNN employs convolutional layers to learn spatial hierarchies of features automatically and adaptively from input images (Yamashita et al., 2018). This hierarchical learning is crucial for capturing complex patterns and representations present in visual data. Unlike traditional neural networks, CNNs use convolutional filters to scan input images, allowing them to identify local patterns such as edges, textures, colors, and shapes. These filters are applied across the entire image, enabling the network to recognize these patterns irrespective of their location. Additionally, CNNs often include pooling layers to reduce spatial dimensions and computational complexity. The combination of convolutional and pooling layers makes CNNs robust to variations in scale, orientation, and position of the input features. The learned features are flattened and passed through fully connected layers to make predictions. CNNs have demonstrated remarkable success in various computer vision tasks, including image classification, object detection, and facial recognition. Their ability to automatically learn hierarchical representations makes them highly effective in extracting meaningful features from visual data, contributing to their widespread use in diverse applications across industries such as healthcare, autonomous vehicles, and security systems. As a powerful tool in the realm of DL, CNNs continue to drive advancements in image understanding and recognition.
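The scanning of an image by a convolutional filter described above can be sketched in a few lines of NumPy. This is an illustrative example, not code from the book: the 3×3 vertical-edge filter and the toy image are assumptions chosen to show how a local pattern is detected wherever it occurs.

```python
import numpy as np

def conv2d(image, kernel):
    """Slide the kernel over the image (valid padding, stride 1)."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Elementwise product of the filter with one local patch.
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# A simple vertical-edge filter: responds where intensity drops left to right.
edge_filter = np.array([[1, 0, -1],
                        [1, 0, -1],
                        [1, 0, -1]])

# Toy 6x6 grayscale image: bright left half, dark right half -> one vertical edge.
image = np.zeros((6, 6))
image[:, :3] = 1.0

fmap = conv2d(image, edge_filter)
print(fmap)  # strong responses only in the columns straddling the edge
```

Because the same filter is applied at every position, it fires along the edge no matter where the edge sits in the image, which is the location-independence of pattern detection noted above.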
CNNs represent a groundbreaking advancement in the field of DL, particularly tailored for tasks involving image analysis. The architecture of CNNs draws inspiration from the human visual system, implementing a sophisticated hierarchy of layers to automatically learn and discern intricate patterns within images. The core components of a CNN include convolutional layers, pooling layers, and fully connected layers. Convolutional layers employ filters that traverse input images, enabling the network to capture spatial hierarchies of features such as edges, textures, and complex patterns. This ability to extract meaningful features from raw pixel data distinguishes CNNs from traditional neural networks, making them exceptionally proficient in image-related tasks.
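The layer order described above (convolution, pooling, flatten, fully connected) can be traced end to end in a minimal NumPy forward pass. This is a sketch with random, untrained weights, assumed shapes, and a single filter; it is not the book's implementation, only a walk through the data flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv2d(image, kernel):
    # Valid convolution, stride 1.
    kh, kw = kernel.shape
    return np.array([[np.sum(image[i:i+kh, j:j+kw] * kernel)
                      for j in range(image.shape[1] - kw + 1)]
                     for i in range(image.shape[0] - kh + 1)])

def max_pool(x, size=2):
    # Keep the maximum of each non-overlapping size x size window.
    h, w = x.shape
    return x[:h - h % size, :w - w % size] \
        .reshape(h // size, size, w // size, size).max(axis=(1, 3))

# Forward pass: convolution -> ReLU -> pooling -> flatten -> fully connected.
image = rng.random((8, 8))                    # toy grayscale input
kernel = rng.standard_normal((3, 3))          # one filter (random, untrained)
fmap = np.maximum(conv2d(image, kernel), 0)   # conv + ReLU: (6, 6)
pooled = max_pool(fmap)                       # spatial downsampling: (3, 3)
features = pooled.flatten()                   # flatten: (9,)
W = rng.standard_normal((9, 2))               # fully connected layer, 2 classes
logits = features @ W
probs = np.exp(logits) / np.exp(logits).sum() # softmax over class scores
print(pooled.shape, features.shape, probs)
```

Note how pooling shrinks the 6×6 feature map to 3×3, reducing the spatial dimensions and computational cost exactly as the paragraph describes, before the flattened features reach the classifier.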
Contents.
Foreword.
Preface.
Acknowledgments.
Nomenclature.
CHAPTER 1 Introduction.
1.1 Overview.
1.1.1 Definition and evolution of artificial intelligence.
1.1.2 Machine learning fundamentals.
1.1.3 Machine learning types.
1.2 The power of deep learning.
1.2.1 Deep learning fundamentals.
1.2.2 Deep neural network.
1.2.3 Convolutional neural network.
1.2.4 Overview of object detection, classification, and segmentation.
1.2.5 Autoencoders.
1.2.6 Computer vision applications.
1.2.7 Deep learning for embedded systems.
1.3 Understanding image and video data.
1.3.1 Overview of the importance of image and video data in various real-world applications.
1.3.2 The evolution of visual processing.
1.3.3 Differentiating between image and video data.
1.3.4 The common image formats and their properties.
1.4 Importance of real-world applications.
1.4.1 Bridging the gap between theory and practice.
1.4.2 Solving real-world challenges.
1.4.3 Industry relevance and innovation.
1.5 Book outline.
References.
CHAPTER 2 Image analysis for surveillance: detecting fire and smoke incidents.
2.1 State-of-the-art video-based fire/smoke detection.
2.1.1 Extraction of the region proposals.
2.1.2 Run-time object detection.
2.1.3 Experiment setup and results.
2.1.4 Size of detected regions.
2.1.5 Testing the R-CNN with Raspberry Pi.
2.2 Real-time fire and smoke detection.
2.2.1 Methodology and results.
2.2.2 Executing the prototype on NVIDIA Jetson Nano.
References.
CHAPTER 3 Enhancing COVID-19 safety measures with AI-powered video analysis.
3.1 Introduction.
3.2 Related work.
3.3 Social distancing using YOLOv2.
3.3.1 Social distancing workflow.
3.3.2 YOLOv2 architecture.
3.3.3 Experimental setup.
3.3.4 Euclidean formula for measuring the distance.
3.3.5 Results and discussion.
3.4 Social distancing with YOLOv4-tiny.
3.4.1 The proposed approach.
3.4.2 Violation threshold.
3.4.3 Bird’s-eye view transformation.
3.4.4 Experiment setup and results.
3.5 Algorithm implementation on the embedded system.
3.5.1 Social distancing on NVIDIA devices.
3.5.2 Distributed video infrastructure for social distancing.
3.6 Integrated approach for monitoring social distancing, face mask, and facial temperature measurement.
3.6.1 Dataset and annotation of face masks.
3.6.2 Dataset and annotation of facial temperature.
3.6.3 Experimental setup and results.
3.6.4 Implementation of the integrated algorithms on NVIDIA platforms.
References.
CHAPTER 4 Deep learning approaches for fingerprint image restoration.
4.1 Background on biometric identification.
4.1.1 Deep learning related work for fingerprint.
4.2 Feature extraction.
4.3 Fingerprint dataset.
4.3.1 Dataset I.
4.3.2 Dataset II.
4.3.3 Dataset III.
4.3.4 Dataset IV.
4.4 Sparse autoencoder for image reconstruction.
4.4.1 Sparse autoencoder model.
4.4.2 Preprocessing the image.
4.4.3 Algorithm description.
4.4.4 Experiment setup.
4.4.5 Efficiency and parameter sensitivity.
4.5 Recreating fingerprint images by convolutional neural network.
4.5.1 Convolutional neural network for image reconstruction.
4.5.2 Convolutional neural network algorithm design.
4.5.3 Training and validation.
4.6 Experiment results and discussion.
4.6.1 Evaluation.
4.6.2 Comparative analysis.
References.
CHAPTER 5 Deep learning for classification and localization of multiple abnormalities on chest X-ray images.
5.1 Overview of diagnosis on medical images.
5.2 Literature review.
5.3 Computer vision for medical image processing.
5.3.1 Disease detection and diagnosis.
5.3.2 Anomaly identification.
5.3.3 Automated radiological measurements.
5.3.4 Image enhancement and reconstruction.
5.3.5 Workflow optimization.
5.3.6 Convolutional neural network for feature extraction for medical images.
5.3.7 Role of deep learning in medical imaging.
5.4 Dataset description.
5.4.1 COVID-19 radiography database.
5.4.2 SIIM-FISABIO-RSNA COVID-19 dataset.
5.5 Method.
5.5.1 Multiclassification of abnormalities on chest X-ray images.
5.5.2 Localization of abnormalities on chest X-ray images.
5.5.3 Ensembled models for enhancing localization of abnormalities.
References.
CHAPTER 6 Real-time stroke detection based on deep learning and federated learning.
6.1 Background.
6.1.1 Stroke as a critical health issue.
6.1.2 Current challenges in stroke detection.
6.1.3 The need for real-time detection.
6.2 Federated learning for healthcare.
6.2.1 Understanding federated learning.
6.2.2 Privacy and security concerns in federated learning.
6.2.3 Benefits and challenges.
6.3 Real-time stroke detection system.
6.3.1 Design and architecture.
6.3.2 Data preprocessing and augmentation.
6.3.3 Distributed model and federated learning.
6.3.4 Training setup and optimization.
6.3.5 Experiment results and discussion.
6.4 Implementation on NVIDIA platforms.
6.4.1 The importance of NVIDIA GPUs for real-time inference.
6.4.2 NVIDIA computation cost analysis.
References.
CHAPTER 7 Efficient identification of bag-breakup in continuous airflow via video analysis.
7.1 Overview of bag break detection.
7.1.1 Bag-breakup in continuous airflow.
7.1.2 Factors affecting bag-breakup.
7.1.3 Importance of bag-breakup detection in automotive safety.
7.1.4 Challenges in identifying bag-breakup.
7.2 Methodology.
7.2.1 Dataset collection and description.
7.2.2 Data preprocessing.
7.2.3 The proposed deep-learning models.
7.2.4 Training and validation.
7.3 Experimental results.
7.3.1 Evaluation metrics.
7.3.2 Comparative analysis.
7.3.3 Error analysis and misdetection cases.
References.
CHAPTER 8 Conclusions and recommendations.
Glossary.
Index.
Deep Learning in Action: Image and Video Processing for Practical Use, Elhanashi A., Saponara S., 2025.








