Detailed Information

Cited 0 times in Web of Science; cited 5 times in Scopus

DeepView: Deep-Learning-Based Users Field of View Selection in 360° Videos for Industrial Environments

Authors
Irfan, Muhammad; Muhammad, Khan; Sajjad, Muhammad; Malik, Khalid Mahmood; Cheikh, Faouzi Alaya; Rodrigues, Joel J. P. C.; de Albuquerque, Victor Hugo C.
Issue Date
15-Feb-2023
Publisher
Institute of Electrical and Electronics Engineers (IEEE)
Keywords
Videos; Cameras; Visualization; Sports; Bandwidth; Feature extraction; Object detection; Augmented reality (AR) industry; deep learning; immersive videos; industry 4.0; IoT; saliency; view selection; virtual reality (VR)
Citation
IEEE INTERNET OF THINGS JOURNAL, v.10, no.4, pp.2903 - 2912
Indexed
SCIE
SCOPUS
Journal Title
IEEE INTERNET OF THINGS JOURNAL
Volume
10
Number
4
Start Page
2903
End Page
2912
URI
https://scholarx.skku.edu/handle/2021.sw.skku/102964
DOI
10.1109/JIOT.2021.3118003
ISSN
2327-4662
Abstract
The industrial demand for immersive videos in virtual reality/augmented reality applications is growing rapidly, as such video streams let the user view objects of interest with the illusion of "being there." However, in Industry 4.0, streaming such huge videos over the network consumes a tremendous amount of bandwidth, even though users are interested only in specific regions of the immersive video. Furthermore, to deliver fully engaging video while minimizing bandwidth consumption, the user's Region of Interest in a 360° video must be selected automatically, which is very challenging because of subjectivity and differences in viewer preference. To tackle these challenges, we employ two efficient convolutional neural networks for salient object detection and memorability computation in a unified framework to find the most prominent portion of a 360° video. The proposed system has four stages: 1) preprocessing; 2) intelligent visual interest prediction; 3) final viewport selection; and 4) virtual camera steering. First, an input 360° video frame is split into three Fields of View (FoVs), each with a viewing angle of 120°. Next, each FoV is passed to the object detection and memorability prediction models to compute its visual interestingness. The FoV containing the most salient and memorable objects is then supplied as the viewport. Finally, a virtual camera steerer is designed using enriched salient features from YOLO and an LSTM, which are forwarded to dense optical flow to follow the salient object inside the immersive video. Performance evaluation of the proposed system on our own data collected from various websites, as well as on public datasets, indicates its effectiveness for diverse categories of 360° videos and its ability to minimize bandwidth usage, making it suitable for Industry 4.0 applications.
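The preprocessing step described in the abstract, splitting a 360° frame into three 120° FoVs, can be sketched as follows. This is a minimal illustration only: the helper name and the assumption of an equal-width column split over an equirectangular frame are ours, not the authors' implementation.

```python
def fov_column_ranges(frame_width, num_fovs=3):
    """Return (start, end) pixel-column ranges for horizontal FoVs.

    An equirectangular 360-degree frame spans 360 degrees of longitude
    across its width, so num_fovs=3 yields three 120-degree views, as
    described in the abstract. (Hypothetical helper, not the paper's code.)
    """
    step = frame_width // num_fovs
    return [(i * step, (i + 1) * step) for i in range(num_fovs)]

# Example: a 1920-pixel-wide frame -> three 640-pixel-wide FoVs
ranges = fov_column_ranges(1920)
```

Each range can then be used to slice the frame before passing the resulting view to the saliency and memorability models.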
Files in This Item
There are no files associated with this item.
Appears in
Collections
Computing and Informatics > Convergence > 1. Journal Articles

Items in ScholarWorks are protected by copyright, with all rights reserved, unless otherwise indicated.

Related Researcher


MUHAMMAD, KHAN
Computing and Informatics (Convergence)
