Moving object detection is one of the challenging problems in video monitoring systems, especially when the illumination changes and shadows are present. A method for real-time moving object detection is described. A new background model is proposed to handle the illumination variation problem. With optical flow technology and background subtraction, a moving object is extracted quickly and accurately. An effective shadow elimination algorithm based on color features is used to refine the moving objects. Experimental results demonstrate that the proposed method can update the background exactly and quickly as the illumination varies, and that the shadow can be eliminated effectively. The proposed algorithm runs in real time and lays the foundation for further object recognition and understanding in video monitoring systems.
Automatic video mosaicking is a challenging task in computer vision. Current research considers either panoramic or mapping tasks on short videos. In this paper, an automatic mosaicking algorithm is proposed for both mapping and panoramic tasks, based on adapted key-frames, for videos of any length. The speeded up robust features (SURF) and the grid motion statistic (GMS) algorithms are used for feature extraction and matching between consecutive frames, and the matches are used to compute the transformation. To reduce the influence of the accumulated error during image stitching, an evaluation metric is put forward for the transformation matrix. In addition, a self-growth method is employed to stitch the global image for long videos. The algorithm is evaluated on aerial-view and panoramic videos on a graphics processing unit (GPU) device, where it satisfies the real-time requirement. The experimental results demonstrate that the proposed algorithm achieves better performance than the state of the art.
Color inconsistency between views is an important problem to be solved in multi-view video systems. A multi-view video color correction method using dynamic programming is proposed. Three-dimensional histograms are constructed with sequential conditional probability in HSI color space. Then, dynamic programming is used to seek the best color mapping relation, defined by the minimum-cost path between the target image histogram and the source image histogram. Finally, a video tracking technique is applied to correct the multi-view video. Experimental results show that the proposed method obtains better subjective and objective performance in color correction.
Vestibulo-ocular reflex (VOR) is an important biological reflex that controls eye movement to ensure clear vision while the head is in motion. Nowadays, VOR measurement is commonly done with a video head impulse test based on a velocity gain algorithm or a position gain algorithm: velocity gain is calculated from head and eye velocity, whereas position gain is calculated from head and eye position. The aim of this work is first to compare the performance of the two algorithms and to detect covert catch-up saccades, and then to propose a stand-alone recommendation application for patient diagnosis. In the first experiment, for ipsilesional and contralesional sides, the calculated position gain (0.94±0.17) is higher than the velocity gain (0.84±0.19). Moreover, the gain asymmetry between lesion and intact sides is mostly higher with velocity gain than with position gain (four out of five subjects). Consequently, for subjects who have unilateral vestibular neuritis diagnosed from clinical symptoms and a vestibular function test, vestibular weakness is depicted much better by velocity gain than by position gain. Covert catch-up saccades and position gain are then used as inputs for the recommendation application.
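The difference between the two gain measures can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so the definitions below are assumptions: velocity gain is taken as the ratio of eye speed to head speed at the sample of peak head speed, position gain as the ratio of total eye displacement to total head displacement, and gain asymmetry as a normalized percentage difference. All function names and the sample traces are hypothetical.

```python
def finite_difference(samples, dt):
    """Approximate velocity from uniformly sampled positions."""
    return [(b - a) / dt for a, b in zip(samples, samples[1:])]


def velocity_gain(head_pos, eye_pos, dt):
    """Assumed definition: eye speed / head speed at peak head speed."""
    head_vel = finite_difference(head_pos, dt)
    eye_vel = finite_difference(eye_pos, dt)
    peak = max(range(len(head_vel)), key=lambda i: abs(head_vel[i]))
    return abs(eye_vel[peak]) / abs(head_vel[peak])


def position_gain(head_pos, eye_pos):
    """Assumed definition: total eye displacement / total head displacement."""
    return abs(eye_pos[-1] - eye_pos[0]) / abs(head_pos[-1] - head_pos[0])


def gain_asymmetry(gain_a, gain_b):
    """Assumed definition: normalized difference between two sides, in percent."""
    return abs(gain_a - gain_b) / (gain_a + gain_b) * 100.0


# Hypothetical head impulse: positions in degrees, sampled every 10 ms.
# The eye rotates opposite to the head, hence the negative values.
head = [0.0, 1.0, 3.0, 6.0, 8.0, 9.0]
eye = [0.0, -0.8, -2.4, -4.8, -6.8, -8.1]

v_gain = velocity_gain(head, eye, dt=0.01)
p_gain = position_gain(head, eye)
```

In this made-up trace the eye lags at peak head speed but catches up later, so the position gain comes out higher than the velocity gain, which is the same direction of difference the abstract reports.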
The multiplicative multifractal process can model video traffic well. The multiplier distributions in the multiplicative multifractal model for video traffic are investigated, and it is found that the Gaussian distribution is not suitable for describing the multipliers on small time scales. A new statistical distribution, the symmetric Pareto distribution, is introduced and applied instead of the Gaussian for the multipliers on those scales. Based on that, the algorithm is updated so that the symmetric Pareto distribution and the Gaussian distribution are both used to model video traffic, but on different time scales. The simulation results demonstrate that the updated algorithm models video traffic more accurately.
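A multiplicative cascade with scale-dependent multiplier distributions can be sketched as follows in Python. This is only an illustration under stated assumptions: the cascade is a conservative binary one, the coarse stages use a truncated Gaussian multiplier, and the fine stages use a heavy-tailed stand-in that is symmetric about 0.5. The stand-in is not the paper's symmetric Pareto distribution, whose exact form the abstract does not give; all names and parameter values are hypothetical.

```python
import random


def gaussian_multiplier(rng, sigma=0.1):
    """Truncated Gaussian centred on 0.5, resampled until it lies in (0, 1)."""
    while True:
        r = rng.gauss(0.5, sigma)
        if 0.0 < r < 1.0:
            return r


def heavy_tailed_multiplier(rng, shape=3.0, scale=0.05):
    """Hypothetical stand-in: Pareto-like deviation from 0.5 with a random sign."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    deviation = min(0.49, scale * (u ** (-1.0 / shape) - 1.0))
    return 0.5 + rng.choice((-1.0, 1.0)) * deviation


def cascade(total, stages, switch_stage, seed=0):
    """Split `total` over 2**stages intervals, conserving mass at every stage.

    Stages below `switch_stage` (large time scales) use the Gaussian
    multiplier; later stages (small time scales) use the heavy-tailed one.
    """
    rng = random.Random(seed)
    masses = [float(total)]
    for stage in range(stages):
        sampler = gaussian_multiplier if stage < switch_stage else heavy_tailed_multiplier
        next_masses = []
        for mass in masses:
            r = sampler(rng)
            next_masses.extend((mass * r, mass * (1.0 - r)))
        masses = next_masses
    return masses


# 256 synthetic traffic samples whose sum equals the original total.
traffic = cascade(total=1000.0, stages=8, switch_stage=5)
```

Because each interval is split into fractions r and 1 - r, the total traffic volume is preserved while the burstiness at each scale is controlled by the multiplier distribution chosen for that stage.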
A real-time pedestrian detection and tracking system using a single video camera was developed to monitor pedestrians. The system contained six modules: video flow capture, pre-processing, movement detection, shadow removal, tracking, and object classification. The Gaussian mixture model was utilized to extract the moving object from an image sequence segmented by the mean-shift technique in the pre-processing module. Shadow removal was used to alleviate the negative impact of shadows on the detected objects. A model-free method was adopted to identify pedestrians. The maximum and minimum integration methods were developed to integrate multiple cues into the mean-shift algorithm and the initial tracking iteration with the competent integrated probability distribution map for object tracking. A simple but effective algorithm was proposed to handle full occlusion cases. The system was tested using real traffic videos from different sites. The test results confirm that the system is reliable and has an overall accuracy of over 85%.
Variable block-size motion estimation (ME) and disparity estimation (DE) are adopted in multi-view video coding (MVC) to achieve high coding efficiency. However, they also introduce much higher computational complexity into the coding system, which hinders the practical application of MVC. An efficient fast mode decision method using mode complexity is proposed to reduce the computational complexity. In the proposed method, mode complexity is first computed using the spatial, temporal and inter-view correlation between the current macroblock (MB) and its neighboring MBs. Based on the observation that direct mode is highly likely to be the optimal mode, the mode complexity is always checked in advance against a predefined threshold to provide an efficient early termination opportunity. If this early termination condition is not met, the MBs are classified into three mode types according to the value of mode complexity, i.e., simple mode, medium mode and complex mode, to speed up the encoding process by reducing the number of variable block modes that must be checked. Furthermore, for simple and medium mode regions, the rate distortion (RD) cost of mode 16×16 in the temporal prediction direction is compared with that in the disparity prediction direction to determine in advance whether the optimal prediction direction is the temporal one, so that unnecessary disparity estimation can be skipped. Experimental results show that the proposed method significantly reduces the computational load by 78.79% and the total bit rate by 0.07% on average, while incurring only a negligible loss of PSNR (about 0.04 dB on average), compared with the full mode decision (FMD) in the reference software of MVC.
Video processing is one challenge in collecting vehicle trajectories from unmanned aerial vehicles (UAVs), and road boundary estimation is one way to improve the video processing algorithms. However, current methods do not work well for low-volume roads, which are not well marked and are affected by noise such as vehicle tracks. A fusion-based method termed Dempster-Shafer-based road detection (DSRD) is proposed to address this issue. This method detects the road boundary by combining multiple information sources using Dempster-Shafer theory (DST). To test the performance of the proposed method, two field experiments were conducted, one on a highway partially covered by snow and the other on a highway with dense traffic. The results show that DSRD is robust and accurate, with detection rates of 100% and 99.8% compared with manual detection results. DSRD is then adopted to improve the UAV video processing algorithm, and the vehicle detection and tracking rates are improved by 2.7% and 5.5%, respectively. In addition, the computation time decreased by 5% and 8.3% for the two experiments, respectively.
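The fusion step relies on Dempster's rule of combination, which can be sketched in a few lines of Python. The rule itself is standard; the two cues ("color" and "texture"), the hypothesis labels and the mass values below are hypothetical, since the abstract does not list the information sources DSRD actually combines.

```python
def combine(m1, m2):
    """Dempster's rule: fuse two mass functions keyed by frozenset hypotheses."""
    combined = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            intersection = a & b
            if intersection:
                combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
            else:
                # Mass assigned to contradictory hypotheses.
                conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {h: v / (1.0 - conflict) for h, v in combined.items()}


ROAD = frozenset({"road"})
OFF_ROAD = frozenset({"off_road"})
EITHER = ROAD | OFF_ROAD  # ignorance: the cue cannot decide

# Hypothetical per-pixel evidence from two cues.
color_cue = {ROAD: 0.6, OFF_ROAD: 0.1, EITHER: 0.3}
texture_cue = {ROAD: 0.5, OFF_ROAD: 0.2, EITHER: 0.3}

fused = combine(color_cue, texture_cue)
```

Two cues that each lean moderately toward "road" yield a fused belief in "road" that is stronger than either cue alone, which is why evidence fusion helps where any single cue is unreliable, for example on snow-covered pavement.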
Automatic image classification is the first step toward semantic understanding of an object in computer vision. The key challenge for accurate object recognition is to extract robust features from images taken from various viewpoints and to rapidly calculate the similarity between features in an image database or video stream. To solve these problems, an effective and rapid image classification method for object recognition based on a video learning technique was presented. The optical-flow and RANSAC algorithms were used to acquire scene images from each video sequence. After the selection of scene images, local maximum points at object corners within a local area were found using the Harris corner detection algorithm, and several attributes of the local block around each feature point were calculated using the scale-invariant feature transform (SIFT) to extract local descriptors. Finally, the extracted local descriptors were learned with the three-dimensional pyramid match kernel. Experimental results show that the method can extract features from multi-viewpoint images in a query video and calculate the similarity between a query image and images in the database.
Streaming video is becoming increasingly popular among Internet multimedia applications. A robust coding scheme for DCT-based scalable video streaming over the Internet is proposed in this paper. Compared with conventional MPEG4-FGS (fine granular scalable) and progressive FGS (PFGS), the proposed method generates a base layer that includes several sub-base layers by DCT coefficient reordering and VLC reshuffling, which enables the video stream to adapt itself to long-term time variation of the channel bandwidth. Furthermore, a novel end-to-end transmission architecture for scalable video streaming over the Internet is presented, in which an adaptive unequal packet loss protection (AUPLP) strategy is proposed to determine the currently available network bandwidth and adjust the sending rates according to different situations, such as network congestion or unreliable transmission. Experimental results show that the proposed progressive scalable scheme can improve the average coding efficiency by up to 1.2 dB compared with MPEG4-FGS and PFGS at lower bandwidth, and that the AUPLP strategy can improve the transmission performance not only of the proposed scheme but also of the MPEG4-FGS and PFGS systems.
An approach to the detection of moving objects in video sequences, with application to video surveillance, is presented. The algorithm combines two kinds of change points, which are detected from the region-based frame difference and from adjusted background subtraction. An adaptive threshold technique is employed to automatically choose the threshold value used to segment the moving objects from the still background. Experimental results show that the algorithm is effective and efficient in practical situations. Furthermore, the algorithm is robust to changes in lighting conditions and can be applied in video surveillance systems.
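The idea of combining frame differencing with background subtraction under an adaptive threshold can be sketched in Python as follows. This is a simplified illustration, not the paper's algorithm: it works per pixel rather than per region, the threshold rule (mean plus k standard deviations of the difference image) is an assumption, and the two masks are combined with a logical AND, which is one plausible choice the abstract does not confirm. All names are hypothetical.

```python
from statistics import mean, pstdev


def abs_diff(frame_a, frame_b):
    """Per-pixel absolute difference of two grayscale frames (lists of rows)."""
    return [[abs(a - b) for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(frame_a, frame_b)]


def adaptive_threshold(diff, k=2.0):
    """Assumed rule: threshold = mean + k * standard deviation of the difference."""
    values = [v for row in diff for v in row]
    return mean(values) + k * pstdev(values)


def change_mask(diff, k=2.0):
    threshold = adaptive_threshold(diff, k)
    return [[v > threshold for v in row] for row in diff]


def detect_moving(previous, current, background, k=2.0):
    """Keep pixels flagged by both frame differencing and background subtraction."""
    frame_mask = change_mask(abs_diff(current, previous), k)
    background_mask = change_mask(abs_diff(current, background), k)
    return [[f and b for f, b in zip(f_row, b_row)]
            for f_row, b_row in zip(frame_mask, background_mask)]


def blank(value=10, size=4):
    return [[value] * size for _ in range(size)]


# A bright object moves from pixel (1, 1) to pixel (2, 2) over a flat background.
background = blank()
previous = blank()
previous[1][1] = 200
current = blank()
current[2][2] = 200

mask = detect_moving(previous, current, background)
```

Frame differencing alone also flags the position the object has just left, while background subtraction alone is sensitive to a stale background model; requiring agreement between the two keeps only the object's current position in this toy case.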
The alpha-stable self-similar stochastic process has proved to be an effective model for highly variable data traffic. Special issues and considerations in using the process to model aggregated VBR video traffic are examined in depth. Different methods to estimate the stability parameter α and the self-similarity parameter H are compared. Procedures to generate the linear fractional stable noise (LFSN) and alpha-stable random variables are provided. Model construction and quantitative comparisons with fractional Brownian motion (FBM) and real traffic are also examined. Open problems and future directions are discussed.
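One common way to generate alpha-stable random variables is the Chambers-Mallows-Stuck method, sketched below in Python for the symmetric case. The abstract does not say which generator the paper uses, so this choice is an assumption; the sketch covers only standard symmetric variables (skewness 0, scale 1, location 0) and does not build the LFSN. The function name is hypothetical.

```python
import math
import random


def symmetric_alpha_stable(alpha, rng):
    """Draw one standard symmetric alpha-stable variate (Chambers-Mallows-Stuck)."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)  # uniform angle
    w = rng.expovariate(1.0)                        # unit-mean exponential
    while w == 0.0:                                 # guard against a zero draw
        w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)                          # Cauchy special case
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


rng = random.Random(42)
heavy_tailed = [symmetric_alpha_stable(1.5, rng) for _ in range(1000)]
gaussian_case = [symmetric_alpha_stable(2.0, rng) for _ in range(20000)]
```

With α = 2 the formula reduces to a Gaussian with variance 2, which gives a simple sanity check; for α < 2 the variance is infinite, which is what makes the process suitable for bursty traffic.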
To combat packet loss and realize robust video transmission over the Internet and wireless networks, a new multiple description (MD) video coding method is proposed. In the method, two descriptions of each video frame are first created by group of blocks (GOB) alternation. Motion information is then duplicated in both descriptions, and a process called low quality macroblock update is designed to redundantly encode textures in each frame using standard bit stream syntax. In this way, the output bit streams are standard compliant, and better trade-offs between redundancy and single-channel reconstruction distortion are achieved. The proposed method performs much better than the well-known MD transform coding (MDTC) method, both in terms of redundancy rate distortion and in the packet loss scenario.
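The GOB alternation step can be pictured with a minimal Python sketch. It assumes that even-indexed GOBs go to one description and odd-indexed GOBs to the other, which is the simplest reading of "alternation"; the duplicated motion information and the low quality macroblock update are omitted. Function names and the string placeholders for GOB payloads are hypothetical.

```python
def split_descriptions(gobs):
    """Assign even-indexed GOBs to description 0 and odd-indexed GOBs to description 1."""
    descriptions = ([], [])
    for index, gob in enumerate(gobs):
        descriptions[index % 2].append((index, gob))
    return descriptions


def merge_descriptions(received, gob_count):
    """Rebuild a frame from whichever descriptions arrived; missing GOBs stay None."""
    frame = [None] * gob_count
    for description in received:
        for index, gob in description:
            frame[index] = gob
    return frame


gobs = ["G0", "G1", "G2", "G3", "G4", "G5"]
first, second = split_descriptions(gobs)

both_received = merge_descriptions([first, second], len(gobs))
one_received = merge_descriptions([first], len(gobs))
```

When only one description arrives, every other GOB is missing at full quality; in the method described above those gaps are what the redundantly encoded low-quality textures are meant to fill.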
Small video satellites have the unique advantages of a short development cycle, agile attitude maneuvering, and real-time video imaging. They have broad application prospects in the detection and tracking of space debris, faulty spacecraft, and other space targets. However, when a space target first enters the camera's visual field, it has a relatively large angular velocity relative to the satellite, which makes it easy for the target to leave the visual field and causes off-target problems. This paper proposes a novel visual tracking control method based on a potential function to prevent missed targets in space. First, a circular area in the image plane is designed as a mandatory restricted projection area for the target, and a visual tracking controller based on image error is designed. Then, a potential function is designed to ensure continuous and stable tracking of the target after it enters the visual field. Finally, the stability of the control is proved using Barbalat's lemma. Under identical conditions, a comparison with the simulation results of the proportional-derivative (PD) control method shows that when there is a large relative attitude angular velocity between the target and the satellite, the tracking method based on the potential function ensures that the target does not leave the field of view during the tracking control process and that the projection of the target is driven to the desired position. The proposed control method is effective in eliminating tracking error and preventing off-target simultaneously.
Funding (real-time moving object detection with a new background model): This project was supported by the foundation of the Visual and Auditory Information Processing Laboratory of Beijing University of China (0306) and the National Science Foundation of China (60374031).
Funding (automatic video mosaicking): Supported by the National Science Foundation of China (61603040, 61973036, 61433003).
Funding (multi-view video color correction): Supported by the National Natural Science Foundation of China (60672073), the Program for New Century Excellent Talents in University (NCET-06-0537), the Natural Science Foundation of Ningbo (2008A610016), and the K.C. Wong Magna Fund in Ningbo University.
Funding (VOR gain measurement): Supported by the MSIP (Ministry of Science, ICT and Future Planning), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2016-H8501-16-1019) supervised by the IITP (Institute for Information & Communications Technology Promotion), and by an Inha University Research Grant; supported by the Basic Science Research Program through the National Research Foundation (NRF) of Korea funded by the Ministry of Education (2010-0020163).
Funding (real-time pedestrian detection and tracking): Project (50778015) supported by the National Natural Science Foundation of China; Project (2012CB725403) supported by the Major State Basic Research Development Program of China.
Funding (fast mode decision for multi-view video coding): Project (08Y29-7) supported by the Transportation Science and Research Program of Jiangsu Province, China; Project (201103051) supported by the Major Infrastructure Program of the Health Monitoring System Hardware Platform Based on Sensor Network Node, China; Project (61100111) supported by the National Natural Science Foundation of China; Project (BE2011169) supported by the Scientific and Technical Supporting Program of Jiangsu Province, China.
Funding (Dempster-Shafer-based road detection for UAV video): Project (2009AA11Z220) supported by the National High Technology Research and Development Program of China.