Abstract
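This article presents the authors' main viewpoints on the construction and development of smart cities: (1) How to aggregate all kinds of urban big data in real time, especially image and video data from video surveillance networks, and how to analyze and mine the value of these data by building a cloud-computing-based "city brain" that serves urban operation and management, is a key problem that urgently needs to be solved in smart city development. (2) At the current stage, smart cities "have eyes and a brain", but the cameras that serve as the "eyes" are too limited in function, resulting in a "strong brain with weak eyes". The root cause is that the technical framework adopted by traditional surveillance camera networks was designed for storage rather than analysis. Although some recent smart cameras offer license plate or face recognition, such solutions, which emphasize "edge computing" alone, still cannot solve the problem of "unifying eye and brain". (3) To resolve the difficulties that currently hinder the rapid evolution of smart city system functions, we should draw on a property of the human visual system, which has evolved over hundreds of thousands of years, namely that the human retina performs both image encoding and feature encoding. On this basis we should research and design a digital retina that has a unified timestamp and precise geolocation, jointly optimizes efficient video coding and compact feature representation, and effectively supports large-scale surveillance video analysis and fast visual search in the cloud. (4) To use digital retinas to build the "discerning eyes" of the smart city, we should actively plan and advance the related standardization, chip and hardware implementation, supporting software development, and open-source software and hardware communities, and carry out large-scale testing and application.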
The primary viewpoints presented in this article are as follows: (1) How to gather and aggregate all kinds of urban big data in real time, especially image and video data from video surveillance networks, and subsequently analyze and mine the value of these data in the city brain to effectively support urban operation and management, is a key problem in the development of smart cities. (2) Recently, some city brains have been established to mine the large visual data source and obtain valuable insights about activities in the city (e.g., the urban traffic status). However, it is recognized that compression will inevitably affect visual feature extraction and consequently degrade subsequent analysis and retrieval performance. More importantly, it is impractical to aggregate all video streams from hundreds of thousands of cameras distributed across the city into a city brain for big data analysis and retrieval. These issues and challenges are rooted in the camera framework currently in use. (3) To address these challenges, a new camera framework should be developed based on the fact that the retina can encode both pixels and features. Such a retina-like camera, referred to simply as a digital retina, is typically equipped with a globally unified timer and an accurate positioner, and can output two streams simultaneously: a compressed video stream for online/offline viewing and data storage, and a compact feature stream extracted from the original image/video signals for visual analysis and search. By feeding only the feature streams into the city brain in real time, these cameras form a compound-eye camera system for the smart city. (4) To promote the wide application of digital retinas in the smart city, related work should be undertaken in the near future, including standardization, hardware implementation, open-source software development, and the deployment of large-scale testbeds.
Authors
高文
田永鸿
王坚
Wen GAO; Yonghong TIAN; Jian WANG (School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; Alibaba Group, Hangzhou 311121, China)
Source
《中国科学:信息科学》
CSCD
Peking University Core Journals (北大核心)
2018, Issue 8, pp. 1076-1082 (7 pages)
Scientia Sinica (Informationis)
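Funding
Supported by the National Key R&D Program of China, "Cloud Computing and Big Data" key special project (Grant No. 2017YFB1002400)
National Basic Research Program of China (973 Program) (Grant No. 2015CB351800)
Big Data Science Center project of the National Natural Science Foundation of China (Grant No. U1611461)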
About the authors
Wen GAO was born in 1956. He received his Ph.D. degree in electronics engineering from the University of Tokyo in 1991. Currently, he is a Boya chair professor at Peking University, and has served as the president of CCF since February 2016. Professor Gao works in the areas of multimedia and computer vision, including video coding, video analysis, multimedia retrieval, face recognition, multimodal interfaces, and virtual reality. His most cited contributions are model-based video coding and face recognition.
Corresponding author: Yonghong TIAN (E-mail: yhtian@pku.edu.cn) was born in 1975. He received his Ph.D. degree in computer application technology from the Institute of Computing Technology, Chinese Academy of Sciences, in 2005. Currently, he is a full professor with the School of Electronics Engineering and Computer Science, Peking University. His research interests include machine learning, computer vision, and multimedia big data.