To address the contradiction between the explosive growth of wireless data and the limited spectrum resources, semantic communication has been emerging as a promising communication paradigm. In this paper, we design a speech semantic coded communication system, referred to as Deep-STS (i.e., Deep-learning based Speech To Speech), for low-bandwidth speech communication. Specifically, we first deeply compress the speech data by extracting the textual information from the speech with a conformer encoder and a connectionist temporal classification decoder at the transmitter side of the Deep-STS system. To facilitate the final speech timbre recovery, we also extract the short-term timbre feature of the speech signal, using only its first 2 s, with a long short-term memory network. Second, Reed-Solomon coding and a hybrid automatic repeat request protocol are applied to improve the reliability of transmitting the extracted text and timbre feature over the wireless channel. Third, once the extracted text and the timbre feature are received, we reconstruct the speech signal with a mel spectrogram prediction network and a vocoder at the receiver of the Deep-STS system. Finally, we develop a demo system based on USRP and GNU Radio for the performance evaluation of Deep-STS. Numerical results show that the accuracy of text extraction approaches 95%, and the mel cepstral distortion between the recovered speech signal and the original one in the spectrum domain is less than 10. Furthermore, the experimental results show that the proposed Deep-STS system can reduce the total delay of speech communication by 85% on average compared to G.723 coding at a transmission rate of 5.4 kbps. More importantly, the coding rate of the proposed Deep-STS system is extremely low, only 0.2 kbps for continuous speech communication. It is worth noting that Deep-STS, with its lower coding rate, can support low-zero-power speech communication, unveiling a new era in ultra-efficient coded communications. (Received: Jan. 17, 2024; Revised: Jun. 12, 2024; Editor: Niu Kai)
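The mel cepstral distortion (MCD) quoted in the Deep-STS abstract is commonly computed per frame as (10 / ln 10) * sqrt(2 * sum_d (c_d - c'_d)^2) over mel-cepstral coefficients and then averaged across frames. The sketch below follows that common definition only; the abstract does not state which variant, coefficient order, or frame alignment the authors used, so the function name, the toy coefficient values, and the assumption that the caller has already dropped the energy term c0 are illustrative.

```python
import math

def mel_cepstral_distortion(ref_frames, rec_frames):
    """Average MCD in dB over time-aligned frames of mel-cepstral coefficients.

    Each frame is a sequence of coefficients (the energy term c0 is assumed
    to have been dropped by the caller). Frame alignment is out of scope.
    """
    if len(ref_frames) != len(rec_frames) or not ref_frames:
        raise ValueError("need the same non-zero number of frames")
    scale = 10.0 / math.log(10.0)
    total = 0.0
    for ref, rec in zip(ref_frames, rec_frames):
        if len(ref) != len(rec):
            raise ValueError("frames must have the same number of coefficients")
        squared_error = sum((a - b) ** 2 for a, b in zip(ref, rec))
        total += scale * math.sqrt(2.0 * squared_error)
    return total / len(ref_frames)

# Toy, made-up coefficients purely to exercise the function.
reference = [[1.0, 0.5, -0.2], [0.8, 0.4, -0.1]]
recovered = [[1.1, 0.4, -0.2], [0.8, 0.4, -0.1]]
mcd_db = mel_cepstral_distortion(reference, recovered)
```

Identical frames give 0 dB, and the value grows with the Euclidean distance between the coefficient vectors, which is why a smaller MCD indicates a recovered spectrum closer to the original.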
We consider an image semantic communication system over a time-varying fading Gaussian MIMO channel with a finite number of channel states. A deep learning-aided broadcast approach is proposed to enable adaptive semantic transmission across the different channel states. We combine the classic broadcast approach with an image transformer to implement this adaptive joint source and channel coding (JSCC) scheme. Specifically, we use a neural network (NN) to jointly optimize the hierarchical image compression and the superposition code mapping within this scheme. The learned transformers and codebooks allow the image to be recovered with adaptive quality and a low error rate at the receiver in each channel state. The simulation results show that the proposed scheme can dynamically adapt the coding to the current channel state and outperform some existing intelligent schemes that use a fixed coding block.
Degraded broadcast channels (DBC) are a typical multiuser communication scenario, yet semantic communications over DBC still lack in-depth research. In this paper, we design a semantic communications approach based on multi-user semantic fusion for wireless image transmission over DBC. The transmitter extracts semantic features for two users separately and then fuses them for broadcasting by leveraging semantic similarity. Unlike the traditional allocation of time, power, or bandwidth, the semantic fusion scheme can dynamically control the weights of the two users' semantic features to balance their performance. Considering that the two users have different channel state information (CSI) over DBC, a DBC-Aware method is developed that embeds the CSI of both users into the joint source-channel coding encoder and the fusion module to adapt to the channel. Experimental results show that the proposed system outperforms traditional broadcasting schemes.
The concept of semantic communication provides a novel approach for applications in scenarios with limited communication resources. In this paper, we propose an end-to-end (E2E) semantic molecular communication system that aims to enhance the efficiency of molecular communication systems by reducing the amount of transmitted information. Specifically, following the joint source-channel coding paradigm, the network is designed to encode the task-relevant information into the concentration of the information molecules, which is robust to the degradation of the molecular communication channel. Furthermore, we propose a channel network to enable E2E learning over the non-differentiable molecular channel. Experimental results demonstrate the superior performance of the semantic molecular communication system over conventional methods in classification tasks.
As a novel paradigm, semantic communication offers an effective way to break through the development bottleneck facing classical communication systems. However, how to measure the information transmission capability of a given semantic communication method, and then compare it with that of a classical communication method, remains an unsolved problem. In this paper, we first review the semantic communication system, including its system model and the two typical coding and transmission methods used to implement it. To address the open issue of measuring the information transmission capability of semantic communication methods, we propose a new universal performance measure called Information Conductivity. We give its definition and physical significance to show that it effectively represents the information transmission capability of semantic communication systems, and we elaborate on its measurement methods, degrees of freedom, and progressive analysis. Experimental results in image transmission scenarios validate its practical applicability.
In the future development of sixth generation (6G) mobile communication, several communication models have been proposed to face the growing challenges of the task. The rapid development of artificial intelligence (AI) foundation models provides significant support for efficient and intelligent communication interactions. In this paper, we propose an innovative semantic communication paradigm called the task-oriented semantic communication system with foundation models. First, we segment the image using task prompts, based on the segment anything model (SAM) and contrastive language-image pretraining (CLIP). Meanwhile, we adopt a Bezier curve to enhance the mask and improve the segmentation accuracy. Second, we apply differentiated semantic compression and transmission approaches to the segmented content. Third, we fuse the different semantic information with a conditional diffusion model to generate high-quality images that satisfy the users' specific task requirements. Finally, the experimental results show that the proposed system compresses the semantic information effectively and improves the robustness of semantic communication.
To facilitate emerging applications and demands of edge intelligence (EI)-empowered 6G networks, model-driven semantic communications have been proposed to reduce transmission volume by deploying artificial intelligence (AI) models that provide semantic extraction and recovery. Nevertheless, it is not feasible to preload all AI models on resource-constrained terminals, so in-time model transmission becomes a crucial problem. This paper proposes an intellicise model transmission architecture to guarantee the reliable transmission of models for semantic communication. The mathematical relationship between model size and performance is formulated with a recognition error function supported by experimental data. We consider the characteristics of wireless channels and derive a closed-form expression for the model transmission outage probability (MTOP) over the Rayleigh channel. In addition, we define the effective model accuracy (EMA) to evaluate model transmission performance in terms of both communication and intelligence. We then propose a joint model selection and resource allocation (JMSRA) algorithm to maximize the average EMA of all users. Simulation results demonstrate that the average EMA of the JMSRA algorithm outperforms baseline algorithms by about 22%.
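For intuition about outage over a Rayleigh channel, the textbook single-link result is P_out = 1 - exp(-(2^R - 1) / avg_SNR), where R is the target spectral efficiency in bit/s/Hz and avg_SNR is the average signal-to-noise ratio on a linear scale. The sketch below evaluates that expression for two hypothetical model sizes. It is a generic illustration and not the paper's closed-form MTOP, which the abstract does not give; the function names, the deadline-based rate requirement, and all numbers are assumptions.

```python
import math

def rayleigh_outage_probability(rate_bps_per_hz, avg_snr_linear):
    """P(log2(1 + SNR) < rate) when SNR is exponentially distributed (Rayleigh fading)."""
    if avg_snr_linear <= 0:
        raise ValueError("average SNR must be positive")
    threshold = 2.0 ** rate_bps_per_hz - 1.0
    return 1.0 - math.exp(-threshold / avg_snr_linear)

def required_rate(model_bits, bandwidth_hz, deadline_s):
    """Spectral efficiency needed to deliver model_bits within deadline_s."""
    return model_bits / (bandwidth_hz * deadline_s)

# Hypothetical numbers: two model sizes, 1 MHz bandwidth, 1 s deadline, 10 dB average SNR.
avg_snr = 10 ** (10 / 10)
small = rayleigh_outage_probability(required_rate(1e6, 1e6, 1.0), avg_snr)
large = rayleigh_outage_probability(required_rate(4e6, 1e6, 1.0), avg_snr)
```

A larger model needs a higher rate within the same time budget, so its outage probability rises. That tension between model accuracy and delivery reliability is the kind of trade-off a joint model selection and resource allocation scheme has to balance.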
Recently, deep learning-based semantic communication has garnered widespread attention, with numerous systems designed for transmitting diverse data sources, including text, image, and speech. In the effort to improve system performance, many studies have concentrated on enhancing the structure of the encoder and decoder. However, this often overlooks the resulting increase in model complexity, which imposes additional storage and computational burdens on smart devices. Furthermore, existing work tends to prioritize explicit semantics and neglects the potential of implicit semantics. This paper aims to enhance the receiver's decoding capability simply and effectively, without modifying the encoder and decoder structures. We propose a novel semantic communication system with variational neural inference for text transmission. Specifically, we introduce a simple but effective variational neural inferer at the receiver to infer the latent semantic information within the received text. This information is then used to assist the decoding process. The simulation results show a significant enhancement in system performance and improved robustness.
Semantic communication (SemCom) aims to achieve high-fidelity information delivery at low communication cost by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can fully exploit the information semantics. However, given the possible semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to rethink the CRC mechanism so as to reap the benefits of both SemCom and HARQ more effectively. In this paper, building on swin transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which extracts the inner topological and geometric information of images to capture semantic information and determine whether re-transmission is necessary. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and demonstrate the superiority of the TDA-based error detection method in image transmission.
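To see why a bit-level CRC can be a poor fit for semantic image transmission, the sketch below applies Python's standard `zlib.crc32` to a toy byte string standing in for image data. A single flipped bit changes the checksum and would trigger a retransmission even though the content is almost unchanged. This illustrates only the conventional CRC check, not the TDA-based detector, whose details the abstract does not give; the payload and the helper names are made up for illustration.

```python
import zlib

def attach_crc(payload: bytes) -> bytes:
    """Append a 4-byte big-endian CRC-32 to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC-32 over the payload and compare it with the trailer."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer

def flip_bit(frame: bytes, bit_index: int) -> bytes:
    """Return a copy of frame with one bit inverted (simulated channel error)."""
    data = bytearray(frame)
    data[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(data)

pixels = bytes(range(64))            # toy stand-in for image data
frame = attach_crc(pixels)
corrupted = flip_bit(frame, 5)       # one bit of the first "pixel" is inverted

clean_passes = crc_ok(frame)         # the untouched frame passes the check
corrupt_passes = crc_ok(corrupted)   # the single-bit error fails the check
```

The check is all-or-nothing at the bit level: it cannot tell a harmless one-pixel change from a corruption that alters what the image depicts, which is the gap a semantics-aware error detector is meant to close.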
Video transmission requires considerable bandwidth, and current widely employed schemes prove inadequate when confronted with scenes featuring prominently. Motivated by the strides in talking-head generative technology, the paper introduces a semantic transmission system tailored for talking-head videos. The system captures semantic information from the talking-head video and faithfully reconstructs the source video at the receiver; only a one-shot reference frame and compact semantic features are required for the entire transmission. Specifically, we analyze video semantics in the pixel domain frame by frame and jointly process multi-frame semantic information to seamlessly incorporate spatial and temporal information. Variational modeling is used to evaluate how importance varies among group semantics, thereby guiding the allocation of bandwidth to semantics and enhancing system efficiency. The whole end-to-end system is modeled as an optimization problem that is equivalent to achieving optimal rate-distortion performance. We evaluate our system on both reference frame and video transmission; experimental results demonstrate that it can improve the efficiency and robustness of communications. Compared with classical approaches, our system can save over 90% of bandwidth at a comparable level of user perception.
As conventional communication systems based on classic information theory have closely approached Shannon capacity, semantic communication is emerging as a key enabling technology for further improving communication performance. However, how to represent semantic information and characterise the theoretical limits of semantic-oriented compression and transmission is still unsettled. In this paper, we consider a semantic source characterised by a set of correlated random variables whose joint probability distribution can be described by a Bayesian network. We give the information-theoretic limit on the lossless compression of the semantic source and introduce a low-complexity encoding method that exploits conditional independence. We further characterise the limits on lossy compression of the semantic source, together with upper and lower bounds on the rate-distortion function. We also investigate the lossy compression of the semantic source with two-sided information at the encoder and decoder, and obtain the corresponding rate-distortion function. We prove that the optimal code of the semantic source is the combination of the optimal codes of each conditionally independent set given the side information.
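For a source whose joint distribution factorizes over a Bayesian network, the joint entropy decomposes as H(X_1, ..., X_n) = sum_i H(X_i | Pa(X_i)), where Pa(X_i) denotes the parents of X_i, and joint entropy is the classical limit for lossless compression of independent samples. The sketch below checks this decomposition numerically on a hypothetical three-node chain A -> B -> C with made-up probabilities. It is meant as intuition for why conditional independence helps, not as the paper's theorem or its encoding method.

```python
import math
from itertools import product

# Hypothetical binary chain A -> B -> C with made-up probabilities.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}

def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def joint():
    """Joint distribution p(a, b, c) = p(a) p(b|a) p(c|b)."""
    return {(a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
            for a, b, c in product((0, 1), repeat=3)}

def conditional_entropy(parent_dist, cond_table):
    """H(child | parent): average of H(child | parent = v) weighted by p(parent = v)."""
    return sum(parent_dist[v] * entropy(cond_table[v]) for v in parent_dist)

# Marginal of B, needed as the parent distribution for C.
p_b = {b: sum(p_a[a] * p_b_given_a[a][b] for a in p_a) for b in (0, 1)}

h_joint = entropy(joint())
h_factored = (entropy(p_a)
              + conditional_entropy(p_a, p_b_given_a)
              + conditional_entropy(p_b, p_c_given_b))
```

The two quantities agree up to floating-point error, so each variable can in principle be coded given only its parents rather than the full history, which is what keeps the per-variable coding tables small.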
Multimedia semantic communication has been receiving increasing attention because it significantly enhances communication efficiency. Semantic coding, which extracts and encodes the key semantics of video for transmission, is a key part of the multimedia semantic communication framework. In this paper, we propose a low-bitrate facial video semantic coding method based on the temporal continuity of video semantics. At the sender’s end, we selectively transmit facial keypoints and deformation information, allocating distinct bitrates to different keypoints across frames. Compression techniques involving sampling and quantization are employed to reduce the bitrate while retaining the key facial semantic information. At the receiver’s end, a GAN-based generative network is used for reconstruction, effectively mitigating the block artifacts and buffering problems of traditional codec algorithms at low bitrates. The proposed approach is validated on multiple datasets, such as VoxCeleb and TalkingHead-1kH, using metrics such as LPIPS, DISTS, and AKD. Experimental results demonstrate significant advantages over traditional codec methods, achieving up to approximately 10-fold bitrate reduction in prolonged, stable head pose scenarios across diverse conversational video settings.
In recent years, deep learning-based semantic communications have shown great potential to enhance the performance of communication systems. This has led to the belief that semantic communications represent a breakthrough beyond the Shannon paradigm and will play an essential role in future communications. To narrow the gap between current research and this future vision, after an overview of semantic communications, this article presents and discusses ten fundamental and critical challenges in today’s semantic communication field. These challenges are divided into theory foundation, system design, and practical implementation. Challenges related to the theory foundation, including semantic capacity, entropy, and rate-distortion, are discussed first. Then, the system design challenges, encompassing architecture, knowledge base, joint semantic-channel coding, tailored transmission scheme, and impairment, are posed. The last two challenges, associated with practical implementation, lie in cross-layer optimization for networks and standardization. For each challenge, efforts to date and thoughtful insights are provided.
Edge intelligence is anticipated to underlie the pathway to connected intelligence for 6G networks, but the organic confluence of edge computing and artificial intelligence still needs to be treated carefully. To this end, this article discusses the concepts of edge intelligence from the semantic cognitive perspective. Two instructive theoretical models for edge semantic cognitive intelligence (ESCI) are first established. Afterwards, the ESCI framework orchestrating deep learning with semantic communication is discussed. Two representative applications are presented to shed light on the prospects of ESCI in 6G networks. Finally, some open problems are listed to suggest future research directions for ESCI.
Semantic communication, as a critical component of artificial intelligence (AI), has gained increasing attention in recent years due to its significant impact on various fields. In this paper, we focus on the applications of semantic feature extraction, a key step in semantic communication, in several areas of artificial intelligence, including natural language processing, medical imaging, remote sensing, autonomous driving, and other image-related applications. Specifically, we discuss how semantic feature extraction can enhance the accuracy and efficiency of natural language processing tasks, such as text classification, sentiment analysis, and topic modeling. In the medical imaging field, we explore how semantic feature extraction can be used for disease diagnosis, drug development, and treatment planning. In addition, we investigate the applications of semantic feature extraction in remote sensing and autonomous driving, where it can facilitate object detection, scene understanding, and other tasks. By surveying the applications of semantic feature extraction in these fields, this paper aims to provide insights into the potential of this technology to advance the development of artificial intelligence.
The emerging new services in the sixth generation (6G) communication system impose increasingly stringent requirements and challenges on video transmission. Semantic communications are envisioned as a promising solution to these challenges. This paper provides a highly efficient solution to video transmission by proposing a scalable semantic transmission algorithm, named scalable semantic transmission framework for video (SST-V), which jointly considers semantic importance and channel conditions. Specifically, a semantic importance evaluation module is designed to extract more informative semantic features according to the estimated importance level, facilitating high-efficiency semantic coding. By further considering the channel condition, a cascaded learning-based scalable joint semantic-channel coding algorithm is proposed, which autonomously adapts the semantic coding and channel coding strategies to the specific signal-to-noise ratio (SNR). Simulation results show that SST-V achieves better video reconstruction performance while significantly reducing the transmission overhead.
Funding (Deep-STS speech semantic communication): supported in part by the National Natural Science Foundation of China under Grants 62122069, 62071431, and 62201507.
Funding (broadcast approach for MIMO image semantic communication): supported in part by the National Key R&D Project of China under Grant 2020YFA0712300; the National Natural Science Foundation of China under Grants NSFC-62231022 and 12031011; and in part by the NSF of China under Grant 62125108.
Funding (multi-user semantic fusion over DBC): supported in part by the National Key R&D Project of China (2023YFB2906201), the National Natural Science Foundation of China (62222111, 62125108, and 62431015), and the Fundamental Research Funds for the Central Universities.
Funding (semantic molecular communication): supported by the Beijing Natural Science Foundation (L211012), the Natural Science Foundation of China (62122012, 62221001), and the Fundamental Research Funds for the Central Universities (2022JBQY004).
Funding (Information Conductivity): supported by the National Natural Science Foundation of China (No. 62293481, No. 62071058).
Funding (task-oriented semantic communication with foundation models): supported in part by the National Natural Science Foundation of China under Grants 62001246, 62231017, 62201277, and 62071255; the Natural Science Foundation of Jiangsu Province under Grant BK20220390; the Key R and D Program of Jiangsu Province Key project and topics under Grants BE2021095 and BE2023035; the Natural Science Research Startup Foundation of Recruiting Talents of Nanjing University of Posts and Telecommunications (Grant No. NY221011); the National Science Foundation of Xiamen, China (No. 3502Z202372013); and the Open Project of the Key Laboratory of Underwater Acoustic Communication and Marine Information Technology (Xiamen University) of the Ministry of Education, China (No. UAC202304).
Funding (intellicise model transmission): supported in part by the National Key R&D Program of China No. 2020YFB1806905; the National Natural Science Foundation of China No. 62201079; the Beijing Natural Science Foundation No. L232051; and the Major Key Project of Peng Cheng Laboratory (PCL) Department of Broadband Communication.
Funding: Supported in part by the National Science Foundation of China (NSFC) with grant no. 62271514; in part by the Science, Technology and Innovation Commission of Shenzhen Municipality with grant nos. JCYJ20210324120002007 and ZDSYS20210623091807023; and in part by the State Key Laboratory of Public Big Data with grant no. PBD2023-01.
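As a rough illustration of the outage analysis mentioned in the model-transmission abstract above, the sketch below evaluates the textbook outage probability of a Rayleigh fading link and checks it against a Monte Carlo estimate. It is not the paper's MTOP expression, which the abstract does not give; the function names, the mapping of "model size over deadline" to a target rate, and all numeric values are assumptions made for illustration.

```python
import math
import random


def rayleigh_outage_probability(rate_bps, bandwidth_hz, mean_snr):
    """Textbook outage probability P[B * log2(1 + snr) < rate].

    Under Rayleigh fading the instantaneous SNR is exponentially
    distributed with mean `mean_snr` (linear scale, not dB).
    """
    threshold = 2.0 ** (rate_bps / bandwidth_hz) - 1.0
    return 1.0 - math.exp(-threshold / mean_snr)


def simulate_outage(rate_bps, bandwidth_hz, mean_snr, trials=200_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        # expovariate takes the rate parameter, i.e. 1 / mean.
        snr = rng.expovariate(1.0 / mean_snr)
        if bandwidth_hz * math.log2(1.0 + snr) < rate_bps:
            outages += 1
    return outages / trials


# Hypothetical scenario: a 2 Mbit model must arrive within 1 s over 1 MHz,
# so the required rate is 2 Mbit/s; the average SNR is 10 (linear).
model_bits, deadline_s, bandwidth_hz, mean_snr = 2e6, 1.0, 1e6, 10.0
required_rate = model_bits / deadline_s

closed_form = rayleigh_outage_probability(required_rate, bandwidth_hz, mean_snr)
simulated = simulate_outage(required_rate, bandwidth_hz, mean_snr)
print(f"closed form: {closed_form:.4f}, simulated: {simulated:.4f}")
```

A larger model raises the required rate and therefore the outage probability, which is the kind of size-versus-reliability tension the abstract describes.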
Abstract: Recently, deep learning-based semantic communication has garnered widespread attention, with numerous systems designed for transmitting diverse data sources, including text, image, and speech. While efforts have been directed toward improving system performance, many studies have concentrated on enhancing the structure of the encoder and decoder. However, this often overlooks the resulting increase in model complexity, which imposes additional storage and computational burdens on smart devices. Furthermore, existing work tends to prioritize explicit semantics, neglecting the potential of implicit semantics. This paper aims to enhance the receiver's decoding capability easily and effectively without modifying the encoder and decoder structures. We propose a novel semantic communication system with variational neural inference for text transmission. Specifically, we introduce a simple but effective variational neural inferer at the receiver to infer the latent semantic information within the received text. This information is then used to assist the decoding process. The simulation results show a significant enhancement in system performance and improved robustness.
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2024YFE0200600; in part by the National Natural Science Foundation of China under Grant 62071425; in part by the Zhejiang Key Research and Development Plan under Grant 2022C01093; in part by the Zhejiang Provincial Natural Science Foundation of China under Grant LR23F010005; in part by the National Key Laboratory of Wireless Communications Foundation under Grant 2023KP01601; and in part by the Big Data and Intelligent Computing Key Lab of CQUPT under Grant BDIC-2023-B-001.
Abstract: Semantic communication (SemCom) aims to achieve high-fidelity information delivery under low communication consumption by guaranteeing only semantic accuracy. Nevertheless, semantic communication still suffers from unexpected channel volatility, so developing a re-transmission mechanism (e.g., hybrid automatic repeat request [HARQ]) becomes indispensable. In that regard, instead of discarding previously transmitted information, incremental knowledge-based HARQ (IK-HARQ) is deemed a more effective mechanism that can sufficiently utilize the information semantics. However, considering the possible existence of semantic ambiguity in image transmission, a simple bit-level cyclic redundancy check (CRC) might compromise the performance of IK-HARQ. There is therefore a strong incentive to revolutionize the CRC mechanism and thus reap the benefits of both SemCom and HARQ more effectively. In this paper, building on swin transformer-based joint source-channel coding (JSCC) and IK-HARQ, we propose a semantic image transmission framework, SC-TDA-HARQ. In particular, unlike the conventional CRC, we introduce a topological data analysis (TDA)-based error detection method, which digs out the inner topological and geometric information of images to capture semantic information and determine the necessity for re-transmission. Extensive numerical results validate the effectiveness and efficiency of the proposed SC-TDA-HARQ framework, especially under limited bandwidth, and demonstrate the superiority of the TDA-based error detection method in image transmission.
Funding: Supported by the National Natural Science Foundation of China (No. 61971062) and the BUPT Excellent Ph.D. Students Foundation (CX2022153).
Abstract: Video transmission requires considerable bandwidth, and currently widely employed schemes prove inadequate when confronted with scenes that prominently feature talking heads. Motivated by the strides in talking-head generative technology, the paper introduces a semantic transmission system tailored for talking-head videos. The system captures semantic information from the talking-head video and faithfully reconstructs the source video at the receiver; only a one-shot reference frame and compact semantic features are required for the entire transmission. Specifically, we analyze video semantics in the pixel domain frame by frame and jointly process multi-frame semantic information to seamlessly incorporate spatial and temporal information. Variational modeling is used to evaluate the diversity of importance among group semantics, thereby guiding the allocation of bandwidth resources to semantics and enhancing system efficiency. The whole end-to-end system is modeled as an optimization problem equivalent to acquiring optimal rate-distortion performance. We evaluate our system on both reference frame and video transmission. Experimental results demonstrate that our system can improve the efficiency and robustness of communications. Compared to the classical approaches, our system can save over 90% of bandwidth at comparable perceptual quality.
Funding: Partly supported by NSFC under grant Nos. 62293481 and 62201505, and partly by the SUTD-ZJU IDEA Grant (SUTD-ZJU (VP) 202102).
Abstract: As conventional communication systems based on classic information theory have closely approached Shannon capacity, semantic communication is emerging as a key enabling technology for the further improvement of communication performance. However, it remains unsettled how to represent semantic information and characterise the theoretical limits of semantic-oriented compression and transmission. In this paper, we consider a semantic source characterised by a set of correlated random variables whose joint probability distribution can be described by a Bayesian network. We give the information-theoretic limit on the lossless compression of the semantic source and introduce a low-complexity encoding method that exploits the conditional independence. We further characterise the limits on lossy compression of the semantic source and the upper and lower bounds of the rate-distortion function. We also investigate the lossy compression of the semantic source with two-sided information at the encoder and decoder, and obtain the corresponding rate-distortion function. We prove that the optimal code of the semantic source is the combination of the optimal codes of each conditionally independent set given the side information.
Funding: Supported by the National Natural Science Foundation of China (Nos. NSFC 61925105, 62322109, 62171257 and U22B2001); the Xplorer Prize in Information and Electronics Technologies; and the Tsinghua University (Department of Electronic Engineering)-Nantong Research Institute for Advanced Communication Technologies Joint Research Center for Space, Air, Ground and Sea Cooperative Communication Network Technology.
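To make the role of conditional independence in the Bayesian-network abstract above concrete, the following sketch checks a standard identity on a toy network: the joint entropy of the variables equals the sum of each variable's conditional entropy given its parents, so each node can in principle be coded separately given its parents. The three-node network A → B, A → C and all probability values are hypothetical; the paper's own source model and encoding method are not specified in the abstract.

```python
import math
from itertools import product

# Hypothetical binary Bayesian network: A is the parent of both B and C.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
p_c_given_a = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.1, 1: 0.9}}


def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: prob}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def conditional_entropy(parent_dist, cpt):
    """H(child | parent) = sum over parent values of p(parent) * H(child | parent value)."""
    return sum(parent_dist[a] * entropy(cpt[a]) for a in parent_dist)


# Brute-force joint distribution p(a, b, c) = p(a) p(b|a) p(c|a).
joint = {
    (a, b, c): p_a[a] * p_b_given_a[a][b] * p_c_given_a[a][c]
    for a, b, c in product((0, 1), repeat=3)
}

joint_entropy = entropy(joint)
factorised_entropy = (
    entropy(p_a)
    + conditional_entropy(p_a, p_b_given_a)
    + conditional_entropy(p_a, p_c_given_a)
)
print(f"H(A,B,C) = {joint_entropy:.6f} bits")
print(f"H(A) + H(B|A) + H(C|A) = {factorised_entropy:.6f} bits")
```

The two printed values agree, which is why a per-node code conditioned on the parents can reach the joint-entropy limit without enumerating the full joint alphabet.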
Abstract: Multimedia semantic communication has been receiving increasing attention due to its significant enhancement of communication efficiency. Semantic coding, which is oriented towards extracting and encoding the key semantics of video for transmission, is a key aspect of the multimedia semantic communication framework. In this paper, we propose a low-bitrate facial video semantic coding method based on the temporal continuity of video semantics. At the sender's end, we selectively transmit facial keypoints and deformation information, allocating distinct bitrates to different keypoints across frames. Compressive techniques involving sampling and quantization are employed to reduce the bitrate while retaining key facial semantic information. At the receiver's end, a GAN-based generative network is used for reconstruction, effectively mitigating the block artifacts and buffering problems present in traditional codec algorithms at low bitrates. The performance of the proposed approach is validated on multiple datasets, such as VoxCeleb and TalkingHead-1kH, using metrics such as LPIPS, DISTS, and AKD. Experimental results demonstrate significant advantages over traditional codec methods, achieving up to approximately 10-fold bitrate reduction in prolonged, stable head pose scenarios across diverse conversational video settings.
Funding: Supported in part by the National Key Research and Development Program of China under Grant 2021YFA1000500(4), and in part by the Natural Science Foundation of China (NSFC) under Grant 62293484, Grant U22B2001, Grant 62425110, Grant 62227801, and Grant 62442106.
Abstract: In recent years, deep learning-based semantic communications have shown great potential to enhance the performance of communication systems. This has led to the belief that semantic communications represent a breakthrough beyond the Shannon paradigm and will play an essential role in future communications. To narrow the gap between current research and the future vision, after an overview of semantic communications, this article presents and discusses ten fundamental and critical challenges in today's semantic communication field. These challenges are divided into theory foundation, system design, and practical implementation. Challenges related to the theory foundation, including semantic capacity, entropy, and rate-distortion, are discussed first. Then, the system design challenges, encompassing architecture, knowledge base, joint semantic-channel coding, tailored transmission scheme, and impairment, are posed. The last two challenges, associated with practical implementation, lie in cross-layer optimization for networks and standardization. For each challenge, efforts to date and thoughtful insights are provided.
Funding: Supported in part by the National Science Foundation of China under Grant 62101253; the Natural Science Foundation of Jiangsu Province under Grant BK20210283; the Jiangsu Provincial Innovation and Entrepreneurship Doctor Program under Grant JSSCBS20210158; the Open Research Foundation of National Mobile Communications Research Laboratory under Grant 2022D08; and the Research Foundation of Nanjing for Returned Chinese Scholars.
Abstract: Edge intelligence is anticipated to underlie the pathway to connected intelligence for 6G networks, but the organic confluence of edge computing and artificial intelligence still needs to be treated carefully. To this end, this article discusses the concepts of edge intelligence from the semantic cognitive perspective. Two instructive theoretical models for edge semantic cognitive intelligence (ESCI) are first established. Afterwards, the ESCI framework, which orchestrates deep learning with semantic communication, is discussed. Two representative applications are presented to shed light on the prospects of ESCI in 6G networks. Finally, some open problems are listed to elicit future research directions for ESCI.
Abstract: Semantic communication, as a critical component of artificial intelligence (AI), has gained increasing attention in recent years due to its significant impact on various fields. In this paper, we focus on the applications of semantic feature extraction, a key step in semantic communication, in several areas of artificial intelligence, including natural language processing, medical imaging, remote sensing, autonomous driving, and other image-related applications. Specifically, we discuss how semantic feature extraction can enhance the accuracy and efficiency of natural language processing tasks, such as text classification, sentiment analysis, and topic modeling. In the medical imaging field, we explore how semantic feature extraction can be used for disease diagnosis, drug development, and treatment planning. In addition, we investigate the applications of semantic feature extraction in remote sensing and autonomous driving, where it can facilitate object detection, scene understanding, and other tasks. By providing an overview of the applications of semantic feature extraction in various fields, this paper aims to provide insights into the potential of this technology to advance the development of artificial intelligence.
Funding: Supported in part by the National Natural Science Foundation of China under Grant No. 62293485 and the Fundamental Research Funds for the Central Universities under Grant No. 2022RC18.
Abstract: The emerging new services in the sixth generation (6G) communication system impose increasingly stringent requirements and challenges on video transmission. Semantic communications are envisioned as a promising solution to these challenges. This paper provides a highly efficient solution to video transmission by proposing a scalable semantic transmission algorithm, named scalable semantic transmission framework for video (SST-V), which jointly considers the semantic importance and channel conditions. Specifically, a semantic importance evaluation module is designed to extract more informative semantic features according to the estimated importance level, facilitating high-efficiency semantic coding. By further considering the channel condition, a cascaded-learning-based scalable joint semantic-channel coding algorithm is proposed, which autonomously adapts the semantic coding and channel coding strategies to the specific signal-to-noise ratio (SNR). Simulation results show that SST-V achieves better video reconstruction performance while significantly reducing the transmission overhead.