[Objective] Real-time monitoring of cow rumination behavior is important for promptly obtaining information about cow health and predicting disease. Various strategies have been proposed for monitoring rumination, including video surveillance, sound recognition, and sensor monitoring. However, the application of edge devices gives rise to the issue of inadequate real-time performance. To reduce the volume of data transmission and the cloud computing workload while achieving real-time monitoring of dairy cow rumination behavior, a real-time monitoring method based on edge computing was proposed. [Methods] Independently designed edge devices were used to collect and process six-axis acceleration signals from cows in real time. Based on these six-axis data, two strategies, federated edge intelligence and split edge intelligence, were investigated for real-time recognition of rumination behavior. For the federated strategy, the CA-MobileNet v3 network was proposed by enhancing the MobileNet v3 network with a collaborative attention mechanism, and a federated edge intelligence model was designed using CA-MobileNet v3 and the FedAvg federated aggregation algorithm. For the split strategy, a split edge intelligence model named MobileNet-LSTM was designed by integrating the MobileNet v3 network, fused with the collaborative attention mechanism, with a Bi-LSTM network. [Results and Discussions] In comparative experiments with MobileNet v3 and MobileNet-LSTM, the federated edge intelligence model based on CA-MobileNet v3 achieved average Precision, Recall, F1-Score, Specificity, and Accuracy of 97.1%, 97.9%, 97.5%, 98.3%, and 98.2%, respectively, the best recognition performance among the compared models. [Conclusions] The study provides a real-time and effective method for monitoring cow rumination behavior, and the proposed federated edge intelligence model can be applied in practical settings.
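The abstract names FedAvg as the aggregation step but gives no implementation details. The following is a minimal sketch of the standard FedAvg idea only: the server averages client parameters, weighting each client by its number of local training samples. The function name `fedavg`, the `Weights` type, and the flat-list parameter layout are assumptions made for illustration, not the authors' code.

```python
from typing import Dict, List, Sequence

# Hypothetical layout: layer name -> flattened list of parameters.
Weights = Dict[str, List[float]]


def fedavg(client_weights: Sequence[Weights],
           client_sizes: Sequence[int]) -> Weights:
    """Average client parameters, weighted by local sample counts."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Weights = {}
    for name in client_weights[0]:
        acc = [0.0] * len(client_weights[0][name])
        for weights, size in zip(client_weights, client_sizes):
            share = size / total  # this client's contribution to the average
            for i, value in enumerate(weights[name]):
                acc[i] += share * value
        aggregated[name] = acc
    return aggregated


if __name__ == "__main__":
    # Two hypothetical edge devices holding 1 and 3 local samples.
    device_a = {"layer": [0.0, 4.0]}
    device_b = {"layer": [4.0, 0.0]}
    print(fedavg([device_a, device_b], [1, 3]))
```

In a deployment of this kind, each edge device would train locally on its own acceleration data and upload only parameters, which is consistent with the stated goal of reducing data transmission.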
Traditional image processing systems have limited flexibility because they are designed for specific applications. This paper presents a new TMS320C64x-based multi-DSP parallel computing architecture. It offers powerful computing capability, broad I/O bandwidth, flexible topology, and expandability. The performance of the parallel system is evaluated experimentally.
A new file assignment strategy for parallel I/O, named the heuristic file sorted assignment algorithm, was proposed for cluster computing systems. With load balancing as its basis, it assigns files with similar service times to the same disk. First, the files are sorted in descending order of service time and stored in a set I. Then, when files are to be assigned, one disk of a cluster node is selected at random. Finally, consecutive files are taken in order from the set I and placed on that disk until the disk reaches its maximum load. The experimental results show that the new strategy improves performance by 20.2% when the system load is light and by 31.6% when the load is heavy. Moreover, the higher the data access rate, the more evident the performance improvement obtained by the heuristic file sorted assignment algorithm.
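The three steps described in this abstract (sort by service time, pick a disk at random, fill it with consecutive files up to its maximum load) can be sketched as follows. The abstract does not define how a file's load is measured or what happens when capacity runs out, so the `load` field, the shared `max_load` limit, and the error on overflow are assumptions, and all names are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class FileInfo:
    name: str
    service_time: float
    load: float  # assumed per-file load contribution; not defined in the abstract


def assign_files(files: Sequence[FileInfo],
                 disk_ids: Sequence[str],
                 max_load: float,
                 rng: Optional[random.Random] = None) -> Dict[str, List[str]]:
    """Place consecutive runs of the sorted file list on randomly chosen disks."""
    rng = rng or random.Random()
    # Step 1: the set I, sorted by service time in descending order.
    ordered = sorted(files, key=lambda f: f.service_time, reverse=True)
    # Step 2: disks are visited in random order (random selection without reuse).
    order = list(disk_ids)
    rng.shuffle(order)

    placement: Dict[str, List[str]] = {disk: [] for disk in disk_ids}
    index = 0
    for disk in order:
        used = 0.0
        # Step 3: take consecutive files until the next one would exceed max_load.
        while index < len(ordered) and used + ordered[index].load <= max_load:
            placement[disk].append(ordered[index].name)
            used += ordered[index].load
            index += 1
        if index == len(ordered):
            break
    if index < len(ordered):
        raise ValueError("not enough disk capacity for all files")
    return placement


if __name__ == "__main__":
    demo = [FileInfo("a", 5, 2), FileInfo("b", 1, 2),
            FileInfo("c", 3, 2), FileInfo("d", 4, 2)]
    print(assign_files(demo, ["disk0", "disk1"], max_load=4, rng=random.Random(0)))
```

Because the list is sorted first, each disk receives files with neighboring service times, which is the grouping property the abstract attributes to the strategy.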
The DFT is widely applied in signal processing and other fields. Most current fast computation methods either rely on parallel computers connected by particular interconnects such as the butterfly network or the hypercube, or assume instantaneous transmission, conflict-free communication, fully connected parallel processors, and an unlimited number of available processors. However, communication delay in the information transmission system cannot be ignored. This paper addresses the following aspects: instant transmission, task dispatching, and the path of information through the communication links in computer cluster systems; and the layout of the dynamic FFT algorithm under different computer cluster structures.
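For reference, the transform under discussion is $X_k = \sum_{t=0}^{N-1} x_t e^{-2\pi i k t / N}$. The sketch below compares the direct definition with a textbook recursive radix-2 FFT; the butterfly combination in the loop is the unit of work that a parallel layout has to distribute across processors. This is the generic serial algorithm, not the paper's dynamic FFT layout, which the abstract does not specify. The function names are chosen for illustration.

```python
import cmath
from typing import List, Sequence


def dft(x: Sequence[complex]) -> List[complex]:
    """Direct O(N^2) evaluation of the DFT definition."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def fft(x: Sequence[complex]) -> List[complex]:
    """Recursive radix-2 decimation-in-time FFT; length must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle            # butterfly, upper output
        out[k + n // 2] = even[k] - twiddle   # butterfly, lower output
    return out


if __name__ == "__main__":
    signal = [1, 2, 3, 4, 0, 0, 0, 0]
    error = max(abs(a - b) for a, b in zip(fft(signal), dft(signal)))
    print(f"max difference between fft and dft: {error:.2e}")
```

Each butterfly needs one value from each half of the data, so when the halves live on different cluster nodes every stage implies an exchange, which is where the communication delay discussed in the abstract enters.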
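To address stability problems in intelligent computing clusters with more than ten thousand cards, such as frequent hardware failures, persistently high failure rates of training tasks, and difficulty in locating cross-domain problems, a data- and knowledge-driven scheme for ensuring the stability of such clusters is proposed. First, cluster performance data are collected using integrated heterogeneous-resource collection technology and distributed real-time big-data extract-transform-load (ETL) technology. Then, fault diagnosis is performed with a deep learning model based on an improved self-attention-based bidirectional long short-term memory (SABiLSTM) network. Finally, the results output by the diagnosis model are analyzed and matched against a knowledge graph to produce a fault diagnosis report, which improves the interpretability of the model's output. Feature weight coefficients are introduced when the deep learning model extracts temporal features, and the features extracted at different scales are fused by weighting to improve fault diagnosis accuracy. In a fault diagnosis simulation experiment based on an 18,000-card intelligent computing cluster, the loss gradually converged and stabilized at 0.047, and the accuracy reached 98.4%. Practice shows that the scheme can effectively safeguard large-model training and improve the reliability of intelligent computing clusters, providing a solid foundation for building larger clusters and training large models in the future.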
Funding: This project was supported by the National Natural Science Foundation of China (60135020).