A notable portion of cachelines in real-world workloads exhibits non-uniform access behavior within the line. However, modern cache management rarely considers this fine-grained feature, which reduces the effective cache capacity of contemporary high-performance spacecraft processors. To harness these non-uniform access behaviors, an efficient cache replacement framework was proposed, featuring an auxiliary cache specifically designed to retain evicted hot data. The framework reconstructs the cache replacement policy to support data migration between the main cache and the auxiliary cache. Unlike traditional cacheline-granularity policies, the approach identifies and evicts infrequently used data at a finer granularity, thereby improving cache utilization. The evaluation shows performance improvement, especially on workloads with irregular access patterns. Benefiting from its fine granularity, the proposal achieves better storage efficiency than commonly used cache management schemes, offering a potential optimization opportunity for modern resource-constrained processors such as spacecraft processors. Furthermore, the framework complements existing cache replacement policies and can be integrated with them with minimal modifications, enhancing their overall efficacy.
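The migration idea in this abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the framework from the paper: the class name `AuxCacheModel`, the per-word hit counters, the `hot_threshold` parameter, and the use of LRU in both caches are all hypothetical choices made for illustration.

```python
from collections import OrderedDict


class AuxCacheModel:
    """Toy model: the main cache holds whole lines (LRU); the auxiliary
    cache keeps only the hot words of lines evicted from the main cache.
    All names and policies here are illustrative assumptions."""

    def __init__(self, main_lines, aux_words, words_per_line=8, hot_threshold=2):
        self.main_lines = main_lines
        self.aux_words = aux_words
        self.words_per_line = words_per_line
        self.hot_threshold = hot_threshold
        self.main = OrderedDict()  # line number -> list of per-word access counts
        self.aux = OrderedDict()   # (line, word) -> True, kept in LRU order

    def _aux_insert(self, key):
        if self.aux_words == 0:
            return
        self.aux[key] = True
        self.aux.move_to_end(key)
        while len(self.aux) > self.aux_words:
            self.aux.popitem(last=False)  # drop the least recently used word

    def access(self, addr):
        line, word = divmod(addr, self.words_per_line)
        if line in self.main:
            self.main.move_to_end(line)
            self.main[line][word] += 1
            return "main_hit"
        if (line, word) in self.aux:
            self.aux.move_to_end((line, word))
            return "aux_hit"
        # Miss: make room in the main cache, migrating hot words of the victim.
        if len(self.main) >= self.main_lines:
            victim, counts = self.main.popitem(last=False)
            for w, count in enumerate(counts):
                if count >= self.hot_threshold:
                    self._aux_insert((victim, w))
        counts = [0] * self.words_per_line
        counts[word] = 1
        self.main[line] = counts
        # The full line is back in the main cache, so its auxiliary copies are stale.
        for key in [k for k in self.aux if k[0] == line]:
            del self.aux[key]
        return "miss"
```

In this model a word that was touched repeatedly before its line was evicted can still hit in the auxiliary cache, while the rarely touched words of the same line no longer occupy storage.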
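P2P streaming-media caching is an effective technique for reducing bandwidth overhead and improving object utilization, and it usually replaces content with algorithms such as FIFO and LRU. However, streaming media differs from Web objects, and P2P networks differ from the client/server model, so these algorithms may degrade system performance in distributed applications. To address this, the FIFO and LRU replacement algorithms are analyzed, and two algorithms are proposed, evaluated, and compared: SD, based on the supply-demand relationship, and REP, based on the number of segment replicas. Across different peer arrival intervals, SD and REP are compared with FIFO and LRU and are found to outperform them in almost all cases in terms of startup delay, number of media replicas, and dependence on the root node. Compared with the LSB (least sent bytes) algorithm, SD reduces startup delay by about 40% in some scenarios, while REP yields far more replicas than LSB. This indicates that using the SD and REP cache replacement algorithms in P2P streaming services helps improve system performance.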
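The abstract does not give the exact SD and REP rules, so the following is a sketch of one plausible reading only. It assumes that REP evicts the cached segment with the most replicas elsewhere in the network and that SD evicts the segment whose supply most exceeds its demand. The function names, the dictionaries, and the ratio formula are hypothetical.

```python
def pick_victim_rep(cached, replica_count):
    """Assumed REP-style rule: evict the segment with the most replicas."""
    return max(cached, key=lambda seg: replica_count.get(seg, 0))


def pick_victim_sd(cached, supply, demand):
    """Assumed SD-style rule: evict the segment with the highest
    supply-to-demand ratio (the +1 avoids division by zero)."""
    return max(cached, key=lambda seg: supply.get(seg, 0) / (demand.get(seg, 0) + 1))


cached = ["s1", "s2", "s3"]
replicas = {"s1": 5, "s2": 1, "s3": 3}
requests = {"s1": 9, "s2": 0, "s3": 0}
rep_victim = pick_victim_rep(cached, replicas)
sd_victim = pick_victim_sd(cached, replicas, requests)
```

Under these assumptions REP discards the most replicated segment regardless of demand, whereas SD keeps a widely replicated segment if it is also heavily requested.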
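The LRU replacement algorithm is widely used in single-core processors, whereas most multi-core systems share the last-level cache (LLC) among cores. As LLC capacity and associativity increase and the working sets of multi-core applications grow, the gap between LRU and the theoretically optimal replacement algorithm keeps widening. This paper proposes ALRU-F, a frequency-based replacement algorithm for a shared multi-core cache under equal partitioning. The algorithm keeps the currently needed part of the working set in the cache and evicts useless blocks. The paper also proposes BLRU-F, a frequency-based replacement algorithm under block-granularity dynamic partitioning. Compared with conventional LRU, ALRU-F reduces the miss rate by 26.59% and improves IPC (instructions per clock) by 13.59%. BLRU-F, which builds on ALRU-F, improves performance further over conventional LRU: the miss rate drops by 33.72% and IPC rises by 16.59%. Both algorithms improve performance without noticeably increasing energy consumption.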
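To make the idea of equal partitioning combined with frequency-based eviction concrete, here is a minimal sketch of a single shared cache set. It is not the ALRU-F algorithm itself, whose details the abstract does not give: the class `PartitionedSet`, the fixed per-core way quota, and the "lowest frequency, then least recent" victim rule are assumptions for illustration.

```python
class PartitionedSet:
    """One cache set shared by several cores, split into equal per-core
    way quotas. Victims are chosen only among the requesting core's own
    blocks, by lowest access frequency and then by least recent use."""

    def __init__(self, ways, cores):
        self.quota = ways // cores
        self.blocks = {core: {} for core in range(cores)}  # core -> {tag: [freq, last_use]}
        self.clock = 0

    def access(self, core, tag):
        """Return True on a hit and False on a miss."""
        self.clock += 1
        mine = self.blocks[core]
        if tag in mine:
            mine[tag][0] += 1
            mine[tag][1] = self.clock
            return True
        if len(mine) >= self.quota:
            victim = min(mine, key=lambda t: (mine[t][0], mine[t][1]))
            del mine[victim]
        mine[tag] = [1, self.clock]
        return False
```

Because each core evicts only from its own quota, a streaming core cannot push out another core's frequently reused blocks, which is the isolation effect that partitioning aims for.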
Two replacement algorithms based on access frequency were developed for the characteristics of VOD: LFRU (least frequency and recently used) and PLFU (period least frequency used). Both try to keep frequently accessed video data in the cache. LFRU combines the access frequency and access time of the data and adapts to some extent to changes in the access pattern. PLFU uses a periodic method and a prediction method to solve the cache "pollution" problem of the LFU algorithm.
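The two ideas named here, combining frequency with recency and limiting the lifetime of old counts, can be sketched as follows. This is an illustrative stand-in rather than the LFRU or PLFU algorithms from the paper: the class name, the tie-breaking rule, and the halving of all counts every `period` accesses are assumptions, and the prediction method mentioned in the abstract is not modeled.

```python
class FrequencyRecencyCache:
    """Evicts the item with the lowest access count, breaking ties by the
    oldest last access. If `period` is set, all counts are halved every
    `period` accesses so that stale popularity fades (an assumed aging rule)."""

    def __init__(self, capacity, period=None):
        self.capacity = capacity
        self.period = period
        self.freq = {}   # key -> access count
        self.last = {}   # key -> logical time of last access
        self.clock = 0

    def access(self, key):
        """Return True on a hit and False on a miss."""
        self.clock += 1
        hit = key in self.freq
        if not hit and len(self.freq) >= self.capacity:
            victim = min(self.freq, key=lambda k: (self.freq[k], self.last[k]))
            del self.freq[victim]
            del self.last[victim]
        self.freq[key] = self.freq.get(key, 0) + 1
        self.last[key] = self.clock
        if self.period and self.clock % self.period == 0:
            for k in self.freq:
                self.freq[k] //= 2  # age the counts at the end of each period
        return hit


def run(cache, trace):
    for key in trace:
        cache.access(key)
    return cache


trace = ["a"] * 6 + ["b"] * 3 + ["c"]
plain = run(FrequencyRecencyCache(capacity=2), trace)
aged = run(FrequencyRecencyCache(capacity=2, period=3), trace)
```

On this trace the cache without aging keeps the formerly popular item "a" and evicts the currently active "b", which is the pollution effect attributed to LFU; with aging, the stale "a" is evicted instead.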