Copyright Rules for Machine Learning: Historical Implications and Contemporary Solution (机器学习的版权规则:历史启示与当代方案)

Cited by: 61


Abstract: In the age of artificial intelligence, works are high-quality data resources for machine learning. How to choose copyright rules for machine learning so as to promote innovation in both culture and technology is an important question today. The copyright history of temporary copying and the player piano suggests that fair use is not the only institutional option for resolving machine learning copyright disputes; nonuse (not using works as works) and infringement liability complement it, and copyright rules for machine learning should be graduated on the basis of a categorical analysis. Specifically, machine learning divides into "non-expressive" and "expressive" types. The former does not use works as works and incurs no infringement liability. The latter falls within the scope of exclusive rights and is presumed infringing: if it learns public expression, fair use should exempt it from liability while allowing rights holders to reserve their rights; if it imitates an individual author, unlicensed use should incur infringement liability; and if it serves scientific research, it should be treated as fair use and exempted from liability. China should make the use of works as works one of the elements of copyright infringement, provide for public-expression machine learning as a fair use case subject to a proviso, and impose an obligation to disclose copyright information about algorithm training data.

Copyright disputes over machine learning (ML) are not new, but they are becoming increasingly fierce. They arise frequently because, in the era of artificial intelligence (AI), works are high-quality data resources with algorithm-training value, and ML's use of works, including but not limited to copying and assembling, may infringe copyright. How to choose copyright rules for ML so as to promote innovation in literature, art and AI is a pressing issue. The contemporary choice of ML copyright rules both draws on the past and shapes the future. Currently, most scholars advocate applying the fair use rule to ML copyright problems, but the copyright disputes over the player piano and temporary copying remind us that fair use is not the only way to resolve ML copyright disputes, and two questions deserve further reflection. First, do all ML activities fall within the scope of copyright and therefore require fair use as a defense? Second, can all ML activities that do fall within the scope of copyright satisfy the "three-step test" and therefore be judged fair use and exempted from infringement liability? The answer to both is no. Fair use is effective in some ML copyright disputes but not in others. In addition to fair use, solutions such as nonuse (not using works as works) and tort liability (the opposite of which is licensed use) should also be considered.

ML has many types and applications, including face recognition, painting, writing and many other scenarios. Different kinds of ML use works in different ways and affect the interests of copyright holders differently. It is therefore inappropriate to treat all ML as fair use; graduated copyright rules should be designed for ML on the basis of a categorical analysis. Specifically, based on whether it outputs expressive content, ML is divided into "non-expressive" and "expressive" types. The former does not use works as works (nonuse), so it falls outside the scope of copyright and incurs no tort liability. The latter uses works as works, so it falls within the scope of copyright and is presumed prima facie infringing: ML aimed at learning public expression should be exempted from liability under the fair use rule, subject to an "opt-out" mechanism; ML aimed at imitating individual authors should be liable for unauthorized use of works; and ML aimed at non-profit scientific research should be judged research-oriented fair use and exempted from liability. China should treat "the use of works as works" as one of the conditions for establishing copyright infringement, provide for the use of works by non-individually expressive ML as a fair use case with a proviso, and impose an obligation to disclose copyright information about ML training data.

Author: Li An (李安)

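Affiliation: Center for Studies of Intellectual Property Rights, Zhongnan University of Economics and Law (中南财经政法大学知识产权研究中心)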

Source: Global Law Review (《环球法律评论》), 2023, No. 6, pp. 97-113 (17 pages). Indexed in CSSCI and the Peking University Core Journals list.

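Funding: Research output of the 2021 National Social Science Fund of China Youth Project "Research on the Copyright Fair Use System for Text and Data Mining" (21CFX081).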

Keywords: artificial intelligence; ChatGPT; text and data mining; fair use; nonuse (not using works as works)

Classification: D923.41 [Politics and Law: Civil and Commercial Law]; D997.1 [Politics and Law: International Law]

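About the author: Li An is a lecturer at the Center for Studies of Intellectual Property Rights, Zhongnan University of Economics and Law.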