Remote Sensing Technology and Application  2016, Vol. 31 Issue (4): 748-755    DOI: 10.11873/j.issn.1004-0323.2016.4.0748
Data and Image Processing
Influence on the Accuracies of Parametric and Non-parametric Classifiers from Sample Characteristics
Zhu Shuang1,2, Zhang Jinshui2
(1. Beijing Polytechnic College, Beijing 100042, China;
2. College of Resources Science and Technology/State Key Laboratory of Earth Surface Processes and Resource Ecology, Beijing Normal University, Beijing 100875, China)
Abstract:

Assessing how training sample size affects classification accuracy is important for both parametric and non-parametric classifiers. The theoretical training sample size (10~30 p, where p denotes the number of bands in the remote sensing image) is widely used as a criterion for training sample selection. However, parametric and non-parametric classifiers rest on different principles, so the theoretical sample size may not be universally suitable for all classifiers. This paper analyzes classification accuracy under different training sample sizes, along with the influence of sample characteristics (spectral statistics and spatial distribution), using two typical and popular classifiers: maximum likelihood classification (MLC) as the parametric classifier and support vector machines (SVM) as the non-parametric classifier. The results show that the accuracies of both MLC and SVM improve and then stabilize as the number of samples increases. The accuracy of MLC increases faster than that of SVM, because additional samples give MLC more informative training samples that describe the land cover classes, whereas for SVM the informative samples are the edge pixels of the land cover feature space, and their number grows little. MLC is sensitive to the training sample size: its accuracy fluctuates noticeably with 5 training samples and becomes stable with more than 30. SVM, by contrast, achieves accuracy that is higher and more stable than that of MLC with few samples, even with only 5, which indicates that SVM is suitable for small training sets and is not bound by the theoretical training sample size. When more than 30 training samples (the minimum theoretical sample size) are used, MLC achieves higher accuracy than SVM. Under this condition the training set describes the normal spectral feature space required by MLC, while samples selected randomly from the training sample collection do not contain enough informative pixels to construct the support vectors on which SVM is based. Because the classifiers rest on different principles, training sample size influences their land cover classification accuracy differently, and the theoretical training sample size is not the sole criterion for determining training sample size. These results provide an experimental basis for further work on optimizing training sample selection according to each classifier's principle.
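The sample-size effect described in the abstract can be explored with a small simulation. The following is a minimal sketch, not the authors' experiment: it assumes two synthetic land cover classes in a two-band feature space, equal class priors, and a plain Gaussian maximum likelihood classifier written from scratch. The function names (`theoretical_sample_range`, `fit_gaussian`, `accuracy_spread`, and so on) and the class means are hypothetical, chosen only for illustration. It repeats the training draw several times per sample size and reports the mean and standard deviation of accuracy, which is one way to look at how stable MLC is with few samples. SVM is left out to keep the sketch dependency-free.

```python
import math
import random
import statistics

def theoretical_sample_range(p):
    """Return the 10p..30p rule-of-thumb range for an image with p bands."""
    return 10 * p, 30 * p

def fit_gaussian(samples):
    """Estimate the mean vector and 2x2 covariance from two-band samples."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def log_likelihood(x, model):
    """Gaussian log-likelihood up to a constant shared by all classes."""
    (mx, my), (sxx, sxy, syy) = model
    det = max(sxx * syy - sxy * sxy, 1e-12)  # guard against degenerate fits
    dx, dy = x[0] - mx, x[1] - my
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
    return -0.5 * (math.log(det) + maha)

def draw(rng, mean, n):
    """Draw n synthetic pixels around a class mean (unit variance per band)."""
    return [(rng.gauss(mean[0], 1.0), rng.gauss(mean[1], 1.0)) for _ in range(n)]

def mlc_accuracy(rng, n_train, n_test=200):
    """Train MLC on n_train samples per class and score it on fresh pixels."""
    means = [(0.0, 0.0), (2.0, 2.0)]  # assumed class centres, illustration only
    models = [fit_gaussian(draw(rng, m, n_train)) for m in means]
    correct = 0
    for label, m in enumerate(means):
        for x in draw(rng, m, n_test):
            scores = [log_likelihood(x, model) for model in models]
            correct += scores.index(max(scores)) == label
    return correct / (len(means) * n_test)

def accuracy_spread(n_train, repeats=30, seed=0):
    """Mean and standard deviation of accuracy over repeated training draws."""
    rng = random.Random(seed)
    accs = [mlc_accuracy(rng, n_train) for _ in range(repeats)]
    return statistics.mean(accs), statistics.stdev(accs)

if __name__ == "__main__":
    print("rule-of-thumb range for 6 bands:", theoretical_sample_range(6))
    for n in (5, 10, 30, 60):
        mean_acc, sd_acc = accuracy_spread(n)
        print(n, round(mean_acc, 3), round(sd_acc, 3))
```

Comparing the standard deviation across the sample sizes gives a rough picture of stability. Real imagery, more bands, and overlapping classes would change the numbers, so the output should be read as an illustration of the procedure and not as a reproduction of the paper's results.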

Key words: Sample characteristic; Classification accuracy; Spectral Discrete Overlap Degree (SDOD); Maximum Likelihood Classification (MLC); Support Vector Machine (SVM)
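Received: 2015-03-06    Published: 2016-10-14
CLC number: TP 79
Funding:

Supported by the National Natural Science Foundation of China, Young Scientists Fund (41301444); the Beijing Municipal Education Commission "Young Talents Program" for Beijing higher education institutions; the Beijing Polytechnic College general research project (bgzyky201518); and the National Major Science and Technology Project for the High-Resolution Earth Observation System.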

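Corresponding author: Zhang Jinshui (born 1978), male, from Cangzhou, Hebei; Ph.D., associate professor; research focuses on remote sensing of resources and the environment. Email: zhangjs@bnu.edu.cn.
About the author: Zhu Shuang (born 1981), female, from Jinchang, Gansu; Ph.D., lecturer; research focuses on remote sensing of resources and the environment. Email: zhushuang@mail.bnu.edu.cn.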
Cite this article:

Zhu Shuang, Zhang Jinshui. Influence on the Accuracies of Parametric and Non-parametric Classifiers from Sample Characteristics. Remote Sensing Technology and Application, 2016, 31(4): 748-755.

Link to this article:

http://www.rsta.ac.cn/CN/10.11873/j.issn.1004-0323.2016.4.0748
http://www.rsta.ac.cn/CN/Y2016/V31/I4/748

