Classification of the Level of Flight Delay Based on a VMD-MD-Clustering Method
-
Abstract: As the number of flights grows year by year, flight delays are becoming more frequent. This paper studies a method for classifying flight delays by level, providing a theoretical basis for targeted measures that reduce the losses caused by delays. Six evaluation indicators covering time, space, and efficiency are used to build the classification model: four numerical indicators (delay time, flight time, number of people affected by the delay, and flight distance) and two categorical indicators (whether the flight makes a stopover, and the passenger capacity of the delayed aircraft). A classification method is proposed that combines variational mode decomposition (VMD), the Mahalanobis depth (MD) function, and K-means clustering, referred to as the VMD-MD-Clustering (V-M-C) method. The method treats the non-normal, non-stationary multi-dimensional delay data as a noisy signal sequence. First, VMD denoises the data to obtain normal, stationary multi-dimensional signal data. Second, the MD function reduces these data to a stationary one-dimensional (1D) signal. Third, K-means clusters the 1D data and outputs the delay level of each flight. To assess the classification accuracy, a support vector machine (SVM) with penalty weights is applied to the classification results, which also improves the generality of the V-M-C method to some extent.
Flight operation data collected over one month at a large hub airport are used for validation. K-means alone classifies delay levels with an accuracy of 81.9%, whereas the V-M-C method reaches 95.41%. The results show that the V-M-C method classifies more accurately and can help airports prepare contingency plans for each delay level, supporting the overall punctuality of flight operations.
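As a rough illustration of the last two stages described in the abstract, the sketch below computes a Mahalanobis depth for each flight and clusters the resulting 1D values with K-means. It is a minimal sketch under stated assumptions, not the authors' implementation: the VMD denoising stage is omitted, the depth uses the common form MD(x) = 1 / (1 + (x - mu)^T S^-1 (x - mu)) with the sample mean and covariance, the number of clusters k = 3 is chosen arbitrarily, and all function names are hypothetical. The sample rows follow the layout of Table 2 (numerical columns only).

```python
def mean_vector(rows):
    """Column-wise mean of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance(rows, mu):
    """Sample covariance matrix (divides by n - 1)."""
    n, p = len(rows), len(mu)
    cov = [[0.0] * p for _ in range(p)]
    for x in rows:
        d = [xi - mi for xi, mi in zip(x, mu)]
        for i in range(p):
            for j in range(p):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    p = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(p):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [m[i][p] / m[i][i] for i in range(p)]


def mahalanobis_depth(rows):
    """Map each multi-dimensional row to a scalar depth in (0, 1]."""
    mu = mean_vector(rows)
    cov = covariance(rows, mu)
    depths = []
    for x in rows:
        d = [xi - mi for xi, mi in zip(x, mu)]
        d2 = sum(di * yi for di, yi in zip(d, solve(cov, d)))
        depths.append(1.0 / (1.0 + d2))
    return depths


def kmeans_1d(values, k, iters=100):
    """Plain K-means on scalars with a deterministic, quantile-based start."""
    ordered = sorted(values)
    n = len(ordered)
    centers = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]

    def assign(cs):
        return [min(range(k), key=lambda c: abs(v - cs[c])) for v in values]

    for _ in range(iters):
        labels = assign(centers)
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centers[c])
        if updated == centers:
            break
        centers = updated
    return assign(centers), centers


# Columns: delay time (min), flight distance (km), flight time (min), passengers.
flights = [
    [8, 915, 120, 159],
    [110, 850, 190, 128],
    [1, 9130, 715, 301],
    [299, 1481, 195, 65],
    [19, 2493, 245, 159],
    [7, 1111, 105, 170],
]

depths = mahalanobis_depth(flights)   # VMD denoising would precede this step
labels, centers = kmeans_1d(depths, k=3)  # k = 3 is an assumption
print(depths)
print(labels, centers)
```

A larger depth means the flight lies closer to the centre of the data cloud. How cluster indices map to named delay levels is not specified here and would have to be decided separately.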
-
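Table 1. Evaluation indicators for flight delay level

| Attribute type | Evaluation indicator | Description |
| --- | --- | --- |
| Numerical | Flight delay time $T$ (min): $T = T_{\mathrm{ata}f} - T_{\mathrm{eta}f}$, where $T_{\mathrm{ata}f}$ is the actual departure time of flight $f$ and $T_{\mathrm{eta}f}$ is its estimated departure time | The longer the delay, the higher the delay cost, the greater the impact on subsequent flights, and the wider the knock-on delay. Airports, airlines, and passengers all incur direct economic losses. |
| Numerical | Flight time $T_f$ (min): $T_f = T_{\mathrm{end}f} - T_{\mathrm{ata}f}$, where $T_{\mathrm{end}f}$ is the time at which flight $f$ arrives at the destination airport | The longer the flight time, the higher the fuel cost. For a delayed flight, a longer flight time raises the probability of factors that affect flight safety and actual flight duration, and may force an overnight stay, increasing the delay cost. Flight time reflects the severity of the delay from the airline's perspective. |
| Numerical | Number of people affected by the delay $N$: the actual passenger load of flight $f$ | The more passengers affected, the greater their economic loss, which can be described as delay time taken out of their normal working time. This indicator evaluates the delay from the passengers' perspective. |
| Numerical | Flight distance $d$ (km): the distance flown by flight $f$ from the departure airport to the destination airport | The longer the distance, the more control sectors the flight crosses, the more control handovers are needed, and the more controllers are affected. Flight distance evaluates the delay from the controllers' perspective. |
| Categorical | Stopover: whether flight $f$ makes a stopover (1 = stopover, 0 = no stopover) | Whether a delayed flight makes a stopover reflects the number of airports affected: a delayed stopover flight affects the stopover airport as well as the destination airport. This indicator evaluates the delay from the airports' perspective. |
| Categorical | Aircraft type $g$ of the delayed flight, in four classes by the aircraft operating the flight: g1, small, fewer than 100 seats; g2, medium, at least 100 and fewer than 200 seats; g3, large, at least 200 and fewer than 400 seats; g4, very large, 400 seats or more | The larger the aircraft, the larger the required wake turbulence separation, the harder it is to recover flight operations, and the greater the economic loss from ground and airborne holding. This indicator evaluates the delay from the perspective of holding losses. |
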
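Table 2. Flight delay experimental data (excerpt)

| Delay time /min | Flight distance /km | Flight time /min | Stopover | Aircraft type | Passengers |
| --- | --- | --- | --- | --- | --- |
| 8 | 915 | 120 | 0 | g3 | 159 |
| 110 | 850 | 190 | 1 | g2 | 128 |
| 1 | 9130 | 715 | 0 | g3 | 301 |
| 299 | 1481 | 195 | 0 | g1 | 65 |
| 19 | 2493 | 245 | 0 | g2 | 159 |
| $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ | $ \vdots $ |
| 7 | 1111 | 105 | 0 | g2 | 170 |
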
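Table 3. Evaluation indicator weights

| Evaluation indicator | Weight |
| --- | --- |
| Delay time | 0.5 |
| Flight distance | 0.1 |
| Flight time | 0.1 |
| Number of passengers | 0.3 |

-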
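The indicator definitions in Table 1 can be sketched in a few lines. This is an illustrative sketch only: the function names and the example timestamps are hypothetical, flight time is taken as arrival time minus actual departure time so that it comes out positive, and the seat thresholds are those listed for classes g1 to g4.

```python
from datetime import datetime


def minutes_between(start, end):
    """Elapsed minutes from start to end."""
    return (end - start).total_seconds() / 60.0


def delay_time(estimated_departure, actual_departure):
    """Delay time T: actual departure minus estimated departure, in minutes."""
    return minutes_between(estimated_departure, actual_departure)


def flight_time(actual_departure, arrival):
    """Flight time T_f: arrival at destination minus actual departure, in minutes."""
    return minutes_between(actual_departure, arrival)


def aircraft_class(seats):
    """Map a seat count to the aircraft-type classes g1..g4 of Table 1."""
    if seats < 100:
        return "g1"
    if seats < 200:
        return "g2"
    if seats < 400:
        return "g3"
    return "g4"


# Hypothetical timestamps chosen to give an 8-minute delay and a 120-minute flight.
sched = datetime(2022, 1, 1, 8, 0)
actual = datetime(2022, 1, 1, 8, 8)
arrival = datetime(2022, 1, 1, 10, 8)

print(delay_time(sched, actual))     # 8.0
print(flight_time(actual, arrival))  # 120.0
print(aircraft_class(220))           # g3
```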