Abstract:
Background Surface flashover in SF6 under nanosecond pulses involves complex physical processes, and accurately predicting the surface flashover voltage of insulating media in such environments is a critical challenge for the design of high-voltage pulsed power equipment and the evaluation of insulation reliability. Unlike traditional AC or DC voltages, nanosecond pulses have an extremely short rise time and a high amplitude, which lead to significant space charge effects and distinct discharge development mechanisms and thereby pose severe challenges to prediction models based on classical theories. In recent years, with the rapid growth of computing power and breakthroughs in artificial intelligence algorithms, data-driven machine learning methods have shown great potential for solving complex nonlinear insulation problems.
Purpose To address this challenge under nanosecond pulses, this paper selects four algorithms, namely support vector machine (SVM), multi-layer perceptron (MLP), random forest (RF), and extreme gradient boosting (XGBoost), and trains them to predict flashover voltage under different experimental conditions within the multi-scale distance range of 15 mm to 500 mm.
Methods First, external operating conditions such as electric field distribution, voltage waveform, and gas pressure were parameterized and extracted as features. The Pearson correlation coefficient was used to carry out a correlation analysis of these feature parameters, and 22 features were ultimately selected as the model inputs. Bayesian hyperparameter optimization was then applied to each of the four algorithms, with 10-fold cross-validation used to select the optimal hyperparameter combination for each algorithm. Finally, the four algorithms were trained on the training set, and each was evaluated on the test set.
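As an illustration only, the Pearson screening and 10-fold splitting steps could be sketched as follows. The abstract does not state the screening criterion, so the greedy rule, the 0.95 threshold, and the choice to compare features with each other (rather than with the flashover voltage) are assumptions. The function names, feature names, and toy values are hypothetical, and only the Python standard library is used.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def screen_features(columns, threshold=0.95):
    """Greedily keep features whose |r| with every kept feature is <= threshold.

    `columns` maps a feature name to its list of values. The threshold and the
    greedy rule are illustrative assumptions, not the paper's criterion.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, val_idx) pairs for shuffled k-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k nearly equal, disjoint folds
    for i in range(k):
        val = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, val


# Hypothetical toy data: names and values are placeholders, not measurements.
features = {
    "pressure": [0.1, 0.2, 0.3, 0.4, 0.5],
    "pressure_kpa": [100.0, 200.0, 300.0, 400.0, 500.0],  # redundant rescaled copy
    "rise_time": [5.0, 3.0, 8.0, 2.0, 6.0],
}
selected = screen_features(features)
splits = list(k_fold_indices(20, k=10))
```

In a hyperparameter search, each candidate configuration would be scored by averaging its validation error over the ten splits, and the configuration with the best average would be retained.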
Results The four algorithms performed well overall. Among them, random forest (RF) and XGBoost performed excellently on the training set but poorly on the validation set, which is likely a manifestation of overfitting in ensemble learning and indicates weak generalization ability. SVM performed well on both the training set and the validation set. Furthermore, the generalization performance of the SVM and XGBoost algorithms was checked using data outside the sample dataset, and SVM yielded the better predictions on these out-of-sample data.
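The overfitting diagnosis above rests on comparing training error with validation error. A minimal sketch of that comparison follows, assuming mean absolute percentage error (MAPE) as the metric; the abstract does not name the metric used, and the function names and voltage values below are hypothetical.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent."""
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


def generalization_gap(train_true, train_pred, val_true, val_pred):
    """Return (training error, validation error, gap), all as MAPE in percent."""
    e_train = mape(train_true, train_pred)
    e_val = mape(val_true, val_pred)
    return e_train, e_val, e_val - e_train


# Hypothetical flashover voltages in kV (placeholders, not measured data).
train_true = [100.0, 200.0, 400.0]
train_pred = [101.0, 198.0, 404.0]
val_true = [150.0, 300.0]
val_pred = [180.0, 240.0]

e_train, e_val, gap = generalization_gap(train_true, train_pred, val_true, val_pred)
```

A small training error paired with a much larger validation error, as in this toy case, is the pattern that suggests overfitting.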
Conclusions SVM achieved high prediction accuracy on the training set, the test set, and the out-of-sample data, making it the more suitable choice for the insulation design of electromagnetic pulse simulation devices.