Abstract:
A local outlier factor algorithm based on the GeoHash approach (GeoHash-LOF) was proposed to obtain comprehensive and reliable environmental monitoring data. Compared with the traditional LOF algorithm, GeoHash-LOF introduced address partitioning and region encoding, which significantly reduced computational overhead. Identified outliers were repaired using a genetic-algorithm-improved grey model (GA-GM) prediction technique, in which the background value and initial value of the grey prediction model were optimized to improve prediction accuracy. Taking data provided by the European Nuclear Energy Agency (ENEA) as an example, the proposed GeoHash-LOF algorithm and GA-GM technique were compared with other algorithms. The results demonstrated that the proposed algorithms identified anomalous data more efficiently and achieved a better fit in restoring missing data.
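
The region-encoding idea can be illustrated with a minimal sketch. It assumes the standard GeoHash scheme (longitude and latitude bits interleaved, five bits per base-32 character). The function names `geohash_encode` and `group_by_cell` and the choice of `precision=5` are hypothetical and used only for illustration; the partitioning used in the paper may differ.

```python
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard GeoHash alphabet


def geohash_encode(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a standard GeoHash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0          # accumulates the 5 bits of the current character
    bit_count = 0
    use_lon = True    # GeoHash interleaves bits, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


def group_by_cell(points, precision=5):
    """Partition (lat, lon) points into GeoHash cells (hypothetical helper)."""
    cells = defaultdict(list)
    for lat, lon in points:
        cells[geohash_encode(lat, lon, precision)].append((lat, lon))
    return dict(cells)


# Illustrative coordinates only; not taken from the ENEA data.
cells = group_by_cell([(57.64911, 10.40744), (57.64912, 10.40745), (0.0, 0.0)])
```

Under this assumption, a neighbour search for LOF could be limited to points that share a cell code (and possibly adjacent cells) rather than the whole data set, which is one plausible source of the reduced computational overhead described above.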