Modelling and Evaluating Air Quality with Fuzzy Logic Algorithm-Ankara-Cebeci Sample

Air is one of the most important sources of life for all living things. Gases that are normally present in clean air, as well as gases that are absent from it, can act as pollutants in the atmosphere. When pollutants rise above a certain concentration level, air pollution occurs. Air pollution harms all living things, and human health in particular. Accurate estimation of pollutant concentrations through air pollution modelling plays an important role in reducing the adverse effects of air pollution and in taking the necessary precautions. Conventional statistical models are widely used in air pollution forecasting and modelling. As a different approach, this study uses the fuzzy logic algorithm (FLA), which has been increasingly successful in many fields of application, to model air quality, and air pollution analyses are carried out based on this model. Data from Ankara province are used as the sample of the research.


Introduction
Air is a gas mixture that contains constant gases such as nitrogen and oxygen, variable gases such as carbon dioxide, ozone and water vapor, and pollutants such as sulfur dioxide, carbon monoxide, nitrogen oxides and particulate matter. Air is one of the indispensable sources of life for all living beings. Air pollution occurs when pollutants rise above a certain concentration in the air. Increases in the amount of pollutants in the air are caused by natural sources such as forest fires and volcanic eruptions, and by artificial sources such as heating devices, industrial establishments and vehicles [1-3]. Urbanization and industrialization have increased with growing population density and developing technology worldwide. As a result, air pollution has a significant impact on the global scale as well as on the local and regional scales. It has become a threat to all living things, especially to human health. Determining strategies and methods to improve air quality and prevent air pollution is therefore an urgent necessity. Public authorities need the support of a well-informed community to succeed in their strategies [4]. Public authorities around the world present information on regional air quality and its health effects via the Air Quality Index (AQI). The index provides a practical flow of information by transforming air pollution data into an understandable format, so that appropriate measures can be taken to protect public health [5]. The AQI performs this function by dividing air quality, according to the measured pollutants, into classes such as good, moderate, unhealthy and hazardous, each covering a different value interval, and by assigning a colour to each class. The AQI is defined according to each country's own methods and criteria. Accurately estimating pollutant concentrations by modelling air quality plays an important role in reducing the adverse effects of pollution and in taking the necessary precautions.
In air quality estimation and modelling, traditional statistical models such as regression analysis, cluster analysis and analysis of variance are widely used. However, these models do not achieve the desired success in determining air quality, because of the complex and nonlinear relationships between pollutants and the models' inability to handle large data sets [6]. In this study, an AQI model is designed to measure air quality status with the FLA, which is known to produce successful results in solving problems with a nonlinear structure. The rest of the study is organized as follows: Section 2 presents a literature review on FLA-based air quality assessment. Section 3 describes in detail the modelling of air quality with the FLA using the pollutant data of the Cebeci region of Ankara Province. Section 4 presents the results and evaluation of the proposed model. The last section gives our conclusions.

Related Works on FLA-Based Air Quality
Studies on air quality over the last 10 years have shown that statistical and analytical methods cannot achieve the desired performance in modelling and estimating air quality. The nonlinear relationship between pollutants and air quality, and the inadequacy of these methods when working with large data sets, are among the main reasons for this failure. Researchers therefore needed new methods that can assess contradictory observations sensitively and capture nonlinear relations in environmental issues such as air quality assessment. The FLA, proposed by Zadeh in 1965, is one of the well-suited candidates for solving environmental problems. The first important application of the FLA to environmental decision-making and imaging problems was made by Silvert in 2000. In that study, Silvert applied the FLA to environmental conditions and showed that it provides a powerful and useful formalism for classifying environmental conditions and describing anthropogenic changes [7]. The studies conducted in this area in the following years are presented in chronological order below. In 2002, Pokrovsky, Kwok and Ng applied fuzzy sets and artificial neural networks to air pollution parameters such as wind speed, wind direction, solar radiation, air temperature, and O3, NO2, NO and NOy in order to characterize urban air pollution [8]. In 2011, Sowlat, Gharibi, Yunesian, Mahmoudi and Lotfi developed an FLA-based AQI. To determine the index, they used air pollutants such as CO, SO2, PM10, O3 and NO2, together with other substances known to have adverse effects on human health, such as benzene, toluene, ethylbenzene and xylene. Using air quality data from five monitoring stations in Tehran from January 2008 to December 2009, they showed that the FLA can successfully determine the AQI [9].
Another study using the FLA to determine air quality was conducted by Mandal, Gorai and Pathak in 2012. In this study, the AQI was modelled with the FLA as in the previous study, and the results were compared with the traditional analytical hierarchy approach [10]. In the same year, Assimakopoulos, Dounis, Spanou and Santamouris evaluated the AQI for indoor environments with the FLA according to CO2, PM10, PM2.5 and PM1 pollutants and passenger counts measured in the Athens metro [11]. In 2014, Milla dos Santos et al. conducted a study on health risk assessment that took into account the maximum and average concentrations of the pollutant parameters of the city of São Paulo (SO2, NO2, O3, CO and PM10). They used both the FLA and a geographic information systems algorithm, and showed that the two can produce useful results in this area when used together [12]. In 2015, Dursun, Kunt and Taylan used the FLA to show the influence of some environmental conditions on air quality and SO2 pollution level. SO2 concentrations and meteorological data were collected from 15 sites in Konya City for the estimation of air quality. The results of the model were evaluated with fuzzy quality schedules and compared with the results obtained from the Turkish Environmental Protection Association air quality standards [13].

Modeling Air Quality with FLA
The modeling of the air quality is carried out using air quality sampling data obtained from the web site of the Ministry of Environment and Urban Planning (EUP Ministry) in Ankara. The data set covers January to December of the years 2010-2016, as monitored in the Cebeci area [14]. It contains sulfur dioxide (SO2), nitrogen dioxide (NO2), carbon monoxide (CO), ozone (O3) and particulate matter (PM10) as pollutants. The model used to determine the air quality is built on the data for these pollutants. For monitoring data above the boundary values, the maximum values determined by the EUP Ministry, shown in Table 1, are used. In addition, missing data are replaced by the average values of the corresponding pollutants so that the model can produce more reliable results. In the first step of creating the FLA-based model, the input and output variables of the system are determined. In general, air quality is expressed by the AQI, and the value of this parameter is decided according to the pollutants in the air. Therefore, the SO2, NO2, CO, O3 and PM10 pollutants are selected as the input variables and the AQI is selected as the output variable. In the second step of modelling, the boundary values of the universal sets for the selected input and output variables are determined, and the fuzzy sets to be used in the fuzzification process are defined in these universes. The boundary values of the universal sets are taken from the boundary values given in Table 1. Each input variable is represented by three fuzzy sets in its universe: "LOW", "MODERATE" and "HIGH". Both trapezoidal and triangular membership functions are used for these fuzzy sets.
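The preprocessing and fuzzification steps described above can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the order of capping and mean imputation, and the breakpoints in HYPOTHETICAL_SETS are assumptions made for the example, since the actual boundary values are those of Tables 1 and 2.

```python
from statistics import mean


def preprocess(readings, upper_bound):
    """Cap readings at the universe's upper bound, then fill gaps with the mean.

    Missing readings are given as None. Capping before averaging is an
    assumption; the text does not state the order of the two steps.
    """
    capped = [None if r is None else min(r, upper_bound) for r in readings]
    observed = [r for r in capped if r is not None]
    fill = mean(observed)  # raises StatisticsError if every reading is missing
    return [fill if r is None else r for r in capped]


def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoidal membership with feet at a and d and plateau on [b, c].

    Setting a == b or c == d yields a left or right shoulder.
    """
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


# Hypothetical breakpoints on a 0-100 universe; NOT the values of Table 2.
HYPOTHETICAL_SETS = {
    "LOW": lambda x: trapezoidal(x, 0.0, 0.0, 20.0, 40.0),
    "MODERATE": lambda x: triangular(x, 20.0, 50.0, 80.0),
    "HIGH": lambda x: trapezoidal(x, 60.0, 80.0, 100.0, 100.0),
}


def fuzzify(x):
    """Return the membership degree of x in each fuzzy set."""
    return {name: mu(x) for name, mu in HYPOTHETICAL_SETS.items()}


cleaned = preprocess([10.0, None, 130.0, 30.0], upper_bound=100.0)
degrees = fuzzify(cleaned[3])
```

With these placeholder sets, a reading can belong partly to "LOW" and partly to "MODERATE" at the same time, which is the overlap that the rule base later exploits.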
The boundary values of the "LOW", "MODERATE" and "HIGH" fuzzy sets correspond to the boundary values of the classes defined by the EUP Ministry as "GOOD", "MODERATE-SENSITIVE" and "UNHEALTHY-VERY UNHEALTHY-HAZARDOUS", respectively. Table 2 shows the boundary values of the universal sets and fuzzy sets for the input variables. The membership functions for the fuzzy sets of the input variables are defined according to the boundary values given in Table 2. Fig. 1 and Fig. 2 show the triangular and trapezoidal membership functions defined for the SO2, NO2, CO, O3 and PM10 pollutants, respectively. The AQI, selected as the output variable, is represented by six fuzzy sets: "GOOD", "MODERATE", "SENSITIVE", "UNHEALTHY", "VERY UNHEALTHY" and "DANGEROUS". As with the fuzzy sets defined for the input variables, the boundary values of these sets are decided by considering the value ranges used by the EUP Ministry. Table 3 shows the boundary values of the universal set and fuzzy sets for the output variable. The triangular and trapezoidal membership functions defined for the AQI according to the boundary values given in Table 3 are shown in Fig. 3 and Fig. 4, respectively.
[Table 4]
In the last step of the modeling, the methods to be used in the inference and defuzzification processes are decided. In general, four types of inference strategies are used to compose rules in the inference procedure: max-min composition, max-product composition, max-drastic-product composition and max-bounded-product composition. Among them, the max-min and max-product strategies are widely used in FLA-based applications because of their computational simplicity and efficiency.
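The four composition strategies named above differ only in the operator that combines a rule's firing strength with its consequent membership function. The sketch below illustrates this on a sampled consequent; the function names and sample values are hypothetical, and the operator definitions are the standard textbook ones rather than anything specified in this study.

```python
def t_min(a, b):
    """Minimum operator (max-min composition)."""
    return min(a, b)


def t_product(a, b):
    """Algebraic product (max-product composition)."""
    return a * b


def t_bounded(a, b):
    """Bounded product (max-bounded-product composition)."""
    return max(0.0, a + b - 1.0)


def t_drastic(a, b):
    """Drastic product (max-drastic-product composition)."""
    if a == 1.0:
        return b
    if b == 1.0:
        return a
    return 0.0


def imply(strength, consequent, operator):
    """Shape a sampled consequent membership function by a rule's firing strength."""
    return [operator(strength, mu) for mu in consequent]


def aggregate(rule_outputs):
    """Combine the outputs of all rules pointwise with max (the 'max' in each name)."""
    return [max(values) for values in zip(*rule_outputs)]


# Hypothetical consequent set sampled at five points, fired with strength 0.6.
consequent = [0.0, 0.5, 1.0, 0.5, 0.0]
clipped = imply(0.6, consequent, t_min)      # flat top at the firing strength
scaled = imply(0.6, consequent, t_product)   # shape preserved, height reduced
combined = aggregate([clipped, scaled])
```

The min operator clips the consequent at the firing strength, while the product operator rescales it, so the two strategies hand differently shaped regions to the defuzzification step.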
The difference between the max-product and max-min inference strategies is that the max-product strategy produces a smaller change in the output of the FLA than the max-min strategy, because it determines the effect of each rule on the composition with the product operator. For the defuzzification process, many methods are available, such as the max-membership principle, centroid, weighted average and center of sums. Among them, the centroid method is generally regarded as giving the more accurate result, because it returns the value corresponding to the center of gravity of the shape obtained from inference and therefore reflects every fired rule. Thus, in this study, the max-min inference strategy is used in the inference process and the centroid method is used in the defuzzification process.
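The combination chosen here, max-min inference followed by centroid defuzzification, can be sketched on a sampled output universe as follows. The universe, the two output sets and the firing strengths are hypothetical placeholders and not the values of Table 3, and the discrete sum is only an approximation of the continuous center of gravity.

```python
def triangular(x, a, b, c):
    """Triangular membership with feet at a and c and peak at b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def centroid(xs, mus):
    """Discrete center of gravity: sum(x * mu) / sum(mu)."""
    total = sum(mus)
    if total == 0.0:
        raise ValueError("no rule fired; the centroid is undefined")
    return sum(x * mu for x, mu in zip(xs, mus)) / total


# Hypothetical output universe sampled at integer points from 0 to 100.
xs = [float(x) for x in range(0, 101)]
good = [triangular(x, 0.0, 25.0, 50.0) for x in xs]
moderate = [triangular(x, 25.0, 50.0, 75.0) for x in xs]

# Assumed firing strengths of two rules, one concluding each output set.
strength_good, strength_moderate = 0.7, 0.3

# Max-min inference: clip each consequent with min, combine the rules with max.
aggregated = [
    max(min(strength_good, g), min(strength_moderate, m))
    for g, m in zip(good, moderate)
]

crisp = centroid(xs, aggregated)
```

Because the centroid weighs the whole aggregated shape, the weakly fired rule still shifts the result away from the peak of the dominant set, whereas the max-membership principle would consider only the highest region.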

Results and Discussions
In the FLA-based AQI model presented in Section 3, the input variables with the highest values are effective in determining the AQI value.
[Fig. 5]
Table 5 shows statistical data comparing the hourly AQI results of the FLA-based AQI model with those of the current method used by the EUP Ministry.
When the results are analyzed in general, it is seen that in some ranges the AQI results produced by the FLA-based AQI model are close to those produced by the current model, whereas in other ranges the results differ. The difference between the two models stems from the way each decides the AQI. While the current model takes a sharper approach by considering only the highest pollutant value, the FLA-based AQI model provides a more flexible approach by determining the index with rules based on human experience. When the results of the two FLA-based models are compared, it can be seen that the model with trapezoidal membership functions produces results closer to the current model than the model with triangular membership functions.
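The "sharper" behaviour of the current method can be illustrated with a short sketch in which the pollutant with the highest sub-index alone decides the class. The class limits and function names below are hypothetical placeholders and do not reproduce the EUP Ministry's actual breakpoints or sub-index formula.

```python
from bisect import bisect_left

# Class names follow the six output sets of the model.
CLASSES = ["GOOD", "MODERATE", "SENSITIVE", "UNHEALTHY", "VERY UNHEALTHY", "DANGEROUS"]

# Hypothetical upper limits of the first five classes on the index scale.
UPPER_LIMITS = [50, 100, 150, 200, 300]


def classify(index_value):
    """Map an index value to its class using crisp, non-overlapping intervals."""
    return CLASSES[bisect_left(UPPER_LIMITS, index_value)]


def crisp_aqi(sub_indices):
    """Return (index, class, deciding pollutant) from per-pollutant sub-indices.

    Only the highest sub-index matters; all other pollutants are ignored.
    """
    pollutant = max(sub_indices, key=sub_indices.get)
    value = sub_indices[pollutant]
    return value, classify(value), pollutant


result = crisp_aqi({"SO2": 20, "PM10": 101, "O3": 40})
```

Under this scheme, one unit of change in a single pollutant can move the result across a class boundary, while a fuzzy rule base lets neighbouring classes overlap and lets the remaining pollutants influence the outcome.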

Conclusion
In this study, an FLA-based AQI model was developed that determines air quality from the pollutant data monitored in the Cebeci area of Ankara province between 2010 and 2016, and a number of analyses were carried out with the model to test its performance. Analyses of the FLA-based models built with triangular and trapezoidal membership functions showed that they can successfully determine the air quality in this region. In addition, comparison of the proposed model with the current method used by the EUP Ministry showed that the model with trapezoidal membership functions produces results closer to the current method. In conclusion, these results indicate that the FLA successfully determined the AQI and can therefore contribute to solving the problems encountered in air quality estimation.