CDAM: Conservative data analytical model for dynamic climate information evaluation using intelligent IoT environment—An application perspective


Computer Communications 150 (2020) 177–184

Contents lists available at ScienceDirect

Computer Communications journal homepage: www.elsevier.com/locate/comcom

Jun Ma, Hongzhi Yu ∗, Yan Xu, Kaiying Deng

School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, China

ARTICLE INFO

Keywords: Data correlation, Fuzzy logic, IoT, Sensor data analysis, Weather forecast

ABSTRACT

The emergence of IoT in real-time applications has made sophisticated services available for industrial and residential purposes. Environmental observation and data analysis are now used in many real-time applications to improve the efficiency of forecast and reporting systems. The analysis of observed data largely determines the efficiency of a reporting system, because it must compensate for observation and processing errors. In this article, a conservative data analytical model (CDAM) is presented for real-time data evaluation and remittance. The evaluation model relies on correlation and conditional similarity verification to remit reliable information across the reporting system. The correlation analysis proceeds using fuzzy derivatives to derive possible solutions. The solutions are obtained for sensor updates that vary over uneven time intervals. The proposed CDAM is tested with available real-time weather information for predicting dynamic climatic changes in a given region. The experimental verification shows that CDAM improves correlation accuracy and data analysis rate by reducing the error rate.

1. Introduction

The Internet of things (IoT) paradigm has emerged as a sophisticated platform for interconnected communication among heterogeneous devices, which range from tiny sensors to human-interactive computers and industrial machines. Communicating devices are interconnected through the internet, where technologies such as cloud, grid, and parallel computing are combined to provide seamless and universal service support for end-users [1]. Wireless communication technologies, intelligent computers, applications, services, operating platforms, and infrastructures are extended to serve end-user requests and queries with distributed data resources. The IoT paradigm has found its way into diverse applications including healthcare, engineering, military, habitat monitoring, smart homes, and industrial automation [2]. A recent application of IoT is the integration of sensing units in weather forecasting and dynamic climatic data analysis. Intelligent sensing units for temperature, humidity, rain detection, etc. are deployed across a geographical region to sense climatic information. The sensed information is transmitted by IoT devices, shared through the cloud platform, and processed in a forecast center located away from the sensing region [3]. The day-to-day activities of real-world entities depend on the weather forecasting information that is issued at periodic intervals.

Periodic updates and precise analysis of sensed information are used to predict useful information about climatic changes. Computer- and technology-assisted weather forecasting and data analysis are widely adopted for reducing computational errors and improving the accuracy of prediction [4]. In particular, time-consuming manual processing is reduced through computer-assisted processing. Real-time weather changes and dynamic climatic changes occur at irregular intervals, and reactive measures are becoming more feasible with the introduction of the IoT paradigm. IoT helps to integrate the communications of different sensors at both periodic and non-periodic intervals and to analyze them using different analytical and decision-making models. Most prominently, big data, IoT and cloud are the common technologies that are harmonized for handling climatic information and observed data from multiple sensors [5,6]. Sensor data aggregated from multiple sensing units are integrated into a single entity and broadcast to the forecasting center. The forecasting center employs a wide range of applications, algorithms and processing techniques for analyzing the reliability of the sensed data and extracting useful information [7]. The extracted information helps to forecast the weather and is used for fortifying the region by pre-determining natural disasters. Therefore, the reliability of the extracted information relies on the analytical system or the processing technique employed. Data analysis faces a variety of challenges including failure, complexity, storage constraints, tardiness, etc. [8]. Irrespective

∗ Corresponding author. E-mail address: [email protected] (H. Yu).

https://doi.org/10.1016/j.comcom.2019.11.014 Received 11 September 2019; Received in revised form 25 October 2019; Accepted 11 November 2019 Available online 22 November 2019 0140-3664/© 2019 Elsevier B.V. All rights reserved.

of all the challenges, it is the responsibility of the data analysis model to improve the efficiency of data analysis and aid precise prediction. Therefore, a variety of optimization algorithms and methods such as fuzzy, hybrid and genetic techniques, neural computing, etc. are employed by the forecast center for handling huge volumes of information [9].

The contributions of the article are as follows:

i. Designing a conservative data analysis model for processing sensor data to improve the rate of accuracy irrespective of transmission errors.
ii. Designing a fuzzy solution based analysis model for correlating processed sensor information, along with a similarity check, for improving the rate of data analysis and reducing the error rate.
iii. Performing a comparative analysis with existing similar approaches using different metrics to verify the consistency of the proposed method.

The article is organized as follows. Section 2 introduces the existing works related to the theme of the proposed work. Section 3 discusses the proposed CDAM with its phases and operations in detail, followed by the experimental verification in Section 4. Section 5 concludes the article with the summary and findings of the proposed method.

2. Related works

Santos et al. [10] presented an infrastructure named PortoLivingLab that integrates IoT with city-based sensing for parameters such as environment, weather, density of people and public transport. A flexible crowd sensing platform deployed in a vehicular network, with six hundred vehicles and nineteen sensors, is considered for the simulation attributes. The data is aggregated based on characterization analysis under an urban environment and is stored using spatio-temporal data models. Zhang et al. [11] proposed a multi-dimensional joint prediction model (MDPM) for complex time series data extracted from sensors; based on the data, a multidimensional feature selection model and a dynamic prediction model are designed. Compared with a traditional IoT prediction model, the proposed model achieves an accuracy of around 98% on the Intel Berkeley Lab data and 92% on the Chicago park weather and water data. Calderoni et al. [12] designed a sensor network management framework named IoT manager, developed at the University of Bologna. With this case study, a detailed implementation procedure is provided for the scientific community. The results show that IoT manager can be used as a test bed for research and teaching purposes.

Fowdur et al. [13] implemented an IoT based short-term weather monitoring system on a university campus to provide data at intervals of twenty minutes to an hour. This monitoring system is designed using the multiple linear regression technique (MLRT). Several prediction algorithms, from linear regression to K nearest neighbor, were tested. The prediction algorithms are analyzed on weather parameters such as temperature, humidity, wind direction, rainfall, etc. Dobrescu et al. [14] developed a control and monitoring unit which controls entity execution on an IoT platform, a context-aware control platform with a three-tier architecture that acts as a middleware between environmental sensors and IoT, and a cloud and four-level architecture that monitors and performs agricultural processes. A case study was implemented on the IBM Bluemix IoT platform to control an irrigation system with reference to environmental changes. Ponce and Gutierrez [15] presented an IoT system for predicting the climatic environment, i.e., temperature, based on a supervised learning method named the artificial hydrocarbon networks model. It predicts the temperature at a remote location by comparing it with information obtained from a web service. Experimental results show that the artificial hydrocarbon network model achieves prediction at the remote location with a root mean square error of 2.7 °C; the error in this work is estimated from the variation observed in the temperature.

Jiang et al. [16] studied various mathematical models based on future building blocks with varying weather conditions. According to the study, morphological data is better suited than mathematical data. On analyzing these conditions, an application named WeatherMorph is developed, a file generator that can access weather data for around 2100 locations around the world. An experimental analysis for the future years 2020, 2050 and 2080 has been carried out. Farah et al. [17] suggested a methodology for integrating climate change factors with the available historical weather data, which tends to be more suitable for building simulations. Features of the available time series data are selected based on the maximum and minimum monthly averages, the number of days on which the temperature stays above a predefined threshold, and the number of consecutive times in a day the temperature crosses a predefined threshold. Investigations of the proposed method show that the annual heating thermal energy decreases by 21%–22%, the annual cooling thermal energy increases by 29%–31%, and the combined annual heating and cooling thermal energy decreases by 4%–5%. This shows the need to combine climate change conditions with the available historical data. Moazami et al. [18] designed dynamic and statistical weather datasets for future weather conditions based on Geneva building models, which are practical for energy simulations of 16 ASHRAE standard buildings. Datasets are synthesized for extreme and typical climatic conditions, and their importance for the energy performance of buildings is examined. Compared with existing models, the proposed design provides extensive results for future energy forecasting for buildings.

Guarino et al. [19] proposed a data fusion method as a combined approach that is better suited for morphing existing weather data. The data fusion technique is simulated for the future heating and cooling energy use of a building with consideration of climatic changes, and its robustness is compared with existing techniques. Data aggregation is based on the obtained data and a prediction dataset of heating and cooling. The proposed method thus provides a better estimate of the heating and cooling of buildings. Luo et al. [20] proposed a predictive model based on hybrid K-means clustering and an artificial neural network. Parameters such as wall temperature, indoor air, windows, roofs, etc. are monitored with IoT sensors placed at locations in the building. A weather profile is created by clustering analysis and is used to train an artificial neural network, which improves prediction accuracy. Elsaadany and Khalil [21] examined the quality of coverage metrics in an IoT environment. Algorithms were proposed and tested for the metrics, and their performance was studied. Keswani et al. [22] proposed a water management scheme based on irrigation control in order to locate water-scarce regions. Using data retrieved from environmental sensors, an intensity device, CO2 sensors and temperature sensors, a comparative study was made of optimization techniques for a feed-forward neural network intended for pattern classification of soil moisture content. A fuzzy based weather condition system adjusts the control commands with respect to varying weather conditions.

From the above survey, the rate of analysis of sensor data is specific to the application it is employed for. The sensed information contains transmission errors that must be handled carefully when deciding the results, for better accuracy. Therefore, the analysis process requires time-related observation of information and correlation over specific intervals. Application-specific approaches neglect the rate of evaluation from the utilization perspective, resulting in failures during irregular data analysis. The conservative data analytical model relies on the time factor, irrespective of the application, to validate sensed information promptly. This helps to extract precise information irrespective of the communication lags between the devices and the application.

3. CDAM: Conservative Data Analytical Model

CDAM is designed to analyze IoT sensor data and to mitigate errors in weather forecasts. CDAM follows three phases, namely data observation, correlation analysis and conditional similarity verification. In the data observation phase, a typical IoT sensor environment for sensing temporal climate parameters is deployed. The sensed information is transmitted to the forecast center (FC), where further analysis proceeds. For data analysis, the forecast center employs CDAM and then proceeds with the forecast results. In this article, CDAM is considered as an intelligent data processing model for all the received input data. Fig. 1 describes the process of CDAM in a typical IoT architecture. The sensed information is transmitted using gateways and other IoT layer components. As this is a monitoring IoT platform, the deployed sensors perform periodic updates of the climatic changes. Unlike request/response IoT communication, the information flows one way. The phases of CDAM are described in the following sub-sections.

3.1. Phase 1: Data observation

In the data observation phase, the hardware setup of the data analysis model is presented. This model follows the architecture illustrated in Fig. 1. The architecture is classified into three layers, namely monitoring, IoT-Cloud and weather forecast center. In the monitoring layer, sensing devices for temperature, rain and wind speed (anemometer) are placed for observing climatic changes. The sensed information is transmitted using gateways through the IoT-Cloud layer for further analysis. The raw data is processed in the weather forecast center using CDAM. The CDAM processes are correlation analysis and conditional verification for estimating the reliability and error of the observed data. The communication between the different layers is performed through dedicated gateways that employ different wireless technologies for information sharing. The word layer here indicates the physical components present in the illustrated architecture. The monitoring and cloud layers are the direct functional components in sensing, actuating and transmitting sensor information. In an IoT, data transmission and sensing follow the conventional layer-based operations of the network. The weather forecast center is the processing layer where the entire data is stored, processed, and retrieved for future use. The sensor information is obtained in a periodic and reactive manner depending on the dynamic changes in the climate.

3.2. Phase 2: Correlation analysis

The information from the monitoring sensors varies with time, which makes it temporal. The FC analyzes this temporal data for weather forecasting. The reliability of temporal data analysis depends on the analyzing system achieving better accuracy. In the proposed scheme, fuzzy based correlation analysis with conditional verification is incorporated. The fuzzy analysis operates in both the correlation analysis and the conditional similarity verification phases. It operates in a partial manner over the third phase for framing the conditional rules of analysis. The information from the IoT layer (climate-parameter sensing) is observed from multiple sensors (as mentioned in phase 1). Let $A$ represent the types of data from $N$ sensors; therefore, at any analysis initiating time ($t_a$), the total available data is $(A \times N)$. If $T$ represents the cumulative time interval for observation, $T = \{t_1, t_2, \ldots, t_n\}$ is the segregated time for the $(A \times N)$ available data. The correlation analysis begins with the $(A \times N)$ distribution to improve the rate of analysis. This distribution is performed to analyze the data independently, irrespective of missing or error-prone data from the sensors. The data distribution process follows a Gaussian model, for which the distribution function $D_{(A \times N)}$ is given by Eq. (1):

\[ D_{(A \times N)} = \frac{1}{\sqrt{(2\pi)^{M/2}\,|C|}} \exp\left[ \frac{-(M-\mu)^{T}\,\Sigma\,(M-\mu)}{2} \right] \tag{1} \]

In Eq. (1), $C$, $\mu$ and $M$ represent the co-variance matrix, mean deviation and data feature count respectively. The $(A \times N)$ distribution with $T$ is formatted as a matrix, wherein the distribution with time is given as

\[ A = [a_1, a_2, \ldots, a_n] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \tag{2} \]

Here, $\{a_1, a_2, \ldots, a_n\} \in A$ and $\{m_1, m_2, \ldots, m_n\} \in M$. The representation in Eq. (2) correlates the different types of data with the observed features $M$. The data is represented as a combination of type and feature, and therefore it is analyzed as different variants, i.e., $C_a$ and $C_m$. The estimation of $C_a$ and $C_m$ is given in Eq. (3):

\[ \left. \begin{aligned} C_m &= \frac{1}{T-1} \sum_{i=1}^{T} (a_{mi} - \hat{a}_i)^2 \\ C_a &= \frac{1}{T-1} \sum_{j=1}^{T} (a_{mj} - \hat{a}_j)^2 \end{aligned} \right\} \tag{3} \]

where

\[ \left. \begin{aligned} \hat{a}_i &= \frac{1}{(A \times N)} \sum_{c=1}^{(A \times N)} a_{ic} \quad \text{and} \\ \hat{a}_j &= \frac{1}{(A \times N)} \sum_{d=1}^{(A \times N)} a_{id} \end{aligned} \right\} \tag{4} \]

From Eqs. (3) and (4), the independent analysis of the observed data with respect to features and count with time is estimated. This distributed analysis is useful in classifying correlated data with different transmission times. After the $C_m$, $C_a$ estimation, the integrated co-variance ($C$) is estimated using Eq. (5) as

\[ C = (C_a, C_m) = \frac{1}{T-1} \sum_{i=1}^{T-1} (a_{mc} - \hat{a}_i)(a_{ni} - \hat{a}_j) \tag{5} \]

From a joint estimation of $(C_a, C_m)$ using Eq. (5), the degree of correlation $\rho_C$ is computed using Eq. (6):

\[ \rho_C = \frac{(C_a, C_m)}{\sqrt{C_m \times C_a}} = \frac{C}{\sqrt{C_m \times C_a}} \tag{6} \]

The above degree of correlation is estimated in accordance with $C_m$ and $C_a$ respectively for all $(A \times N)$ sensor data in $t_n$. This degree analysis is estimated in all intervals of $T$. However, this input set contains errors ($e$) due to replications or missing inputs from the IoT monitoring layer. Therefore, this correlation is analyzed using a partial fuzzy set. For the subsequent fuzzy analysis, two sets are required. Considering the sensing time interval as constant ($t_2 - t_1 = t_3 - t_2 = \cdots = t_n - t_{n-1}$), the $(A \times N)$ data is differentiated into two sets, $a_n$ and $a_m$. The sensing interval is constant depending upon the application. The type and update time of the application vary for the sensors in real-time monitoring. For a constant flow of sensor information, this interval is set periodically to update the current information. Based on this information, the nearness and precision of the analyzed data are concluded. From these two sets representing the same matrix as in Eq. (2), with respect to time, the correlation fuzzy degree ($\rho_f$) is estimated using Eq. (7) as

\[ \rho_f = \frac{a_n - a_m}{\sqrt{1 - a_n^2} \cdot \sqrt{1 - a_m^2}}, \quad \forall\, n > m \tag{7} \]
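As a minimal sketch of Eqs. (3) and (5) to (7), the following Python computes the two variances, the integrated co-variance, the degree of correlation, and the fuzzy degree for two series observed over T. The function names and sample readings are hypothetical, the series means stand in for the averages of Eq. (4), and the co-variance sum runs over all T samples; these are simplifying assumptions, not the authors' implementation.

```python
import math

def variance(series):
    """Sample variance with the 1/(T-1) factor used in Eq. (3)."""
    mean = sum(series) / len(series)
    return sum((x - mean) ** 2 for x in series) / (len(series) - 1)

def covariance(a_m, a_n):
    """Integrated co-variance in the spirit of Eq. (5).

    Assumption: the sum runs over all T samples and uses the series means.
    """
    mean_m = sum(a_m) / len(a_m)
    mean_n = sum(a_n) / len(a_n)
    products = ((x - mean_m) * (y - mean_n) for x, y in zip(a_m, a_n))
    return sum(products) / (len(a_m) - 1)

def correlation_degree(a_m, a_n):
    """rho_C = C / sqrt(C_m * C_a), as in Eq. (6)."""
    return covariance(a_m, a_n) / math.sqrt(variance(a_m) * variance(a_n))

def fuzzy_degree(a_n, a_m):
    """rho_f from Eq. (7); both inputs must lie strictly inside (-1, 1)."""
    return (a_n - a_m) / (math.sqrt(1 - a_n ** 2) * math.sqrt(1 - a_m ** 2))

# Hypothetical readings from two sensors over five update intervals.
temperature = [21.0, 22.5, 23.1, 24.0, 25.2]
humidity = [60.0, 58.0, 57.5, 55.0, 53.0]
print(correlation_degree(temperature, humidity))  # negative: humidity falls as temperature rises
print(fuzzy_degree(0.6, 0.0))
```

Note that Eq. (7) is only defined for values strictly between -1 and 1, so raw sensor readings would need to be normalized before `fuzzy_degree` is applied; the normalization scheme is not specified in the text.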

The fuzzy degree in Eq. (7) is estimated jointly for both the number of sensor inputs and the features. The forecast information relies on $\rho_C$ and $\rho_f$ for providing an accurate weather report. The information $(A \times N)$ is correlated with previously stored data. The prediction and analysis of the observed data are subject to this stored reference data. Therefore, $\rho_f$ of the stored data is accurate and hence its maximum reliability is 1. On the other hand, the observed data in $t_n \in T$ needs to pass the verification, provided $\rho_C = \rho_f$ (ideal solution). As the stored $\rho_f$ information is accurate, $\rho_C \not> \rho_f$, but the chance of $\rho_C < \rho_f$ is high and therefore $\rho_C + e = \rho_f$. This means that the analysis of $\rho_C$ is accounted with the error to achieve $\rho_f$. Now the error estimation becomes prominent in validating the reliability of the correlation analysis. For any time instance $t_n$, as per the fuzzy derivative,

\[ \left. \begin{aligned} & t_n(a_n, a_m) = t_n(a_m, a_n) \;\&\& \\ & t_n = (a_n, a_m) \parallel t_n(a_m, a_n) = 1, \quad \forall\, a_n = a_m \\ & \text{else} \\ & t_n = (a_n, a_m) \parallel t_n(a_m, a_n) < 1 \end{aligned} \right\} \tag{8} \]

The above fuzzy derivatives are extracted for $n \in N$ requests in the time instance $t_n \in T$. The independent fuzzy correlation analysis for both $n$ and $m$ is derived using Eq. (9):

\[ \left. \begin{aligned} \rho_f(a_n) &= \frac{\sum_{i=1}^{T} (a_m - a_i)}{\sum_{i=1}^{T} \sum_{j=1}^{T} (a_{mi} - a_{nj})} \\ \rho_f(a_m) &= \frac{\sum_{i=1}^{T} (a_n - a_i)}{\sum_{i=1}^{T} \sum_{j=1}^{T} (a_{mj} - a_{nj})} \end{aligned} \right\} \tag{9} \]

Fig. 1. CDAM process in the weather forecast centre (illustration).

Table 1
Mean variance and error of analyzed data.

Day   Wind speed            Temperature           Rainfall
      Error     Deviation   Error     Deviation   Error     Deviation
1     0.023     0.368       0.0044    0.028       0.032     0.028
2     0.0017    0.021       0.0107    0.106       0         0
3     0.0028    0.0353      0.0024    0.021       0.00857   0.021
4     0.0017    0.021       0.0021    0.021       0.02      0.028
5     0.008     0.127       0.0021    0.021       0.016     0.028
6     0.0022    0.0353      0.0017    0.028       0.0348    0.028
7     0.0017    0.021       0.0073    0.078       0.047     0.0494
8     0.0017    0.021       −0.142    1.4         0         0
9     0.0028    0.0353      0.0064    0.106       0.18      0.0636
10    0.012     0.106       0.0136    0.106       0.08      0.0424
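The relation between the observed correlation degree and the stored fuzzy degree, namely that the observed value cannot exceed the stored one and that the gap is the error e, can be sketched as follows. The function name, the labels and the tolerance `tol` are hypothetical; the default stored degree of 1 follows the statement that the maximum reliability of stored data is 1.

```python
def classify_observation(rho_c, rho_f=1.0, tol=1e-9):
    """Return (label, e) following rho_C + e = rho_f.

    rho_f defaults to 1, the maximum reliability of the stored data.
    `tol` is an assumed tolerance for floating-point comparison.
    """
    if rho_c > rho_f + tol:
        # The text discards data for which rho_f < rho_C.
        return ("discard", None)
    e = rho_f - rho_c
    if e <= tol:
        # Ideal solution: rho_C = rho_f, so no error is recorded.
        return ("ideal", 0.0)
    # rho_C < rho_f: the error accounts for the gap.
    return ("error", e)

print(classify_observation(1.0))  # ('ideal', 0.0)
print(classify_observation(1.2))  # ('discard', None)
```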

The error from both correlation analyses is estimated as

\[ e = \left( 1 - \frac{\ln\left[ 1 + \dfrac{a_m - a_{mi}}{\sqrt{\sum_{i=1}^{T} (a_n - a_i)^2}} \right]}{\left| \dfrac{a_n - \sum_{i=1}^{T} a_{ni}}{\sqrt{\sum_{i=1}^{T} (a_n - a_i)^2}} \right|} \right) + \left( 1 - \frac{\ln\left[ 1 + \dfrac{a_n - a_{ni}}{\sqrt{\sum_{i=1}^{T} (a_m - a_i)^2}} \right]}{\left| \dfrac{a_m - \sum_{i=1}^{T} a_{mi}}{\sqrt{\sum_{i=1}^{T} (a_m - a_i)^2}} \right|} \right) \tag{10} \]

The observed error is then added to $\rho_C$ to estimate $\rho_f$. This error is the difference between the correlation analysis of the processed data and the previously observed information. It helps to approximate the analysis of succeeding inputs from the sensors. Errors are common in transmission due to the varying update interval and processing time. The fuzzy solution generated for the sensed information deviates because of these errors, where the estimated values fall away from or above the solution range. Such errors are estimated using Eq. (10) and correspond to the validation errors. If these errors are unaddressed, the accuracy of processing is degraded. If $t_n(a_n - a_m) = 1$, then $e = 0$. In other cases, the error is estimated using Eq. (10) for the sensed $(A \times N)$ inputs. Finally, the analyzed data (without $e$) is updated with the actual weather report in the storage. The process plane of the data correlation analysis with respect to the two conditions $\rho_f = \rho_C$ and $\rho_f = \rho_C + e$ is illustrated in Figs. 2a and 2b respectively. The correlation analysis for both the independent sets $a_n$ and $a_m$ is illustrated in these figures. The above correlation is applied to the $D_{A \times N}$ distribution for all $A$ in $T$. Therefore, the distribution with respect to $\{t_1, t_2, \ldots, t_n\}$ is a linear increment function wherein the current correlation analysis set relies on the previously stored input, which is segregated using the matrix represented in Eq. (2). The analyses of the above conditions are presented with the variation of $T$, as the interval is distinct for each sensor and its function. With respect to the varying interval of the above conditions, the transmitted sensor data is analyzed. The conditional verification proceeds for the varying update time in $T$.

With the help of phase 2, the analysis of the wind speed, temperature and rainfall data against the corresponding observed data is presented in Figs. 3a–3c respectively. The relative variation in the analyzed data is estimated using the $\rho_f$ and $\rho_C$ evaluations for the sensor data. In this analysis, the previously stored (observed) climate data is correlated with the current sensor data (dataset input) for analyzing the reliability of CDAM. Table 1 presents the mean variance and error of the observed data with respect to wind speed, temperature and rainfall. The negative value of the error in the temperature field indicates invalid data. The rest of the data is further analyzed for conditional similarity verification. Data is identified as invalid if no information can be extracted from the sensed information because its correlating information is lagging. This means that an exact analysis match for the data is not present in the processing layer, and therefore the error in validating this information is high. This error is kept for future reference as a checkpoint in validating other sensor inputs.
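The screening rule stated for Table 1, that a negative error marks invalid data while the remaining entries go on to conditional similarity verification, can be sketched with the temperature error column of that table. The function name is hypothetical.

```python
# Temperature error column of Table 1, days 1 to 10.
temperature_error = [0.0044, 0.0107, 0.0024, 0.0021, 0.0021,
                     0.0017, 0.0073, -0.142, 0.0064, 0.0136]

def split_valid(errors):
    """Separate day numbers into valid and invalid by the sign of the error."""
    valid, invalid = [], []
    for day, e in enumerate(errors, start=1):
        # A negative error is treated as invalid data, as described for Table 1.
        (invalid if e < 0 else valid).append(day)
    return valid, invalid

valid_days, invalid_days = split_valid(temperature_error)
print(invalid_days)  # [8]
```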

Fig. 2a. $\rho_f = \rho_C$ analysis for $a_n$ and $a_m$ with respect to $T$.

Fig. 2b. $\rho_f > \rho_C$ analysis for $a_n$ and $a_m$ with respect to $T$.

Fig. 3a. Avg. wind speed analysis.

3.3. Phase 3: Conditional similarity verification

In this verification phase, the possible analysis of error-prone sensor data is directly correlated with the previously stored information. This kind of verification helps to reduce unnecessary time spent in validating similar data at different time instances. Besides, the next partial fuzzy derivative employed in this phase frames rules for verifying the extracted information and filtering errors more precisely. For the verification process, the fuzzy derivative frames the conditions listed in Table 2. The conditional analysis presented in Table 2 represents the correlation degree with respect to the independent set $a_n$. The degree of the fuzzy derivative is analyzed for all possible conditions of $\tau_{a_n}$, that is, $\left(a_n - \sum_{i=1}^{T} a_{ni}\right) / \sqrt{\sum_{i=1}^{T} (a_n - a_i)^2}$, for which independent solution sets are generated. Each solution set relies on the range of the fuzzy derivative, classified between its minimum and maximum values. An abrupt change in sensor data with $T$ results in the classification of the fuzzy solution between the minimum and maximum range. Therefore, the solution does not exceed the erroneous correlation analysis of the input data. Here, for simplicity,

\[ \tau_{a_n} = \frac{a_n - \sum_{i=1}^{T} a_{ni}}{\sqrt{\sum_{i=1}^{T} (a_n - a_i)^2}} \tag{11} \]

is used. The set of conditions in Table 2 is also considered for $a_m$, where

\[ \tau_{a_m} = \frac{\sum_{i=1}^{T} a_{mi}}{\sqrt{\sum_{i=1}^{T} (a_m - a_i)^2}} \tag{12} \]

Table 2
Conditions for fuzzy and analysis solution.

Condition 1            Condition 2           Solution
$\rho_f(a_n) \le 0$    $\tau_{a_n} \ge 0$    $[\rho_f(a_n), \min\{1, \rho_f(a_n) + \tau_{a_n}\}]$
$\rho_f(a_n) > 0$      $\tau_{a_n} < 0$      $[\rho_f(a_n) - \tau_{a_n}, \rho(a_n)]$
$\rho_f(a_n) < 0$      $\tau_{a_n} \ge 0$    $[\rho(a_n), \rho(a_n) + \tau_{a_n}]$
$\rho_f(a_n) < 0$      $\tau_{a_n} < 0$      $[\max\{-1, \rho(a_n) - \tau_{a_n}\}, \rho(a_n)]$
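The condition lookup of Table 2 can be sketched as a small function that returns the solution endpoints for a given fuzzy degree and $\tau$ value. This is an illustration under stated assumptions: the function names are hypothetical, the first row is read here as a non-negative fuzzy degree so that it does not overlap the third row, and the endpoints are returned in the order printed in the table, so the membership helper sorts them before testing.

```python
def solution_endpoints(rho, tau):
    """Return the Table 2 solution endpoints for (rho, tau), or None if no row applies."""
    if rho >= 0 and tau >= 0:
        # Row 1 (assumption: read as rho >= 0 to avoid overlapping row 3).
        return (rho, min(1.0, rho + tau))
    if rho > 0 and tau < 0:
        # Row 2, endpoints in the order printed in the table.
        return (rho - tau, rho)
    if rho < 0 and tau >= 0:
        # Row 3.
        return (rho, rho + tau)
    if rho < 0 and tau < 0:
        # Row 4, lower endpoint clipped at -1.
        return (max(-1.0, rho - tau), rho)
    # rho == 0 with tau < 0 is not covered by any row of the table.
    return None

def within_solution(value, rho, tau):
    """Check whether value lies between the two endpoints, whichever is larger."""
    endpoints = solution_endpoints(rho, tau)
    if endpoints is None:
        return False
    return min(endpoints) <= value <= max(endpoints)

print(solution_endpoints(0.5, 0.7))     # (0.5, 1.0)
print(solution_endpoints(-0.5, 0.25))   # (-0.5, -0.25)
print(within_solution(0.6, 0.5, -0.25)) # True
```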

The processed data must be compatible with one of the conditions represented in Table 2. After the conditions are framed, the input data is classified depending on the available conditions. Let $\emptyset_m^n$ represent the set of inputs classified under the conditions framed under $m$

and $n$ respectively. The matching function ($\Delta\theta$) is then given by Eq. (13) as

\[ \Delta\theta = \begin{cases} \gamma(\emptyset_m^n) = 1, & \text{if } e = 0 \\ \gamma(\emptyset_m^n) < 1, & \text{if } 0 < e \le 1 \end{cases} \tag{13} \]

where $\gamma(\emptyset_m^n)$ is the similarity index that is computed using Eq. (14):

\[ \gamma(\emptyset_m^n) = \exp\left[ -\frac{\sum_{i=1}^{A \times N} \left( \tau_{a_i} - \emptyset_{m_i}^n \right)}{2\mu^2} \right] \tag{14} \]

In Eq. (14), $\tau_{a_i}$ represents both the $n$ and $m$ parameters with respect to the time interval $T$. The stored data is correlated with the observed data for verifying the conditions given in Table 2. In particular, the verification is preferred for the $\rho_f = \rho_C + e$ data analysis to estimate the error in the received data. The received data is stored in a temporal manner for processing and verification against the stored data. The received data is classified as $\emptyset_m^n$, and tuples such as $a_m$ and $a_n$ are substituted in the fields of the stored information to estimate $e$. If the stored information substitution is empty, then $e = 1$. There are two possible conditions, namely $\rho(a_n) < 0\ \forall\, \tau_{a_n} \ge 0$ and $\tau_{a_n} < 0$, where the error is abruptly recorded. In such cases the analysis is performed as explained below.

Case 1: Condition: $\rho(a_n) < 0$ and $\tau_{a_n} \ge 0$, i.e., the fuzzy derivative of the set $a_n$ is less than 0 and $\sum_{i=1}^{T} a_{mi} / \sqrt{\sum_{i=1}^{T} (a_m - a_i)^2} < 0$.

Solution: $[\rho(a_n), \rho(a_n) + \tau_{a_n}]$

Analysis 1: In this case, the observed ratio of $a_n$ to $(a_n - a_i)$ is high at multiple time instances and therefore the solution is either $\rho(a_n)$ or $\rho(a_n) + \tau_{a_n}$. The solution lies between these two bounds, wherein $a_m \gg a_n$ and therefore, from Eq. (7), $\rho_f \propto a_m$, where $m > n$, i.e., the extracted features outnumber the data inputs. The analysis matrix that satisfies the conditional verification is $A = [a_{nm}]\ \forall\, m > n$ and hence the error is expected to be $1 - \ln(1 + \tau_{a_m}) / |\tau_{a_m}|$. If this error $e$ lies within the solution set, the data is valid; otherwise it can be discarded.

Case 2: Condition: $\rho(a_n) < 0$ and $\tau_{a_n} < 0$

Solution: $[\max\{-1, \rho(a_n) - \tau_{a_n}\}, \rho(a_n)]$

Analysis 2: Here, the observed data is invalid if $\tau_{a_n}$ is a negative integer. Therefore, the acceptable solution is either $(\rho(a_n) - \tau_{a_n})$ or $\rho(a_n)$. The chance of $\rho(a_n)$ is low as $\tau_{a_n} \ne 0$, and therefore the unanimous solution is $(\rho(a_n) - \tau_{a_n})$. Whether $n > m$ or vice versa, the correlation degree is the same as in Eq. (7). Similarly, the error factor is small but takes a value $0 < e \le 1$. If the $e$ value exceeds the $\mu$ of $D_{A \times N}$, the data is discarded. The discarded information cannot be used for analysis, as it provides non-feasible results.

Fig. 3b. Avg. temperature analysis.

Fig. 3c. Avg. rainfall analysis.

4. Performance analysis

The performance of the proposed data analytical model is verified using an open source weather report dataset [23]. This data set consists of the following fields: wind speed, direction, minimum and maximum temperature, rainfall, and forecast information. The information is analyzed using the experimental setup and values in Table 3. For ease of verification, the proposed CDAM is compared with the existing MLRT [13] and MDPM [11] on the metrics of correlation accuracy, error rate, and analyzed data ratio.

Table 3
Experimental setup and values.

Experimental setup                        Value
Sensor pairs                              12 (3 in each)
Sensing and forecasting layer distance    120–1500 m
Processing machine clock speed            2.4 GHz
Local storage size                        4 TB
Physical memory                           8 GB
Sensed data update interval               5–45 min
Days                                      10

4.1. Correlation accuracy comparison

In Fig. 4, the correlation accuracy of the existing and proposed analytical models is compared. The proposed analytical model performs $\rho_C$ over $D_{(A \times N)}$ for all $T$. The data input is analyzed at both ends using $\rho_f$ in $t_n$, segregating $a_n$ and $a_m$ for correlation. The errors in this phase are mitigated by framing the conditional similarity verification, where $\gamma(\emptyset_m^n)$ is estimated for matching the stored and analyzed data. These two diverse processes help to reduce the errors in data correlation, thus improving the correlation ratio. Besides, in the conditional verification process, the solutions $[\rho_f(a_n), \min\{1, \rho_f(a_n) + \tau_{a_n}\}]$, $[\rho_f(a_n) - \tau_{a_n}, \rho(a_n)]$ and $[\rho(a_n), \rho(a_n) + \tau_{a_n}]$ are feasible in a direct manner. The correlation ratio in the $[\max\{-1, \rho(a_n) - \tau_{a_n}\}, \rho(a_n)]$ solution alone contains error, which is mitigated using the $\tau_{a_m}$ analysis to achieve the nearest possible solution of $(\rho(a_n) - \tau_{a_n})$ or $\rho(a_n)$. Therefore, the error observed in these features is small. The same analysis process is adopted for the $A = [a_1, a_2, \ldots, a_n]$ matrix solutions unanimously, to retain the correlation accuracy.


Fig. 6. Analyzed data comparisons.

Fig. 4. Correlation accuracy comparison.

Table 4. Comparison values of MLRT, MDPM, and CDAM.

| Metrics                  | MLRT  | MDPM  | CDAM  |
|--------------------------|-------|-------|-------|
| Correlation accuracy (%) | 78.14 | 85.73 | 96.4  |
| Error rate               | 0.244 | 0.152 | 0.112 |
| Analyzed data (%)        | 88.31 | 93.42 | 95.16 |
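The gaps between CDAM and the two baselines follow directly from the Table 4 values. The short sketch below only performs that subtraction; the dictionary layout and the helper name `gaps_vs_cdam` are hypothetical and chosen for the example.

```python
# Values copied from Table 4; the key names are illustrative.
TABLE_4 = {
    "MLRT": {"correlation_accuracy": 78.14, "error_rate": 0.244, "analyzed_data": 88.31},
    "MDPM": {"correlation_accuracy": 85.73, "error_rate": 0.152, "analyzed_data": 93.42},
    "CDAM": {"correlation_accuracy": 96.4, "error_rate": 0.112, "analyzed_data": 95.16},
}


def gaps_vs_cdam(baseline: str) -> dict:
    """Return CDAM's difference from a baseline for each Table 4 metric."""
    base, cdam = TABLE_4[baseline], TABLE_4["CDAM"]
    return {
        # Percentage-point gain in correlation accuracy.
        "accuracy_gain_points": round(
            cdam["correlation_accuracy"] - base["correlation_accuracy"], 2
        ),
        # Absolute reduction in error rate (baseline minus CDAM).
        "error_rate_reduction": round(base["error_rate"] - cdam["error_rate"], 3),
        # Percentage-point gain in analyzed data.
        "analyzed_data_gain_points": round(
            cdam["analyzed_data"] - base["analyzed_data"], 2
        ),
    }


if __name__ == "__main__":
    for name in ("MLRT", "MDPM"):
        print(name, gaps_vs_cdam(name))
```

With these inputs, CDAM is 18.26 percentage points above MLRT and 10.67 above MDPM in correlation accuracy, and its error rate is lower by 0.132 and 0.040 respectively.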

4.3. Analyzed data ratio

Fig. 6 compares the ratio of analyzed data between the existing methods and the proposed model. The rate of analyzed data in the proposed model is high because the analysis is two-fold: in phase 2 and phase 3, the rate of analysis is augmented through fuzzy correlation and conditional verification concurrently. The process supports any rate of input data from the monitoring layer, which is analyzed in different manners. Errors are mitigated in both processes along with the extraction of correlated data. In Fig. 6, the comparison is presented for data that is analyzed in both phases. At some points (0.2 and 0.5 to 0.7), the rate of analyzed data is lower because the verification (phase 3) requires additional time. The analysis in this phase undergoes different conditional verifications, which halts the processing of incoming data from the sensor layer. Overall, the analyzed data ratio is comparatively high due to the sequential and concurrent processing of input in phases 2 and 3. Table 4 presents the comparative analysis of the existing and proposed models.

Fig. 5. Error rate analysis.

4.2. Error rate comparison

The error rate in the proposed CDAM occurs in two prominent cases: when the solution is $[\max\{-1, \rho(a_n) - \tau_{a_n}\}, \rho(a_n)]$ and under the condition $\gamma(\emptyset_{nm}) < 1$. Besides, any data satisfying the condition $\rho_f < \rho_C$ is discarded to reduce the impact of error on the analysis. A correlation analysis where $n < m$ for all $e > 1$ is also discarded to limit accuracy degradation. In the phase 2 process, irrespective of the time interval and update time slot, the estimation of $\rho_C$ and the stored validation of $\rho_f$ in $t_n$ approximate the error in accuracy detection. Therefore, the solution set is $\rho_f = \rho_C + e$, where the error is accountable. However, the assessed error is independent of $\rho_f(a_n)$ and $\rho_f(a_m)$ at any time instance. Therefore, the data input in an update time interval satisfying $t_n(a_n, a_m) = t_n(a_m, a_n)$ and $(a_n, a_m) \parallel t_n(a_m, a_n) = 1, \forall a_n = a_m$ is directly accepted as error-free data. On the other hand, data under the fuzzy condition $t_n = (a_n, a_m) \parallel t_n(a_m, a_n) < 1$ is assessed for errors using the individual error factors, irrespective of the update time. This error analysis method is lacking in the existing methods presented in Fig. 5, which helps the proposed CDAM achieve a lower error rate.
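The decision rules in this subsection can be sketched as a single function. This is a hedged illustration rather than the paper's algorithm: the name `assess_sample`, the scalar `symmetric_match` standing in for the $\parallel$ comparison of $t_n(a_n, a_m)$ and $t_n(a_m, a_n)$, and the use of $e = \rho_f - \rho_C$ for the remaining case are all assumptions. The $n < m$, $e > 1$ discard rule is omitted for brevity.

```python
from typing import Optional


def assess_sample(rho_f: float, rho_c: float, symmetric_match: float) -> Optional[float]:
    """Return the error e for one sample, or None if the sample is discarded.

    symmetric_match is a hypothetical value in [0, 1]; 1.0 means the two
    orderings of (a_n, a_m) agree in the update interval t_n.
    """
    if rho_f < rho_c:
        # Rule from the text: data with rho_f < rho_C is discarded.
        return None
    if symmetric_match == 1.0:
        # Rule from the text: a full match is accepted as error-free data.
        return 0.0
    # Assumption of this sketch: solve rho_f = rho_C + e for e.
    return rho_f - rho_c


if __name__ == "__main__":
    print(assess_sample(0.5, 0.75, 1.0))   # discarded
    print(assess_sample(0.75, 0.5, 1.0))   # error free
    print(assess_sample(0.75, 0.5, 0.5))   # error estimated from rho_f - rho_C
```

Returning `None` for discarded samples keeps them distinguishable from samples accepted with zero error.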

5. Conclusion

A conservative data analytical model for improving the accuracy of climatic data evaluation is presented in this article. The model operates in a forecasting center, where the operations are segregated into data analysis and conditional verification. These processes are combined to reduce correlation errors and to raise both the accuracy rate and the analyzed data rate. Fuzzy-based correlation analysis operates in a partial manner across the data analysis, and similarity verification through framed conditions improves the efficiency of the data analytical model. The model is verified using a real-time data source that is correlated with the forecast information to check its consistency.


Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work was supported by the Science and Technology Support Project of Gansu Province (Grant No. 1604FKCA093); the Fundamental Research Funds for the Central Universities (Grant Nos. 31920190056, 31920190165, and 31920180119); and the 2017 Ministry of Education Humanities and Social Science Research Project (Grant No. 17YJA740035).
