CN116709392B - Large-scale wireless sensor network data fusion method - Google Patents

Info

Publication number
CN116709392B
Authority
CN
China
Prior art keywords
data
sensor
variable
sub
sensors
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310991312.2A
Other languages
Chinese (zh)
Other versions
CN116709392A (en
Inventor
曾建祥
欧阳路
何海鱼
邓群
邓林海
王军
吴稳
薛学科
胡雅琴
刘孟夫
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hunan Tianlian City Data Control Co ltd
Original Assignee
Hunan Tianlian City Data Control Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hunan Tianlian City Data Control Co ltd
Priority to CN202310991312.2A
Publication of CN116709392A
Application granted
Publication of CN116709392B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W24/00 Supervisory, monitoring or testing arrangements
    • H04W24/02 Arrangements for optimising operational condition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/565 Conversion or adaptation of application format or content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/008 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols involving homomorphic encryption
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W12/00 Security arrangements; Authentication; Protecting privacy or anonymity
    • H04W12/03 Protecting confidentiality, e.g. by encryption
    • H04W12/033 Protecting confidentiality, e.g. by encryption of the user plane, e.g. user's traffic
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W84/00 Network topologies
    • H04W84/18 Self-organising networks, e.g. ad-hoc networks or sensor networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Arrangements For Transmission Of Measured Signals (AREA)

Abstract

The application discloses a large-scale wireless sensor network data fusion method, which comprises the following steps: designing a wireless sensor network comprising various heterogeneous sensors, and optimizing the deployment positions of the sensor nodes; transmitting the raw data acquired by each sensor to a data center through a wireless network for storage; performing data cleaning and conversion at the data center, and standardizing all raw data into a unified data format; and performing a fusion operation of association analysis and pattern mining on the stored data at the data center. In practical application, the method yields accurate and reliable results at both the single-sensor level and the overall network level.

Description

Large-scale wireless sensor network data fusion method
Technical Field
The application relates to the technical fields of sensors and the Internet of things, in particular to a large-scale wireless sensor network data fusion method.
Background
Wireless sensor networks (WSNs) are widely used in fields such as environmental monitoring, intelligent transportation, smart homes and health monitoring. A wireless sensor network is composed of a group of small, low-cost sensor nodes that sense environmental information, such as temperature, humidity and illumination intensity, and transmit the data to a data center via wireless communication. However, because both the sensors and the data they produce are heterogeneous, performing data fusion efficiently in a large-scale wireless sensor network is an important and challenging problem.
The energy of wireless sensors is typically limited, so the design of a wireless sensor network must consider how to optimize the location and operating time of each sensor in order to minimize energy consumption and maximize coverage. In addition, since the communication distance of the sensors is limited, communication between the sensors must also be considered. Since wireless sensor networks are typically composed of multiple types of sensors, the collected data is heterogeneous in both type and format. For example, the data collected by a temperature sensor and the data collected by a humidity sensor may differ in data type, unit and range. Therefore, data cleaning and conversion are important steps in data fusion for wireless sensor networks. The data collected by a wireless sensor network typically has spatiotemporal correlation. For example, there may be some correlation between data collected at different times at the same location, or between data collected at the same time at different locations. Techniques such as time-series knowledge graphs can therefore be used to analyze this correlation and mine valuable patterns. Data privacy and security are also an important issue in large-scale wireless sensor network data fusion, so performing effective data fusion while ensuring data privacy is a challenging problem.
Disclosure of Invention
The present application aims to solve at least one of the technical problems existing in the prior art. To this end, the application discloses a large-scale wireless sensor network data fusion method. The method can effectively process data that is heterogeneous in both type and format, and opens up new possibilities for secure, privacy-preserving data processing in large-scale wireless sensor networks.
The object of the application is achieved by the following technical scheme, a large-scale wireless sensor network data fusion method comprising the following steps:
step 1, designing a wireless sensor network comprising various heterogeneous sensors, and optimizing the deployment positions of various sensor nodes;
step 2, transmitting the original data acquired by each sensor to a data center through a wireless network for storage;
step 3, implementing a data cleaning and conversion process on the data center, and standardizing all original data into a unified data format;
step 4, performing association analysis and pattern mining fusion operation on the stored data on the data center.
Optimizing the deployment position of each sensor node comprises the following steps:
let the sensor set be $S = \{s_1, s_2, \ldots, s_n\}$. Each sensor $s_i$ has a predetermined total energy $E_i$, an energy consumption per unit time $e_i$, a total working time $T_i$ and a working time $t_i$. Define $d_{ij}$ as the distance between sensors $s_i$ and $s_j$, $S_d$ as the set of areas that can be covered, and $D$ as the communication range. If sensor $s_i$ can cover area $j$, then $a_{ij} = 1$; otherwise $a_{ij} = 0$.
Establishing a deployment position optimization model, and setting an objective function as follows: on the premise of meeting the data acquisition requirement, the fewest sensors are used, namely, the number of sensors is minimized, which is expressed as:
$$\min \sum_{i=1}^{n} x_i$$
The constraints are set as follows:
all functional areas need to be covered;
the energy of each sensor cannot exceed the preset energy level;
the working time of each sensor must not exceed the preset working time;
the communication distance between each pair of sensors must not exceed the maximum communication distance C;
only if sensor $s_i$ is selected and at least one other sensor $s_j$ is in direct communication with $s_i$ can the data of $s_i$ be collected;
wherein $x_i$ is a binary sensor variable: when sensor $s_i$ is selected, $x_i = 1$, otherwise $x_i = 0$; $y_{ij}$ is a binary variable: when a direct communication link exists between sensors $s_i$ and $s_j$, $y_{ij} = 1$, otherwise $y_{ij} = 0$.
The process for solving the deployment position optimization model comprises the following steps:
step 101, initializing: setting the upper bound UB to a large value, $+\infty$; initializing the lower bound LB as the optimal solution of the unconstrained problem;
step 102, creating a search tree: initializing the root node of the search tree, with the whole problem as the sub-problem of the root node; creating a set of variables including the sensor variables $x_i$ and the communication variables $y_{ij}$; creating a set of constraints including the area coverage constraint, the energy constraint, the working time constraint, the communication limit constraint, and the sensor selection and intermediary sensor constraint; and defining the objective function, namely minimizing the number of sensors;
step 103, selecting a branch variable: in the current sub-problem, selecting a variable that has not yet been branched on, heuristically choosing the sensor variable $x_i$ with the greatest number of active constraints; for the current sub-problem, calculating for each sensor variable $x_i$ the number of active constraints, that is, the number of constraints in which the sensor variable is involved in the current sub-problem; selecting the sensor variable $x_i$ with the largest number of active constraints as the branch variable, which means that the sensor variable with the greatest influence or relevance is branched on first, helping the search converge more quickly to the optimal solution; if several sensor variables share the same maximum number of active constraints, branching on the one whose estimated computation time is shortest; for each sensor variable $x_i$, the estimated computation time is obtained by accumulating the computation times of the operations related to that sensor variable;
step 104, for the selected branch variable $x_i$, creating two sub-problems: sub-problem one: setting $x_i$ to 1 and updating the problem according to the constraints; sub-problem two: setting $x_i$ to 0 and updating the problem according to the constraints;
step 105, for each sub-problem, solving the mixed integer programming problem to obtain a lower bound LB and a feasible solution, and discarding the sub-problem if the lower bound LB is greater than the current upper bound UB;
step 106, if the optimal solution of a sub-problem is smaller than the current upper bound UB, updating the upper bound UB to that optimal solution;
step 107, for each sub-problem, discarding the sub-problem if its optimal solution is greater than the current upper bound UB;
step 108, terminating the search if all the sub-problems in the search tree have been discarded, or if all the sub-problems in the search tree have been solved and an optimal solution has been found;
step 109, if the termination condition is not satisfied, returning to step 103 and selecting the next variable that has not been branched on;
step 110, returning the optimal sensor selection scheme found in the search tree, which meets the data acquisition requirements and communication limitations while minimizing the number of sensors.
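The branch-and-bound procedure of steps 101 to 110 can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the solver described by the application: only the area coverage constraint is modelled (energy, working time and communication constraints are omitted), the lower bound is a trivial "at least one more sensor" bound rather than a solved relaxation, and the function name `min_sensor_cover` and the `coverage` mapping are hypothetical.

```python
from typing import Dict, List, Optional, Set


def min_sensor_cover(coverage: Dict[str, Set[int]], areas: Set[int]) -> Optional[List[str]]:
    """Branch and bound over binary choices x_i (select sensor i or not).

    coverage maps a sensor name to the set of areas it can cover (a_ij = 1).
    Returns a smallest list of sensors covering all areas, or None if infeasible.
    """
    best: Dict[str, Optional[List[str]]] = {"sel": None}

    def branch(chosen: List[str], remaining: List[str], uncovered: Set[int]) -> None:
        # Upper bound UB: size of the best feasible selection found so far.
        ub = len(best["sel"]) if best["sel"] is not None else float("inf")
        if not uncovered:
            if len(chosen) < ub:
                best["sel"] = list(chosen)  # tighter UB
            return
        # Trivial lower bound: at least one more sensor is needed; prune if it cannot beat UB.
        if len(chosen) + 1 >= ub:
            return
        # Discard the sub-problem if some uncovered area can no longer be covered.
        reachable: Set[int] = set().union(*(coverage[s] for s in remaining)) if remaining else set()
        if not uncovered <= reachable:
            return
        # Heuristic: branch on the sensor involved in the most still-active coverage constraints.
        s = max(remaining, key=lambda r: len(coverage[r] & uncovered))
        rest = [r for r in remaining if r != s]
        branch(chosen + [s], rest, uncovered - coverage[s])  # sub-problem one: x_s = 1
        branch(chosen, rest, uncovered)                      # sub-problem two: x_s = 0

    branch([], sorted(coverage), set(areas))
    return best["sel"]
```

A call such as `min_sensor_cover({"s1": {1, 2}, "s3": {3, 4}}, {1, 2, 3, 4})` returns a selection containing both sensors, since neither alone covers every area. A production model would more likely hand the full constraint set to a mixed integer programming solver.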
Transmitting the original data through the wireless network comprises the following steps:
each sensor $s_i$ acquires raw data $d_i$;
at each sensor $s_i$, noise $n_i$ satisfying a Laplace distribution or a Gaussian distribution is added to the data $d_i$ to obtain noisy data $d_i' = d_i + n_i$. For noise following the Laplace distribution, the probability density function is:
$$f(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$$
wherein $\mu$ is a location parameter and $b$ is a scale parameter; to satisfy $\varepsilon$-differential privacy, the scale parameter is $b = \Delta f / \varepsilon$, wherein $\Delta f$ is the sensitivity of the function $f$ on neighboring databases and $\varepsilon$ is the privacy budget;
the noisy data $d_i'$ is homomorphically encrypted to obtain encrypted data $c_i = \mathrm{Enc}(d_i')$;
each sensor transmits its encrypted data $c_i$ to the data center.
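The noise-adding step can be sketched as follows. This is a minimal illustration of the Laplace mechanism with scale $b = \Delta f / \varepsilon$, assuming a scalar reading; the function names `laplace_noise` and `privatize` are hypothetical, and the sample is drawn as the difference of two unit exponential variables, which follows a Laplace distribution centred at zero.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Draw one sample from Laplace(0, scale).

    The difference of two independent Exp(1) variables is Laplace(0, 1).
    """
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


def privatize(value: float, sensitivity: float, epsilon: float, rng: random.Random) -> float:
    """Return value plus Laplace noise with scale b = sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon (the privacy budget) must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)
```

A smaller privacy budget gives a larger scale and therefore noisier readings, which is the usual trade-off between privacy and accuracy.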
After the data center has collected all the encrypted data, it performs various forms of data fusion on the encrypted data. Let $F$ be the fusion function and $C = \{c_1, c_2, \ldots, c_n\}$ be the set of all encrypted data; the fused encrypted data is then $F(C)$.
When the data center needs to analyze the data in depth, it uses the private key to decrypt the fused encrypted data. Let the decryption function be $\mathrm{Dec}$; the decryption result is $\mathrm{Dec}(F(C))$. Owing to the properties of homomorphic encryption and differential privacy, $\mathrm{Dec}(F(C)) = F(\{d_1', \ldots, d_n'\}) \approx F(\{d_1, \ldots, d_n\})$.
Specifically, the encryption adopts the Paillier encryption algorithm. For a plaintext $m$ and the corresponding ciphertext $c$, the encryption and decryption functions are defined as follows:
encryption function: $c = g^{m} \cdot r^{n} \bmod n^{2}$, wherein $n$ and $g$ are the public key and $r$ is a random number;
decryption function: $m = L(c^{\lambda} \bmod n^{2}) \cdot \mu \bmod n$, wherein $\lambda$ is the private key, $L(u) = (u - 1)/n$ and $\mu = \left(L(g^{\lambda} \bmod n^{2})\right)^{-1} \bmod n$.
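The additive homomorphism that makes fusion on ciphertexts possible can be shown with a toy Paillier sketch. It assumes tiny demonstration primes, the common generator choice $g = n + 1$, and readings already scaled to non-negative integers smaller than $n$; none of these choices comes from the application, and the helper names are hypothetical. It is not secure and is for illustration only.

```python
import math
import random
from typing import Tuple

PublicKey = Tuple[int, int]   # (n, g)
PrivateKey = Tuple[int, int]  # (lambda, mu)


def keygen(p: int, q: int) -> Tuple[PublicKey, PrivateKey]:
    """Build a Paillier key pair from two primes (tiny primes here, demo only)."""
    n = p * q
    g = n + 1                                              # assumed generator choice
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)      # lcm(p - 1, q - 1)
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)         # modular inverse, Python 3.8+
    return (n, g), (lam, mu)


def encrypt(pub: PublicKey, m: int, rng: random.Random) -> int:
    """c = g^m * r^n mod n^2 with random r coprime to n."""
    n, g = pub
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub: PublicKey, priv: PrivateKey, c: int) -> int:
    """m = L(c^lambda mod n^2) * mu mod n, with L(u) = (u - 1) / n."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub: PublicKey, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts."""
    n, _ = pub
    return (c1 * c2) % (n * n)
```

With `keygen(61, 53)`, encrypting 120 and 305 and decrypting the product of the two ciphertexts recovers their sum, so a sum-type fusion function can be evaluated by the data center without seeing individual readings.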
Specifically, the data cleaning and conversion process comprises the following steps:
collecting raw data from the different types of sensors;
after data collection is completed, cleaning the data, including deleting redundant data and identifying and handling missing values and abnormal values;
since heterogeneous sensors may have different data acquisition frequencies and timestamps, aligning the data in time;
since data formats and units may be inconsistent among the sensors, converting the data to make them consistent;
and finally, fusing all the processed sensor data to form a data set with a unified data format.
The unified data format includes the following elements:
timestamp: representing the time of data acquisition;
sensor identification: representing the source of the data, i.e., which sensor acquired the data;
location: representing the geographic location of the sensor;
measurement value: representing the actual data collected by the sensor;
units: representing the units of the measured value;
data quality: representing the quality of the data, such as signal strength or another indicator.
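The unified data format listed above could be represented as a small record type. The sketch below is one possible layout, assuming temperature readings converted to degrees Celsius; the `Record` field names and the raw dictionary keys (`ts`, `id`, `lat`, `lon`, `val`, `unit`, `rssi`) are hypothetical and are not specified by the application.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass(frozen=True)
class Record:
    """One measurement in the unified format (field names are illustrative)."""
    timestamp: datetime              # time of data acquisition (UTC)
    sensor_id: str                   # which sensor produced the value
    location: Tuple[float, float]    # (latitude, longitude) of the sensor
    value: float                     # measured value after unit conversion
    unit: str                        # unit of the measured value
    quality: Optional[float] = None  # e.g. signal strength, if reported


def to_celsius(value: float, unit: str) -> float:
    """Convert a temperature given in C, F or K to degrees Celsius."""
    if unit == "C":
        return value
    if unit == "F":
        return (value - 32.0) * 5.0 / 9.0
    if unit == "K":
        return value - 273.15
    raise ValueError(f"unsupported temperature unit: {unit!r}")


def normalize(raw: dict) -> Record:
    """Map one raw temperature reading (hypothetical keys) onto the unified format."""
    return Record(
        timestamp=datetime.fromtimestamp(raw["ts"], tz=timezone.utc),
        sensor_id=raw["id"],
        location=(raw["lat"], raw["lon"]),
        value=to_celsius(raw["val"], raw["unit"]),
        unit="C",
        quality=raw.get("rssi"),
    )
```

Other sensor types (humidity, illumination, soil pH) would need their own conversion functions, but could share the same record layout.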
The steps of the relevance analysis on the stored data are as follows:
constructing a time-series knowledge graph: a time-series knowledge graph is constructed from the collected sensor data, wherein each sensor is regarded as an entity, each measurement is regarded as another entity, and the "measures" relationship between a sensor and a measurement and the "at" relationship between a measurement and its timestamp are regarded as edges;
identifying the relevance: for any two sensors, their measurements at different times are analyzed to identify their correlation by calculating the correlation coefficient of their measurements, defined as:
$$\rho(X_i, X_j) = \frac{\operatorname{cov}(X_i, X_j)}{\sigma(X_i)\,\sigma(X_j)}$$
wherein $\operatorname{cov}$ represents covariance, $\sigma$ represents standard deviation, and $X_i$ and $X_j$ represent the measurement sequences of sensors $s_i$ and $s_j$;
adding relevance edges: for each sensor pair whose correlation coefficient is larger than a certain threshold, a relevance edge is added between the two sensors, with the correlation coefficient as the weight of the edge;
correlation analysis: important patterns in the sensor network are found by analyzing the attribute values and the relevance edges.
The important patterns include: closely related sensor groups, found with a community detection algorithm, and key factors affecting the measurement of a given sensor, identified with a path analysis method.
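The relevance-edge step can be sketched in a few lines. The example assumes that the measurement sequences have already been aligned to equal length and that a constant sequence is treated as uncorrelated; the function names `pearson` and `correlation_edges` are hypothetical.

```python
import math
from itertools import combinations
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient cov(x, y) / (sigma(x) * sigma(y))."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant sequence is treated as uncorrelated
    return cov / (sx * sy)


def correlation_edges(series: Dict[str, Sequence[float]],
                      threshold: float) -> List[Tuple[str, str, float]]:
    """Return (sensor_a, sensor_b, weight) for pairs whose coefficient exceeds the threshold."""
    edges = []
    for a, b in combinations(sorted(series), 2):
        r = pearson(series[a], series[b])
        if r > threshold:
            edges.append((a, b, r))
    return edges
```

Following the wording of the text, only coefficients above the threshold produce an edge, so strongly negative correlations are not linked here; comparing the absolute value instead would be a different design choice.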
The patterns covered by the pattern mining include:
periodic pattern: many sensor data exhibit periodic variation; analysis step: for a given sensor $s_i$, the measurement sequence $x_n$ is converted to the frequency domain using the Fourier transform and the dominant frequency components are identified, according to the formula:
$$X_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i k n / N}$$
wherein $N$ is the length of the sequence, $k$ runs from $0$ to $N-1$, and $n$ runs from $0$ to $N-1$;
abnormal pattern: the sensor data may contain outliers; analysis step: for a given sensor $s_i$, outliers in its measurement sequence are detected by statistical methods or machine learning methods;
cluster pattern: the sensor network contains closely related sensor groups whose measurement data have similar change patterns; analysis step: the measurement sequences of all sensors are clustered using a clustering algorithm, and the cluster groups are then identified.
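The periodic-pattern step can be illustrated with a direct implementation of the discrete Fourier transform. This is a minimal sketch for short sequences, assuming uniformly sampled data; the function names are hypothetical, and a fast Fourier transform routine would normally be used for long sequences.

```python
import cmath
import math
from typing import List, Sequence


def dft(x: Sequence[float]) -> List[complex]:
    """Direct O(N^2) discrete Fourier transform: X_k = sum_n x_n * exp(-2*pi*i*k*n/N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]


def dominant_frequency(x: Sequence[float]) -> int:
    """Return the index k in 1..N//2 of the strongest non-constant component."""
    N = len(x)
    mean = sum(x) / N
    spectrum = dft([v - mean for v in x])  # remove the mean so k = 0 does not dominate
    return max(range(1, N // 2 + 1), key=lambda k: abs(spectrum[k]))
```

For a sequence of length $N$ sampled every $\Delta t$ seconds, a dominant index $k$ corresponds to a period of $N \Delta t / k$ seconds.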
Compared with the prior art, the method has the following advantages. The technical scheme provides a large-scale wireless sensor network data fusion method in which sensor resources are effectively managed and allocated by means of an optimization model. The model takes into account the energy, communication capacity and working time of the sensors, so that each sensor is utilized as fully as possible and the total number of sensors is reduced as far as possible. The method cleans and converts the data to address the heterogeneity of the sensor network and of the sensed data, and can process data of various types and formats so as to unify and normalize them. The correlation between data is better understood by using the time-series knowledge graph technique, which allows the data to be understood from different angles and at different levels, thereby mining deeper patterns and trends. Data security is enhanced, and the sensor data is effectively protected during transmission and fusion. In practical application, the method yields accurate and reliable results at both the single-sensor level and the overall network level.
Drawings
Fig. 1 shows a schematic flow chart of an embodiment of the application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the application.
In this example, we assume that we are running a large farm and use a wireless sensor network to collect data to optimize planting results. In farms we may use various types of sensors including temperature sensors, humidity sensors, illumination sensors, soil pH sensors etc.
All the sensors start working according to the set time and position, and various data are collected. Because of the heterogeneity of the sensors, after data acquisition is completed, the raw data needs to be cleaned and converted, and all the data is standardized into a uniform data format.
After data standardization, a time-series knowledge graph is used to analyze the relevance among the various data. For example, by analyzing the joint distribution of temperature, humidity, light and pH, certain patterns can be found, such as light having a greater effect on plant growth under specific humidity and temperature conditions. We can then use these patterns to optimize the management decisions of the farm.
To protect the privacy of the data, we will encrypt the data using homomorphic encryption and add noise to the data using differential privacy methods before the sensor uploads the data. This allows us to perform data fusion and analysis on the data while ensuring data privacy.
Thus, a large-scale wireless sensor network data fusion method comprises the following steps:
step 1, designing a wireless sensor network comprising various heterogeneous sensors, and optimizing the deployment positions of various sensor nodes;
step 2, transmitting the original data acquired by each sensor to a data center through a wireless network for storage;
step 3, implementing a data cleaning and conversion process on the data center, and standardizing all original data into a unified data format;
step 4, performing association analysis and pattern mining fusion operation on the stored data on the data center.
Optimizing the deployment position of each sensor node comprises the following steps:
let the sensor set be $S = \{s_1, s_2, \ldots, s_n\}$. Each sensor $s_i$ has a predetermined total energy $E_i$, an energy consumption per unit time $e_i$, a total working time $T_i$ and a working time $t_i$. Define $d_{ij}$ as the distance between sensors $s_i$ and $s_j$, $S_d$ as the set of areas that can be covered, and $D$ as the communication range. If sensor $s_i$ can cover area $j$, then $a_{ij} = 1$; otherwise $a_{ij} = 0$.
Establishing a deployment position optimization model, and setting an objective function as follows: on the premise of meeting the data acquisition requirement, the fewest sensors are used, namely, the number of sensors is minimized, which is expressed as:
$$\min \sum_{i=1}^{n} x_i$$
The constraints are set as follows:
all functional areas need to be covered;
the energy of each sensor cannot exceed the preset energy level;
the working time of each sensor must not exceed the preset working time;
the communication distance between each pair of sensors cannot exceed the maximum communication distance C;
only when sensor $s_i$ is selected and at least one other sensor $s_j$ is in direct communication with $s_i$ can the data of $s_i$ be collected;
wherein $x_i$ is a binary sensor variable: when sensor $s_i$ is selected, $x_i = 1$, otherwise $x_i = 0$; $y_{ij}$ is a binary variable: when a direct communication link exists between sensors $s_i$ and $s_j$, $y_{ij} = 1$, otherwise $y_{ij} = 0$.
The process for solving the deployment position optimization model comprises the following steps:
step 101, initializing: setting the upper bound UB to a large value, $+\infty$; initializing the lower bound LB as the optimal solution of the unconstrained problem;
step 102, creating a search tree: initializing the root node of the search tree, with the whole problem as the sub-problem of the root node; creating a set of variables including the sensor variables $x_i$ and the communication variables $y_{ij}$; creating a set of constraints including the area coverage constraint, the energy constraint, the working time constraint, the communication constraint, and the sensor selection and intermediary sensor constraint; and defining the objective function, namely minimizing the number of sensors;
step 103, selecting a branch variable: in the current sub-problem, selecting a variable that has not yet been branched on, heuristically choosing the sensor variable $x_i$ with the greatest number of active constraints; for the current sub-problem, calculating for each sensor variable $x_i$ the number of active constraints, that is, the number of constraints in which the sensor variable is involved in the current sub-problem; selecting the sensor variable $x_i$ with the largest number of active constraints as the branch variable, which means that the sensor variable with the greatest influence or relevance is branched on first, helping the search converge more quickly to the optimal solution; if several sensor variables share the same maximum number of active constraints, branching on the one whose estimated computation time is shortest; for each sensor variable $x_i$, the estimated computation time is obtained by accumulating the computation times of the operations related to that sensor variable;
step 104, for the selected branch variable $x_i$, creating two sub-problems: sub-problem one: setting $x_i$ to 1 and updating the problem according to the constraints; sub-problem two: setting $x_i$ to 0 and updating the problem according to the constraints;
step 105, for each sub-problem, solving the mixed integer programming problem to obtain a lower bound LB and a feasible solution, and discarding the sub-problem if the lower bound LB is greater than the current upper bound UB;
step 106, if the optimal solution of a sub-problem is smaller than the current upper bound UB, updating the upper bound UB to that optimal solution;
step 107, for each sub-problem, discarding the sub-problem if its optimal solution is greater than the current upper bound UB;
step 108, terminating the search if all the sub-problems in the search tree have been discarded, or if all the sub-problems in the search tree have been solved and an optimal solution has been found;
step 109, if the termination condition is not satisfied, returning to step 103 and selecting the next variable that has not been branched on;
step 110, returning the optimal sensor selection scheme found in the search tree, which meets the data acquisition requirements and communication limitations while minimizing the number of sensors.
Transmitting the original data through the wireless network comprises the following steps:
each sensor $s_i$ acquires raw data $d_i$;
at each sensor $s_i$, noise $n_i$ satisfying a Laplace distribution or a Gaussian distribution is added to the data $d_i$ to obtain noisy data $d_i' = d_i + n_i$. For noise following the Laplace distribution, the probability density function is:
$$f(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$$
wherein $\mu$ is a location parameter and $b$ is a scale parameter; to satisfy $\varepsilon$-differential privacy, the scale parameter is $b = \Delta f / \varepsilon$, wherein $\Delta f$ is the sensitivity of the function $f$ on neighboring databases and $\varepsilon$ is the privacy budget;
the noisy data $d_i'$ is homomorphically encrypted to obtain encrypted data $c_i = \mathrm{Enc}(d_i')$;
each sensor transmits its encrypted data $c_i$ to the data center.
After the data center has collected all the encrypted data, it performs various forms of data fusion on the encrypted data. Let $F$ be the fusion function and $C = \{c_1, c_2, \ldots, c_n\}$ be the set of all encrypted data; the fused encrypted data is then $F(C)$.
When the data center needs to analyze the data in depth, it uses the private key to decrypt the fused encrypted data. Let the decryption function be $\mathrm{Dec}$; the decryption result is $\mathrm{Dec}(F(C))$. Owing to the properties of homomorphic encryption and differential privacy, $\mathrm{Dec}(F(C)) = F(\{d_1', \ldots, d_n'\}) \approx F(\{d_1, \ldots, d_n\})$.
Specifically, the encryption adopts the Paillier encryption algorithm. For a plaintext $m$ and the corresponding ciphertext $c$, the encryption and decryption functions are defined as follows:
encryption function: $c = g^{m} \cdot r^{n} \bmod n^{2}$, wherein $n$ and $g$ are the public key and $r$ is a random number;
decryption function: $m = L(c^{\lambda} \bmod n^{2}) \cdot \mu \bmod n$, wherein $\lambda$ is the private key, $L(u) = (u - 1)/n$ and $\mu = \left(L(g^{\lambda} \bmod n^{2})\right)^{-1} \bmod n$.
Specifically, the data cleaning and conversion process comprises the following steps:
collecting raw data from the different types of sensors;
after data collection is completed, cleaning the data, including deleting redundant data and identifying and handling missing values and abnormal values;
since heterogeneous sensors may have different data acquisition frequencies and timestamps, aligning the data in time;
since data formats and units may be inconsistent among the sensors, converting the data to make them consistent;
and finally, fusing all the processed sensor data to form a data set with a unified data format.
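The cleaning and alignment steps can be sketched for a single sensor as follows. This is one possible approach under stated assumptions, not the procedure fixed by the application: missing readings are represented as `None`, the first reading is kept when a timestamp is duplicated, and alignment uses linear interpolation onto a shared time grid. The function names are hypothetical.

```python
from typing import Iterable, List, Optional, Sequence, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, value)


def clean(samples: Iterable[Tuple[float, Optional[float]]]) -> List[Sample]:
    """Drop missing values and duplicate timestamps, then sort by time."""
    seen = {}
    for t, v in samples:
        if v is None:
            continue            # missing value
        seen.setdefault(t, v)   # redundant reading: keep the first per timestamp
    return sorted(seen.items())


def interpolate(samples: Sequence[Sample], t: float) -> Optional[float]:
    """Linearly interpolate a cleaned series at time t; None outside its time range."""
    if not samples or t < samples[0][0] or t > samples[-1][0]:
        return None
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return samples[0][1]        # single-sample series queried at its own timestamp


def align(samples: Sequence[Sample], grid: Sequence[float]) -> List[Optional[float]]:
    """Resample one sensor's cleaned series onto a shared time grid."""
    return [interpolate(samples, t) for t in grid]
```

Applying `align` with the same grid to every sensor yields sequences of equal length that can be compared point by point in the later relevance analysis.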
The unified data format includes the following elements:
timestamp: representing the time of data acquisition;
sensor identification: representing the source of the data, i.e., which sensor acquired the data;
location: representing the geographic location of the sensor;
measurement value: representing the actual data collected by the sensor;
units: representing the units of the measured value;
data quality: representing the quality of the data, such as signal strength or another indicator.
The steps of the relevance analysis on the stored data are as follows:
constructing a time-series knowledge graph: a time-series knowledge graph is constructed from the collected sensor data, wherein each sensor is regarded as an entity, each measurement is regarded as another entity, and the "measures" relationship between a sensor and a measurement and the "at" relationship between a measurement and its timestamp are regarded as edges;
identifying the relevance: for any two sensors, their measurements at different times are analyzed to identify their correlation by calculating the correlation coefficient of their measurements, defined as:
$$\rho(X_i, X_j) = \frac{\operatorname{cov}(X_i, X_j)}{\sigma(X_i)\,\sigma(X_j)}$$
wherein $\operatorname{cov}$ represents covariance, $\sigma$ represents standard deviation, and $X_i$ and $X_j$ represent the measurement sequences of sensors $s_i$ and $s_j$;
adding relevance edges: for each sensor pair whose correlation coefficient is larger than a certain threshold, a relevance edge is added between the two sensors, with the correlation coefficient as the weight of the edge;
correlation analysis: important patterns in the sensor network are found by analyzing the attribute values and the relevance edges.
The important patterns include: closely related sensor groups, found with a community detection algorithm, and key factors affecting the measurement of a given sensor, identified with a path analysis method.
The pattern mining comprises:
periodic pattern: much sensor data exhibits periodic variation; analysis step: for a given sensor $s_i$, its measurement sequence $D_i$ is converted to the frequency domain using the Fourier transform and the dominant frequency components are identified; mathematical formula: $F(k) = \sum_{j=0}^{N-1} D_i(j)\, e^{-2\pi \mathrm{i}\, jk/N}$, wherein N is the length of the sequence, j runs from 0 to N-1, k runs from 0 to N-1, and $\mathrm{i}$ in the exponent denotes the imaginary unit;
abnormal pattern: the sensor data may contain outliers; analysis step: for a given sensor $s_i$, the outliers in its measurement sequence $D_i$ are detected using statistical or machine learning methods;
cluster pattern: the sensor network contains closely related sensor groups whose measurement data show similar patterns of change; analysis step: the measurement sequences of all sensors are clustered using a clustering algorithm, and the cluster groups are then identified.
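The periodic-pattern step can be illustrated with a naive discrete Fourier transform. The sketch below is an assumption-laden example rather than the patented implementation: the function names and the synthetic signal are hypothetical, and a production system would normally use an FFT library instead of this O(N^2) loop.

```python
import cmath
import math


def dft(sequence):
    """Naive DFT: F(k) = sum over j of x_j * exp(-2*pi*i*j*k/N)."""
    n = len(sequence)
    return [
        sum(x * cmath.exp(-2j * math.pi * j * k / n) for j, x in enumerate(sequence))
        for k in range(n)
    ]


def dominant_frequency_bin(sequence):
    """Return the non-zero frequency index k in 1..N//2 with the largest magnitude."""
    mean = sum(sequence) / len(sequence)
    spectrum = dft([x - mean for x in sequence])  # remove the constant offset first
    candidates = range(1, len(sequence) // 2 + 1)
    return max(candidates, key=lambda k: abs(spectrum[k]))


# Hypothetical signal: 32 samples holding exactly 4 cycles on top of an offset.
samples = [10.0 + 3.0 * math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
k = dominant_frequency_bin(samples)
period_in_samples = len(samples) / k
```

Here the dominant bin is k = 4, which corresponds to a period of 8 samples.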
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

Claims (8)

1. A data fusion method for a large-scale wireless sensor network, characterized by comprising the following steps:
step 1, designing a wireless sensor network comprising various heterogeneous sensors, and optimizing the deployment positions of various sensor nodes;
step 2, transmitting the original data acquired by each sensor to a data center through a wireless network for storage;
step 3, implementing a data cleaning and conversion process on the data center, and standardizing all original data into a unified data format;
step 4, performing correlation analysis and pattern mining fusion operations on the stored data at the data center;
the optimizing of the deployment positions of the sensor nodes comprises the following steps:
let the sensor set be $S = \{s_1, s_2, \ldots, s_n\}$; each sensor $s_i$ has a preset total energy $E_i$, an energy consumption per unit time $e_i$, a total working time $T_i$, and an operating time $t_i$; $d_{ij}$ is defined as the distance between sensors $s_i$ and $s_j$; $S_d$ is the set of areas that can be covered; D is the communication range; if sensor $s_i$ can cover area j, $a_{ij} = 1$, otherwise $a_{ij} = 0$;
establishing a deployment position optimization model, with the objective function set as minimizing the number of sensors used while meeting the data acquisition requirements, namely $\min \sum_{i=1}^{n} x_i$; the constraints are set as follows:
all functional areas must be covered; the energy consumed by each sensor cannot exceed its preset total energy; the working time of each sensor cannot exceed its preset total working time; the communication distance between each pair of sensors cannot exceed the maximum communication distance C; the data of sensor $s_i$ can be collected only when $s_i$ is selected and at least one other sensor $s_j$ in direct communication with $s_i$ exists; $x_i$ is a binary sensor variable, with $x_i = 1$ when sensor $s_i$ is selected and $x_i = 0$ otherwise; $y_{ij}$ is a binary variable, with $y_{ij} = 1$ when a direct communication link exists between sensors $s_i$ and $s_j$ and $y_{ij} = 0$ otherwise;
wherein transmitting the original data through the wireless network comprises the following steps:
each sensor $s_i$ collects original data $d_i$;
noise $n_i$ satisfying a Laplace distribution or a Gaussian distribution is added to the data $d_i$ of each sensor $s_i$ to obtain noisy data $d'_i = d_i + n_i$; for Laplace-distributed noise, the probability density function is:
$P(x \mid \mu, b) = \frac{1}{2b} \exp\left(-\frac{|x - \mu|}{b}\right)$;
where μ is the location parameter and b is the scale parameter; when ε-differential privacy is to be satisfied, the scale parameter is b = Δf/ε, where Δf is the sensitivity of the function f on neighboring databases and ε is the privacy budget; the noisy data $d'_i$ is homomorphically encrypted to obtain encrypted data $c_i = \mathrm{Enc}(d'_i)$;
each sensor transmits its encrypted data $c_i$ to the data center;
after collecting all the encrypted data, the data center performs various forms of data fusion on the encrypted data; letting F be the fusion function and C the set of all encrypted data, the fused encrypted data is $c' = F(C)$; when the data center needs to analyze the data in depth, it decrypts the fused encrypted data with the key; letting Dec be the decryption function, the decryption result is $x' = \mathrm{Dec}(c')$.
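The noise-addition step of claim 1 can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the sample readings, and the values Δf = 1.0 and ε = 0.5 are hypothetical, and Python's `random` module is not a cryptographically secure source of randomness, so a real deployment would need a vetted differential-privacy library.

```python
import random


def laplace_noise(scale, rng):
    """Draw from Laplace(mu=0, b=scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize(value, sensitivity, epsilon, rng):
    """Return d' = d + n with n ~ Laplace(0, b) and b = sensitivity / epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)              # seeded only to make the demo repeatable
d = [21.4, 21.9, 22.3]              # hypothetical raw readings d_i
noisy = [privatize(x, sensitivity=1.0, epsilon=0.5, rng=rng) for x in d]
```

A smaller ε gives a larger scale b and therefore more noise, which is the usual trade-off between privacy and accuracy.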
2. The method for fusing large-scale wireless sensor network data according to claim 1, wherein the process of solving the deployment location optimization model comprises the following steps:
step 101, initializing: setting the upper bound UB to +∞; initializing the lower bound LB as the optimal solution of the unconstrained problem;
step 102, creating a search tree: initializing the root node of the search tree, with the whole problem as the sub-problem of the root node; creating a set of variables including the sensor variables $x_i$ and the communication variables $y_{ij}$; creating a set of constraints including the area coverage constraint, the energy constraint, the working time constraint, the communication limit constraint, and the sensor selection and intermediary sensor constraint; and defining the objective function, namely minimizing the number of sensors;
step 103, selecting a branch variable: in the current sub-problem, selecting a variable that has not yet been branched on, heuristically choosing the sensor variable $x_i$ with the largest number of active constraints; for the current sub-problem, the number of active constraints of each sensor variable $x_i$ is calculated, that is, the number of constraints in the current sub-problem in which the sensor variable is involved; the sensor variable $x_i$ with the largest number of active constraints is selected as the branch variable, which means that the sensor variable with the greatest influence or relevance is branched on first, helping the search converge to the optimal solution more quickly; if several sensor variables share the same largest number of active constraints, the variable with the shortest estimated computation time is used for branching; for each sensor variable $x_i$, the estimated computation time is obtained by accumulating the computation times of the operations related to that sensor variable;
step 104, for the selected branch variable $x_i$, creating two sub-problems: sub-problem one: setting $x_i$ to 1 and updating the problem according to the constraints; sub-problem two: setting $x_i$ to 0 and updating the problem according to the constraints;
step 105, for each sub-problem, solving the mixed integer programming problem to obtain a lower bound LB and a feasible solution, and discarding the sub-problem if the lower bound LB is greater than the current upper bound UB;
step 106, if the optimal solution of a certain sub-problem is smaller than the current upper bound UB, updating the upper bound UB to the optimal solution;
step 107, for each sub-problem, discarding the sub-problem if its optimal solution is greater than the current upper bound UB;
step 108, terminating the search if all the sub-problems in the search tree have been discarded, or if all the sub-problems in the search tree have been solved and an optimal solution has been found;
step 109, if the termination condition is not satisfied, returning to step 103, and selecting the next variable which is not branched to perform branching operation;
step 110, returning the optimal sensor selection scheme found in the search tree, which meets the data acquisition requirements and communication limits while minimizing the number of sensors.
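The branch-and-bound loop of steps 101 to 110 can be illustrated on a deliberately simplified version of the model. The sketch below keeps only the area coverage constraint and drops the energy, working time, and communication constraints. It uses a simple counting bound instead of solving a mixed integer program at each node, and it prunes ties as well as strictly worse nodes. The function name `min_sensor_cover` and the sample coverage table are assumptions made for illustration.

```python
def min_sensor_cover(coverage, areas):
    """Branch and bound for: minimise sum(x_i) such that every area is covered.

    coverage maps a sensor id to the set of areas it covers (a_ij = 1).
    Returns the smallest selection found, or None if no full cover exists.
    """
    best = {"ub": float("inf"), "selection": None}  # upper bound UB, incumbent

    def branch(chosen, free, uncovered):
        if not uncovered:  # feasible: update the upper bound if improved
            if len(chosen) < best["ub"]:
                best["ub"], best["selection"] = len(chosen), sorted(chosen)
            return
        # Active coverage constraints each free sensor variable takes part in.
        gains = {s: len(coverage[s] & uncovered) for s in free}
        largest = max(gains.values(), default=0)
        if largest == 0:
            return  # infeasible: no remaining sensor covers anything new
        # Lower bound: every extra sensor covers at most `largest` new areas.
        lower = len(chosen) + -(-len(uncovered) // largest)  # ceiling division
        if lower >= best["ub"]:
            return  # bound: this sub-problem cannot beat the incumbent
        pick = max(sorted(free), key=lambda s: gains[s])  # branch variable
        rest = free - {pick}
        branch(chosen | {pick}, rest, uncovered - coverage[pick])  # x_pick = 1
        branch(chosen, rest, uncovered)                            # x_pick = 0

    branch(frozenset(), frozenset(coverage), frozenset(areas))
    return best["selection"]


# Hypothetical coverage table: five candidate sensors, six areas.
coverage = {
    "s1": {1, 2, 3},
    "s2": {3, 4},
    "s3": {4, 5, 6},
    "s4": {1, 6},
    "s5": {2, 5},
}
selection = min_sensor_cover(coverage, areas={1, 2, 3, 4, 5, 6})
```

For this table the search returns s1 and s3, the only pair that covers all six areas.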
3. The method for fusing large-scale wireless sensor network data according to claim 2, wherein in the encrypting process, for plaintext x and corresponding ciphertext c, the encryption and decryption functions are defined as follows:
encryption function: $\mathrm{Enc}(x) = g^{x} \cdot r^{n} \bmod n^{2}$, wherein g and n are the public key and r is a random number;
decryption function: $\mathrm{Dec}(c) = L(c^{\lambda} \bmod n^{2}) / L(g^{\lambda} \bmod n^{2})$, where λ is the private key and $L(u) = (u - 1)/n$.
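The encryption and decryption functions of claim 3 have the form of the Paillier cryptosystem, which is additively homomorphic. The toy sketch below shows the round trip and one possible fusion function F, the sum of readings obtained by multiplying ciphertexts. Several points are assumptions rather than content of the claim: the choice g = n + 1, the tiny primes 1009 and 1013, the integer-scaled readings, and the helper names. In standard Paillier the quotient in Dec is taken modulo n using a modular inverse, which is what the code does. Keys this small, and the non-cryptographic `random` module, are unsuitable for real use.

```python
import math
import random


def l_function(u, n):
    """L(u) = (u - 1) / n, an exact integer division for valid inputs."""
    return (u - 1) // n


def keygen(p, q):
    """Build a toy key pair from two primes p and q."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lambda = lcm(p-1, q-1)
    g = n + 1                                           # assumed choice of g
    return (n, g), lam


def encrypt(public, x, rng):
    """Enc(x) = g^x * r^n mod n^2 with a random r coprime to n."""
    n, g = public
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, x, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(public, lam, c):
    """Dec(c) = L(c^lambda mod n^2) / L(g^lambda mod n^2), taken modulo n."""
    n, g = public
    numerator = l_function(pow(c, lam, n * n), n)
    denominator = l_function(pow(g, lam, n * n), n)
    return (numerator * pow(denominator, -1, n)) % n


rng = random.Random(7)               # seeded only to make the demo repeatable
public, lam = keygen(1009, 1013)
readings = [215, 221, 219]           # hypothetical readings in tenths of a degree
ciphertexts = [encrypt(public, d, rng) for d in readings]

n_squared = public[0] ** 2
fused = 1
for c in ciphertexts:
    fused = (fused * c) % n_squared  # ciphertext product = plaintext sum
total = decrypt(public, lam, fused)
```

Decrypting the fused ciphertext yields 655, the sum of the three readings, without the data center ever decrypting an individual reading.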
4. The method for fusing data of a large-scale wireless sensor network according to claim 1, wherein the data cleaning and converting process comprises the steps of:
collecting raw data from different types of sensors;
cleaning the data after the data collection is completed, including deleting redundant data and identifying and processing missing values and abnormal values; aligning the data of heterogeneous sensors;
converting the sensor data;
and finally, fusing all the processed sensor data to form a data set with a uniform data format.
5. The method for fusing large-scale wireless sensor network data according to claim 4, wherein the unified data format comprises the following elements:
timestamp: represents the time of data acquisition;
sensor identification: represents the source of the data, i.e., which sensor acquired it;
location: represents the geographic location of the sensor;
measurement value: represents the actual data collected by the sensor;
unit: represents the unit of the measurement value;
data quality: represents the quality of the data, such as signal strength or another indicator.
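The cleaning and conversion process of claim 4 and the unified format of claim 5 can be sketched together. Everything specific here is an assumption made for illustration: the raw dictionary keys (`id`, `epoch_s`, `lat`, `lon`, `value`, `unit`, `rssi`), the Fahrenheit-to-Celsius conversion, and the use of signal strength as the quality indicator.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Record:
    """One reading in the unified format: the six elements listed in claim 5."""
    timestamp: datetime
    sensor_id: str
    location: tuple          # (latitude, longitude)
    value: float
    unit: str
    quality: Optional[float]


def normalize(raw):
    """Convert one raw reading to a Record, or return None for a missing value."""
    if raw.get("value") is None:
        return None
    value, unit = float(raw["value"]), raw["unit"]
    if unit == "F":  # convert Fahrenheit readings to Celsius
        value, unit = (value - 32.0) * 5.0 / 9.0, "C"
    return Record(
        timestamp=datetime.fromtimestamp(raw["epoch_s"], tz=timezone.utc),
        sensor_id=raw["id"],
        location=(raw["lat"], raw["lon"]),
        value=round(value, 2),
        unit=unit,
        quality=raw.get("rssi"),
    )


def clean(raw_readings):
    """Drop missing values and exact duplicates, then order records by time."""
    records = {normalize(r) for r in raw_readings} - {None}
    return sorted(records, key=lambda rec: (rec.timestamp, rec.sensor_id))


raw = [
    {"id": "t1", "epoch_s": 1700000000, "lat": 28.2, "lon": 112.9,
     "value": 68.0, "unit": "F", "rssi": -61},
    {"id": "t1", "epoch_s": 1700000000, "lat": 28.2, "lon": 112.9,
     "value": 68.0, "unit": "F", "rssi": -61},   # redundant duplicate
    {"id": "t2", "epoch_s": 1700000060, "lat": 28.3, "lon": 112.8,
     "value": None, "unit": "C"},                 # missing value
    {"id": "t3", "epoch_s": 1699999990, "lat": 28.1, "lon": 113.0,
     "value": 19.5, "unit": "C", "rssi": -70},
]
records = clean(raw)
```

Making the record type frozen lets identical readings collapse in a set, which is one simple way to delete redundant data.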
6. The method for fusing large-scale wireless sensor network data according to claim 1, wherein the correlation analysis of the stored data comprises the following steps:
constructing a time-series knowledge graph: a time-series knowledge graph is constructed from the collected sensor data, wherein each sensor is treated as an entity, each measurement is treated as another entity, and the "measurement" relationship between a sensor and a measurement and the "on" relationship between a measurement and its timestamp are treated as edges;
identifying the correlation: for any two sensors, their measurements at different times are analyzed to identify their correlation by calculating the correlation coefficient of their measurements, defined as:
$r_{ij} = \operatorname{cov}(D_i, D_j) / (\sigma_i \sigma_j)$;
wherein cov represents covariance, σ represents standard deviation, and $D_i$ and $D_j$ represent the measurement sequences of sensors i and j;
adding a correlation edge: for each sensor pair whose correlation coefficient is greater than a set threshold, a correlation edge is added between the two sensors, with the correlation coefficient as the weight of the edge;
correlation analysis: important patterns in the sensor network are found by analyzing the attribute values and the correlation edges.
7. The method for fusing data of a large-scale wireless sensor network according to claim 6, wherein the important patterns include: closely related sensor groups, found using a community detection algorithm, or key factors affecting the measurements of a given sensor, identified using a path analysis method.
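The time-series knowledge graph of claim 6 can be represented as a list of (head, relation, tail) triples. The sketch below is one possible encoding and not the claimed data structure: the measurement identifiers `m0`, `m1`, ... and the extra "value" relation that stores the measured value as an attribute are assumptions.

```python
from collections import defaultdict


def build_graph(readings):
    """Turn (sensor_id, timestamp, value) readings into knowledge-graph triples."""
    triples = []
    for index, (sensor, timestamp, value) in enumerate(readings):
        measurement = f"m{index}"                            # measurement entity
        triples.append((sensor, "measurement", measurement)) # sensor -> measurement
        triples.append((measurement, "on", timestamp))       # measurement -> timestamp
        triples.append((measurement, "value", value))        # assumed attribute edge
    return triples


def series_by_sensor(triples):
    """Recover each sensor's time-ordered measurement sequence from the triples."""
    owner, when, value = {}, {}, {}
    for head, relation, tail in triples:
        if relation == "measurement":
            owner[tail] = head
        elif relation == "on":
            when[head] = tail
        elif relation == "value":
            value[head] = tail
    series = defaultdict(list)
    for measurement in sorted(owner, key=lambda m: when[m]):
        series[owner[measurement]].append(value[measurement])
    return dict(series)


# Hypothetical readings, deliberately out of time order.
readings = [("s1", 2, 20.5), ("s2", 1, 7.0), ("s1", 1, 20.0)]
triples = build_graph(readings)
series = series_by_sensor(triples)
```

The recovered sequences are the inputs from which the correlation coefficient between two sensors is computed.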
8. The method for fusing data of a large-scale wireless sensor network according to claim 1, wherein the pattern mining comprises:
periodic pattern: much sensor data exhibits periodic variation; analysis step: for a given sensor $s_i$, its measurement sequence $D_i$ is converted to the frequency domain using the Fourier transform and the dominant frequency components are identified; mathematical formula: $F(k) = \sum_{j=0}^{N-1} D_i(j)\, e^{-2\pi \mathrm{i}\, jk/N}$, where N is the length of the sequence, j runs from 0 to N-1, k runs from 0 to N-1, and $\mathrm{i}$ in the exponent denotes the imaginary unit;
abnormal pattern: the sensor data contains outliers; analysis step: for a given sensor $s_i$, the outliers in its measurement sequence $D_i$ are detected using statistical or machine learning methods;
cluster pattern: the sensor network contains closely related sensor groups whose measurement data show similar patterns of change; analysis step: the measurement sequences of all sensors are clustered using a clustering algorithm, and the cluster groups are then identified.
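The claim leaves the statistical method for the abnormal pattern open. One common choice, shown here purely as an example, is the modified z-score based on the median absolute deviation (MAD). The 0.6745 constant and the 3.5 cutoff are the conventional values for that test, and the function name and sample sequence are assumptions.

```python
from statistics import median


def mad_outliers(sequence, cutoff=3.5):
    """Return indices whose modified z-score 0.6745 * |x - median| / MAD exceeds cutoff."""
    centre = median(sequence)
    mad = median([abs(x - centre) for x in sequence])
    if mad == 0:
        return []  # spread too small to judge; report nothing in this sketch
    return [
        index
        for index, x in enumerate(sequence)
        if 0.6745 * abs(x - centre) / mad > cutoff
    ]


# Hypothetical measurement sequence D_i with one spike at index 4.
sequence = [20.1, 20.3, 19.9, 20.0, 35.0, 20.2, 20.1]
outlier_indices = mad_outliers(sequence)
```

Because the median and the MAD are barely affected by the spike itself, the reading of 35.0 is flagged while the ordinary fluctuations around 20 are not.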
CN202310991312.2A 2023-08-08 2023-08-08 Large-scale wireless sensor network data fusion method Active CN116709392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310991312.2A CN116709392B (en) 2023-08-08 2023-08-08 Large-scale wireless sensor network data fusion method


Publications (2)

Publication Number Publication Date
CN116709392A CN116709392A (en) 2023-09-05
CN116709392B true CN116709392B (en) 2023-11-14

Family

ID=87837953

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310991312.2A Active CN116709392B (en) 2023-08-08 2023-08-08 Large-scale wireless sensor network data fusion method

Country Status (1)

Country Link
CN (1) CN116709392B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575123B (en) * 2024-01-15 2024-03-29 成都电科星拓科技有限公司 Sowing path planning method, sowing path planning device, electronic equipment and readable storage medium
CN118368593B (en) * 2024-06-18 2024-09-20 广东凯得智能科技股份有限公司 Wireless communication method, system and equipment for measuring and transmitting temperature and humidity data

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008131787A1 (en) * 2007-04-25 2008-11-06 Nec Europe Ltd. Method for aggregating data in a network
CN102196429A (en) * 2011-04-27 2011-09-21 暨南大学 Encrypted data fusion method for wireless sensor network
CN102299792A (en) * 2011-09-30 2011-12-28 北京理工大学 Method for safely and efficiently fusing data
WO2012143931A2 (en) * 2011-04-21 2012-10-26 Tata Consultancy Services Limited A method and system for preserving privacy during data aggregation in a wireless sensor network
CN103476040A (en) * 2013-09-24 2013-12-25 重庆邮电大学 Distributed compressed sensing data fusion method having privacy protection effect
US9491060B1 (en) * 2014-06-30 2016-11-08 EMC IP Holding Company LLC Integrated wireless sensor network (WSN) and massively parallel processing database management system (MPP DBMS)
CN106255038A (en) * 2016-08-04 2016-12-21 南京邮电大学 A kind of wireless sensor network security data fusion method
CN109478057A (en) * 2016-05-09 2019-03-15 强力物联网投资组合2016有限公司 Method and system for the Industrial Internet of Things
CN110134675A (en) * 2019-05-23 2019-08-16 大连海事大学 Data cleaning method and system for ocean data stream
CN111553469A (en) * 2020-05-18 2020-08-18 国网江苏省电力有限公司电力科学研究院 A wireless sensor network data fusion method, device and storage medium
CN114071631A (en) * 2020-11-10 2022-02-18 北京市天元网络技术股份有限公司 Distributed sensor network data fusion method and system
CN114745689A (en) * 2022-04-07 2022-07-12 江西师范大学 Multi-time-segment data fusion method and system for wireless sensor network
CN115243273A (en) * 2022-09-23 2022-10-25 昆明理工大学 A wireless sensor network coverage optimization method and device, equipment, medium
CN116456296A (en) * 2023-04-07 2023-07-18 南京航空航天大学 Data Fusion Control Method for Heterogeneous Wireless Sensor Networks Based on Fuzzy Logic

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11599806B2 (en) * 2020-06-22 2023-03-07 International Business Machines Corporation Depth-constrained knowledge distillation for inference on encrypted data


Also Published As

Publication number Publication date
CN116709392A (en) 2023-09-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: A Large scale Wireless Sensor Network Data Fusion Method

Granted publication date: 20231114

Pledgee: Changsha Lixiang Road Sub Branch of China Everbright Bank Co.,Ltd.

Pledgor: HUNAN TIANLIAN CITY DATA CONTROL Co.,Ltd.

Registration number: Y2024980019060
