Banking on Big Data

Every day, thousands upon thousands of monitoring stations around the world collect vast quantities of air quality data used to spot pollution problems, analyze air quality trends, and guide effective responses. To date, these monitoring stations have served as digital eyes and ears trained on the planet's atmosphere. That seems likely to change in the not-too-distant future, however, as the networks of air sensors just now being deployed around the globe produce an avalanche of data with the very real potential to overwhelm those trying to make sense of it.

The data challenge facing the world's air monitoring stations stems primarily from rapid improvements in sensor technology. Advances in design and manufacturing have significantly improved the quality of the data these sensors produce, quickly making them more reliable than in the past. And as sensor quality rises, cost is falling: air quality sensors can now be had for as little as $200 each, although the data they produce is less accurate than that of a traditional monitoring station.

These two trends mean that much denser monitoring networks can be created. In some cities, sensors are being mounted on street lights to create a highly granular picture of air quality conditions. Chicago's Array of Things, for example, is a network of interactive, modular sensor boxes installed around the city to collect real-time data on factors that affect livability, such as climate, air quality, and noise.

In other locations, sensors are being placed on buildings and other structures. Their lower cost also means they can be deployed in sparsely populated areas where installing an expensive monitoring station would once have been regarded as wasteful.

On their own, these advances would translate into more data from more sensor locations. Imagine, though, the impact on data availability and collection when a third trend is added: Internet of Things (IoT) communication platforms paired with new machine learning algorithms, both poised to enter the mainstream. These developments will herald a new era of air quality data management as the technology exponentially increases both the volume and the value of the data available.

They will also present a huge problem: the techniques used to analyze data from tens of thousands of monitoring stations will no longer be effective when the number of monitors climbs into the millions. The data avalanche produced by such massive sensor networks simply cannot be managed with traditional approaches.

The sheer volume of information now available means data analysts risk missing valuable insights. Monitoring agencies, meanwhile, are likely to remain in a permanently reactive mode rather than being positioned to take proactive steps that minimize or prevent air quality issues. In short, rather than being a benefit, all of this additional data may in fact become an enormous hindrance.

Taking a Different Approach
To overcome the challenges posed by this dramatic increase in available data, a different approach is clearly required. Sophisticated tools capable of collecting and analyzing massive data sets, then displaying the results in visual form, are no longer optional; they are becoming a necessity.

Fortunately, the answer may well lie in adapting big data analytics software already in use in other fields. Powered by high-performance analytics systems, big data analytics combines complex applications with elements such as predictive models, statistical algorithms, and what-if analyses. Taking this approach, however, would require a shift in mindset within the air quality community: because big data analytics enables data scientists, predictive modelers, statisticians, and other analytics professionals to work through huge volumes of both structured and unstructured data, air quality professionals would need to embrace the idea that the more data they have, the better off they are.
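To make the workflow concrete, the Python sketch below pairs a simple predictive model with a what-if analysis in the spirit described above. The column names, the synthetic readings, and the traffic-reduction scenario are all assumptions made for illustration, not any particular platform's implementation.

```python
# A minimal sketch, assuming hourly sensor readings with illustrative
# column names; the data is synthetic, generated only so the example
# runs end to end.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
features = pd.DataFrame({
    "traffic_volume": rng.integers(100, 5000, n),   # vehicles per hour
    "wind_speed": rng.uniform(0, 15, n),            # m/s
    "temperature": rng.uniform(-5, 35, n),          # deg C
})
# Synthetic PM2.5 target: rises with traffic, falls as wind disperses it.
pm25 = (0.004 * features["traffic_volume"]
        - 1.5 * features["wind_speed"]
        + 0.2 * features["temperature"]
        + rng.normal(0, 3, n)).clip(lower=0)

# Predictive model: learn how conditions drive particulate levels.
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(features, pm25)

# What-if analysis: predicted PM2.5 if traffic volume fell by 30 percent.
scenario = features.copy()
scenario["traffic_volume"] = scenario["traffic_volume"] * 0.7
print("baseline mean PM2.5:", model.predict(features).mean())
print("scenario mean PM2.5:", model.predict(scenario).mean())
```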

While this marks something of a departure from current practice, it would sharpen the focus on trends that emerge from the data, enabling scientists to study correlations between data sets and to contextualize data geographically. In an area like Long Beach, Calif., for example, this approach could lead analysts to conclude that, while industrial facilities are typically blamed for the area's air pollution, the real culprit is automobile exhaust.
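As a rough illustration of how such a correlation study might look, the sketch below compares a traffic tracer and an industrial tracer against proximity to candidate sources. The pollutants chosen, the distances, and the readings are all hypothetical, not actual Long Beach data.

```python
# A minimal sketch; every figure and column name below is invented
# for illustration only.
import pandas as pd

readings = pd.DataFrame({
    "no2_ppb": [41, 55, 38, 62, 47, 59],          # NO2, a traffic tracer
    "so2_ppb": [4, 6, 4, 5, 5, 6],                # SO2, an industrial tracer
    "dist_to_freeway_km": [0.3, 0.1, 0.8, 0.1, 0.4, 0.2],
    "dist_to_port_km": [5.1, 6.0, 4.2, 5.8, 4.9, 6.3],
})

# If NO2 falls off sharply with distance from the freeway while SO2
# shows little relationship to the port, traffic is a more plausible
# driver of local pollution than industry.
corr = readings.corr(method="spearman")
print(corr.loc[["no2_ppb", "so2_ppb"],
               ["dist_to_freeway_km", "dist_to_port_km"]])
```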

Because humans are far more adept at spotting trends and interrelationships when data is presented visually, big data tools would also need to present air quality data in visual form. Doing so would make it easier to combine that data with data from other sources. For example, readings from a sensor network could be overlaid on digital maps, providing an effective way to determine the source of pollution and which adjacent areas are most likely to be affected. The data can also be combined with feeds from weather agencies so the likely movement of pollutants can be determined; sudden wind shifts or changes in atmospheric conditions can have a significant impact on pollution levels across a given area.
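One possible way to build such an overlay is sketched below, using the open-source folium mapping library as one option among many; the sensor locations and readings are placeholders, not real measurements.

```python
# A minimal sketch: plot sensor readings on an interactive map,
# sizing and coloring markers by the severity of each reading.
import folium

sensors = [
    # (latitude, longitude, PM2.5 in ug/m3) -- illustrative values
    (33.770, -118.194, 42.0),
    (33.782, -118.167, 18.5),
    (33.755, -118.215, 61.2),
]

m = folium.Map(location=[33.77, -118.19], zoom_start=12)
for lat, lon, pm25 in sensors:
    folium.CircleMarker(
        location=[lat, lon],
        radius=6 + pm25 / 10,                   # larger marker = worse air
        color="red" if pm25 > 35 else "green",  # 35 ug/m3: US 24-hour PM2.5 standard
        fill=True,
        tooltip=f"PM2.5: {pm25} ug/m3",
    ).add_to(m)

# Writes an interactive HTML map to which weather or traffic layers
# could later be added.
m.save("air_quality_map.html")
```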

Armed with effective visualization and analysis tools, monitoring agency staff will be in a much better position to manage air quality issues and respond to incidents. Forecast models can show where problems are most likely to occur so that proactive steps can be taken to minimize the effects on residents. If a public complaint is received about an odor, for example, the tools can generate a back-track indicating the likely source of the problem, allowing the responsible agency to respond quickly with an answer based on real data rather than anecdotal evidence.
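A back-track of this kind could, under strong simplifying assumptions, be approximated by stepping a plume backward through recent wind observations, as in the sketch below. The complaint location and wind values are hypothetical, and operational tools rely on proper atmospheric dispersion models rather than straight-line advection.

```python
# A minimal sketch of a back-track: straight-line advection stepped
# backward through hourly wind readings (a deliberate simplification).
import math

def back_track(lat, lon, winds):
    """winds: list of (speed_m_s, direction_deg) per hour, with
    direction in meteorological convention (where the wind blows FROM)."""
    path = [(lat, lon)]  # start at the complaint location
    for speed, direction in winds:
        dist_km = speed * 3600 / 1000  # distance travelled in one hour
        rad = math.radians(direction)
        # Step back toward where the wind came from.
        lat += dist_km * math.cos(rad) / 111.0  # ~111 km per degree latitude
        lon += dist_km * math.sin(rad) / (111.0 * math.cos(math.radians(lat)))
        path.append((lat, lon))
    return path  # candidate upwind source locations, one per hour back

# Illustrative complaint location and three hours of wind observations.
print(back_track(33.77, -118.19, [(3.0, 225), (2.5, 230), (4.0, 220)]))
```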

Just as money spent on proactive health initiatives such as exercise and healthy eating campaigns can lead to longer-term savings for the medical system, investing in better monitoring and analysis tools will have a payoff when it comes to the natural environment. In short, with an investment now in visualization and big data analysis tools that are capable of effectively processing the growing avalanche of data being generated by sensor networks, environmental agencies will be in a much better position to deal with the air quality challenges of the future.


Andres Quijano is Systems Operations Manager–Americas for Envirosuite Limited, a global provider of environmental management technology through its leading Software-as-a-Service platform. The Envirosuite platform provides a range of environmental monitoring, management, and investigative capabilities that are incorporated into a diverse array of operations, from waste water treatment to large-scale construction, open-cut mines, port operations, environmental regulators, and heavy industry. Envirosuite helps industry and government meet the growing demands of communities the world over for better environmental quality. For additional information, email inquiries@envirosuite.com or visit https://envirosuite.com/.