Machine learning techniques have also been combined with regression models, as demonstrated by Saetta et al., who developed a sensor platform to predict chlorine residuals in university buildings. DO, pH, EC, ORP, and free chlorine data were collected, and two models were developed: a linear regression model and a gradient boosting machine (GBM) model. The linear regression models, created in R statistical software with the “ggpairs” and “leaps” functions, were unable to predict free chlorine levels. The GBM models, however, had lower RMSE values than their multivariate linear regression counterparts, and t-tests showed no significant difference between the predicted and actual data, indicating that the sensors used in this study could be used to predict chlorine with the GBM model. Despite the difficulty of predicting water quality using linear regression, several studies used multivariate and non-linear regression techniques to estimate heavy metal content, nutrients, and bacteria concentrations, as well as WQIs, in various water sources, with R² values as high as 0.90.
Factors including EC, pH, and turbidity were among the most important in influencing the overall predictions. The relationships explored in the work reviewed serve as a guiding framework for the proposed prototype and give insight into expected sensor behaviors.
Various organizations, including the World Health Organization, the European Union, and the United States Environmental Protection Agency, set drinking water quality standards to ensure that drinking water is clean and safe for consumers. For the proposed prototype system, the parameters to be monitored by the sensor package were chosen based on their relationships to chemical and biological contamination, with particular emphasis on selecting the minimum number of key sensor measurements with the greatest capability to predict water quality.
Lab-grade sensors that use various electrochemical mechanisms were selected from the manufacturer Atlas Scientific. Primary factors considered when selecting off-the-shelf sensors for the prototype included sensitivity, selectivity, stability, lifetime, response time, pressure tolerance, cost, and size. In addition, robustness, ease of calibration, compact size, and the commercial availability of plumbing components contributed to the overall selection of the sensor probes included in the prototype.
The complete sensor package includes sensor probes for pH, EC, DO, ORP, and temperature. Table 1 describes the specifications for each of the selected sensors. As evidenced by some of the related work on wireless sensors, these measurements can be used as indicators for a wide range of water quality parameters. The sensors all operate on different electrochemical principles, and their raw voltage signals are converted into digital values. pH is a measure of how acidic or basic water is and is important in water quality because it determines the solubility and biological availability of chemical constituents, nutrients, and heavy metals. The pH probe measures the hydrogen ion activity in liquids and has a glass membrane: hydrogen ions in the liquid diffuse into the outer layer of the glass while larger ions remain in solution. The difference in hydrogen ion concentration between the outside and the inside of the probe creates a measurable current that is proportional to the concentration of hydrogen ions in the liquid. The probe includes an internal double junction, an EXR glass tip, and a body of extruded epoxy, making it suitable for pH measurements of high-purity waters and capable of resisting the strong acids and bases found in contaminated waters. Electrical conductivity was another parameter to be measured. Conductivity is the measure of a water’s ability to pass an electrical current.
Substances that conduct electrical current include dissolved salts and other inorganic chemicals. Most waters have a relatively constant range of conductivity; therefore, significant changes can be good indicators of the pollution of an aquatic resource. Waters with elevated conductivity may have other impaired or altered indicators as well. Inside the conductivity probe, two electrodes are positioned opposite each other. An AC voltage is applied to the electrodes, which causes the cations to move to the negatively charged electrode while the anions move to the positively charged electrode. The next sensor parameter selected was dissolved oxygen. The Atlas Scientific DO probe consists of a PTFE membrane, an anode bathed in an electrolyte, and a cathode. The operating principle is based on oxygen molecules diffusing through the membrane of the probe at a constant rate. After crossing the membrane, the oxygen molecules reach the cathode, where they are reduced, and a small voltage is produced, which is read by an analog-to-digital converter. Dissolved oxygen is important in drinking water because high DO levels can damage components and systems that are used in drinking water treatment and distribution. Namely, high DO levels can contribute to corrosion in pipes, while levels that are too low can create issues with the taste of water. ORP is a measure of the oxidation-reduction potential, where oxidation is the loss of electrons and reduction is the gain of electrons. An ORP probe measures electron activity in a liquid and shows the strength with which electrons are transferred to or from a substance in the liquid. The ORP probe selected contains a platinum tip and a 4 molar ??? reference solution.
ORP was selected because it, combined with pH and temperature, can be used to estimate the free chlorine concentration, which is an indicator of the presence or absence of disease-causing bacteria and viruses; these pathogens are typically the cause of most acute symptoms of waterborne disease or illness. ORP was also selected because it can be used in combination with pH and metal concentrations to plot the equilibrium potential of electrochemical reactions. This is useful in predicting the corrosion risk and speciation of various chemical constituents in aqueous solutions and is incorporated into the back-end software package.
For the hardware, the central measurement system consists of a microprocessor, the five sensor probes, and two expansion boards outfitted with EZO embedded circuits designed for each sensor probe by the manufacturer Atlas Scientific. The microprocessor used in this system is the Raspberry Pi 4 Model B. It acts as the gateway that collects the information from the sensors and transfers the collected data to the Git repository via a wireless network. For point-of-use applications, the device chosen as the microprocessor should have high storage capacity, flexible connectivity, low cost, and ample computing power to run any necessary programs. The Raspberry Pi meets these requirements with its 1.5 GHz 64-bit quad-core ARM Cortex-A72 processor, onboard 802.11ac Wi-Fi, Bluetooth 5, full gigabit Ethernet, and 2-8 GB of RAM. The Raspberry Pi 4 Model B also has two USB 2.0 ports, two USB 3.0 ports, and a standard 40-pin GPIO header. The GPIO pins connect to the WhiteBox Labs carrier boards. The carrier boards are stackable and contain 6 slots for the Atlas Scientific EZO circuits. These features eliminate the need for multiplexing, wiring, and breadboards. Up to six sensors can be connected at once for data collection with these two carrier boards.
The carrier boards connect directly to the Raspberry Pi pins, which allows for easy establishment of serial communication using the I2C protocol between the microprocessor and the circuits. The system also contains auxiliary hardware components, including a keyboard, a mouse, and a 7 in. LCD screen, which allow the operator to run the data acquisition Python scripts. Figure 1 shows the various individual components of the system. Figure 2 shows the completely assembled system, including the auxiliary hardware.
The code for the prototype was written in Python using integrated development environment software.
Source code from Atlas Scientific was the starting point for the data collection, and modifications were made for customized formatting and real-time data transmission. Once the data acquisition code was initialized by an operator, DO, pH, EC, temperature, and ORP data were received from the sensors at 1 s intervals. The data were temperature compensated, parsed into integers, assigned units, positioned into arrays, and displayed in the terminal window during data collection. To save the data, a new file was created, and the collected data were written and saved locally onto the micro-SD card. The software program also automatically pushed the data collected from the microprocessor to a cloud-based GitHub repository via a wireless network.
After following the listed steps, the collected data will be accessible to any collaborators with access to the GitHub account. The repository shows each data collection run as a separate .csv file that can be easily exported to other data processing software such as Excel, R, or SPSS. Besides the real-time data collection, an additional program was developed for back-end data processing using the ORP and pH sensor data. The back-end algorithms consist of Pourbaix diagrams developed for copper, zinc, iron, and lead at standard temperature and pressure. These were created using the Nernst equations for redox reactions and acid-base reactions transcribed in Marcel Pourbaix’s Atlas of Electrochemical Equilibria in Aqueous Solutions. Pourbaix diagrams are electrochemical graphs that show the possible thermodynamically stable phases for aqueous systems. Using Pourbaix diagrams, one can predict the equilibrium states of all the possible reactions between an element, its ions, and its solid and gaseous compounds in the presence of water. To construct a Pourbaix diagram, a standard chemical potential and ion activities must be assumed for the reacting substances.
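As an illustration of how a single Pourbaix line can be computed from the Nernst equation, the following minimal Python sketch may be helpful. It is not the prototype's back-end code: the function names are hypothetical, the standard potentials (1.229 V for O2/H2O and 0.34 V for Cu2+/Cu) are commonly tabulated textbook values, and the copper ion activity of 1e-6 is an assumed illustrative value. All potentials are expressed against the standard hydrogen electrode, so a measured ORP value would first need an offset for the probe's reference electrode, which is not given here.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_slope(temp_c=25.0):
    """Return RT*ln(10)/F in volts (about 0.0592 V at 25 C)."""
    return R * (temp_c + 273.15) * math.log(10) / F


def line_potential(e0, n_electrons, m_protons, ph,
                   log_activity_ratio=0.0, temp_c=25.0):
    """Equilibrium potential (V vs. SHE) of Ox + m H+ + n e- -> Red.

    E = E0 - (S*m/n)*pH + (S/n)*log10(a_ox/a_red), with S = RT*ln(10)/F.
    """
    s = nernst_slope(temp_c)
    return (e0
            - s * m_protons / n_electrons * ph
            + s / n_electrons * log_activity_ratio)


def water_stability_limits(ph, temp_c=25.0):
    """Lower (H+/H2) and upper (O2/H2O) limits of water stability."""
    lower = line_potential(0.0, 2, 2, ph, temp_c=temp_c)
    upper = line_potential(1.229, 4, 4, ph, temp_c=temp_c)
    return lower, upper


def in_water_stability_window(ph, e_she, temp_c=25.0):
    """True if the point (pH, E) lies between the two water lines."""
    lower, upper = water_stability_limits(ph, temp_c)
    return lower <= e_she <= upper


# Cu2+ + 2e- -> Cu: no protons involved, so the line is horizontal.
# E0 = 0.34 V and a(Cu2+) = 1e-6 are assumed illustrative values.
e_cu = line_potential(0.34, n_electrons=2, m_protons=0, ph=7.0,
                      log_activity_ratio=-6.0)
print(f"Cu2+/Cu line: {e_cu:.3f} V vs. SHE")
print("Water limits at pH 7:", water_stability_limits(7.0))
```

Each line of a diagram is one such expression evaluated over the pH axis; horizontal lines correspond to reactions with no protons, and sloped lines to reactions that involve both electrons and protons.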
These diagrams can be read similarly to a standard phase diagram, with electrical potential and pH as the axes. The lines of a Pourbaix diagram are developed using the Nernst equation and show the equilibrium conditions between the species on either side of each line, where each species is said to predominate on its own side.
Reliable sensor measurements are essential for smooth operation of the proposed prototype. If a sensor fails to provide accurate measurements, a false picture of the water quality is perpetuated, creating potential health risks or leading to sub-optimal operation. Thus, the first task in testing the system was to ensure that the sensors returned accurate and reasonable results for tap water samples. The procedure for preparing the prototype for sample collection so it could be tested in drinking water is detailed in Appendix C. The sensor validation approach consists of using multiple sensors for the same parameter measurement. This technique can determine whether a sensor is faulty or whether high deviations between multiple sensors exist. It was applied by first measuring the pH, EC, DO, temperature, and ORP of a given sample with the Atlas Scientific sensor probes. The pH, EC, temperature, and ORP of the same sample were then measured using a laboratory-grade instrument to compare the results between the two devices. The Myron Ultrameter was selected as the comparative instrument due to its streamlined and accurate functionality. A total of 34 tap water samples were collected from a laboratory tap water faucet at the University of California, Los Angeles between the hours of 10 AM and 4 PM from August 2021 through October 2021. The samples were collected in 1000 mL volumes in a graduated cylinder. Approximately 100 mL of each sample was immediately tested using the Myron Ultrameter for pH, ORP, EC, and temperature.
This process consisted of 1) cleaning the cell cup thoroughly with MilliQ water, 2) turning on the Myron Ultrameter, 3) rinsing the sample cell 3 times with the sample to be tested, 4) refilling the sample cell with additional sample, 5) pressing the desired measurement key, and 6) recording the values. The remaining ~900 mL of the sample was transferred to a 1000 mL beaker and placed on a hot plate with a magnetic stirrer. The sample was stirred at approximately 750 rpm to mimic the flow of water through an inline pipe system and provide sufficient flow for the DO sensor. The clean and dry sensor probes were then inserted into the beaker, ensuring that the sensors did not touch the stir bar and had minimal contact with the walls of the beaker. For each sensor run, the hot plate was maintained at 25℃; however, some small temperature variations were observed during testing due to factors such as varying ambient room temperatures and temperature changes in the building pipes throughout the day. Data collection was then initiated using the steps described in the previous software section. The sensors collected one data point every second continuously for 10 minutes to allow sufficient sensor stabilization time.
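The comparison between a stabilized sensor run and the reference instrument could be sketched as follows. This is a minimal illustration, not the analysis used in the study: the function names, the 60-reading averaging window, and the 2% agreement tolerance are hypothetical choices, and the pH trace and reference reading are synthetic values generated only to exercise the code.

```python
import math
from statistics import mean, stdev


def stable_mean(readings, window=60):
    """Mean and sample standard deviation of the last `window` readings."""
    tail = readings[-window:]
    return mean(tail), stdev(tail)


def percent_difference(sensor_value, reference_value):
    """Signed difference of the sensor from the reference, in percent."""
    return 100.0 * (sensor_value - reference_value) / reference_value


def agrees(sensor_value, reference_value, tolerance_pct):
    """True if the sensor is within `tolerance_pct` of the reference."""
    return abs(percent_difference(sensor_value, reference_value)) <= tolerance_pct


# Synthetic 10-minute pH run at 1 reading per second (600 points) that
# settles exponentially toward 7.5. These are NOT measured data.
run = [7.5 + 0.4 * math.exp(-t / 60.0) for t in range(600)]

avg, spread = stable_mean(run, window=60)   # hypothetical window length
reference_ph = 7.45                         # hypothetical reference reading

print(f"stabilized pH: {avg:.3f} (std {spread:.5f})")
print(f"difference from reference: {percent_difference(avg, reference_ph):.2f}%")
print("within tolerance:", agrees(avg, reference_ph, tolerance_pct=2.0))
```

Averaging only the final portion of the run reflects the reason for the 10-minute collection period: early readings taken before the probes stabilize are excluded from the comparison.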