A discussion on the limitations of image analysis for determining bubble size in industrial flotation when using algorithms successfully tested from idealized images

This paper evaluates the capacity of an automated algorithm to detect bubbles and estimate bubble size (Sauter mean diameter, D32) from images recorded in industrial flotation machines. The algorithm was previously calibrated on laboratory images. The D32 results are compared with semi-automated estimations, which are used as "ground truth". Although the automated algorithm reliably estimates bubble size at laboratory scale, a significant bias is observed in industrial images for D32 > 3.0-4.0 mm. This uncertainty is caused by the presence of small and large bubbles in the same population, with large bubbles forming complex clusters and being observed incomplete, limited by the region of interest. Flotation columns are more prone to this condition, which hinders the estimation of Sauter diameters. The results show the need for bubble size databases that include industrial images. As several image processing tools are currently available, software calibration from ideal bubble images (synthetic or from laboratory rigs) will mostly lead to biased D32 estimations in industrial flotation machines.


Introduction
Froth flotation remains one of the most versatile separation techniques, essential for processing complex and low-grade ores (Wills and Finch, 2016). Gas dispersion is a critical parameter in flotation as it significantly impacts particle-bubble interactions and froth stability, directly influencing the flotation rate constant and overall performance (Gorain, 1998; Wills and Finch, 2016; Barbian et al., 2006). One of the variables defining gas dispersion is bubble size, which is typically characterized by the Sauter diameter, $D_{32} = \sum_i d_{B,i}^{3} / \sum_i d_{B,i}^{2}$, with $d_{B,i}$ representing the diameter of the i-th bubble of a sample. Smaller bubbles increase the overall bubble surface area, which improves the collection efficiency (Tao, 2005). Optimal control of bubble size requires adequate gas injection systems and operating conditions, such as frother dosage, superficial gas rate, and impeller speed (Nesset, 2011).
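The Sauter diameter defined above can be illustrated with a minimal Python sketch. The function names and the sample values are assumptions made for illustration only and are not data from this study. The example contrasts D32 with the arithmetic mean diameter to show how strongly a single large bubble weighs on D32.

```python
from typing import Iterable, List


def sauter_mean_diameter(diameters_mm: Iterable[float]) -> float:
    """Return D32 = sum(d^3) / sum(d^2) for a sample of bubble diameters."""
    d: List[float] = list(diameters_mm)
    if not d:
        raise ValueError("at least one bubble diameter is required")
    if any(x <= 0 for x in d):
        raise ValueError("bubble diameters must be positive")
    return sum(x ** 3 for x in d) / sum(x ** 2 for x in d)


def arithmetic_mean_diameter(diameters_mm: Iterable[float]) -> float:
    """Return the number-based mean diameter of the sample."""
    d: List[float] = list(diameters_mm)
    if not d:
        raise ValueError("at least one bubble diameter is required")
    return sum(d) / len(d)


# Hypothetical sample: 99 bubbles of 1 mm plus a single 6 mm bubble.
sample = [1.0] * 99 + [6.0]
d10 = arithmetic_mean_diameter(sample)
d32 = sauter_mean_diameter(sample)
print(f"arithmetic mean = {d10:.2f} mm, D32 = {d32:.2f} mm")
```

Because D32 weights each bubble by its cube over its square, one large bubble in a hundred more than doubles D32 in this illustrative sample, while the arithmetic mean barely changes.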
Several methods to estimate bubble size have been proposed in the flotation literature. Conductivity-based technologies have shown suitable results in two-phase systems; however, the measurement quality significantly degrades in industrial systems (Chen et al., 2001). The UCT (University of Cape Town) bubble size analyser, which uses a capillary sampler, has been reported to achieve a suitable overall performance (Randall et al., 1989). Nevertheless, biased results have been reported because large bubbles can break inside the capillary tube (Grau and Heiskanen, 2002). One of the most widely used technologies to measure bubble size in flotation consists of bubble viewers for sampling combined with image processing tools (Rodrigues and Rubio, 2003; Hernandez-Aguilar et al., 2002; Malysa et al., 1999). Bubble viewers allow bubbles to be sampled from the collection zone and photographed in a 2D plane to reduce errors related to variable depth of field. The images are processed to detect bubbles and estimate bubble size distributions (BSD) and D32 values (Bailey et al., 2005; Mesa et al., 2022). The bubble viewers are typically compatible with plant conditions, which has expanded their use in hydrodynamic assessment at large scale. However, the image processing tools have mostly been designed and calibrated from laboratory images, and their performance on industrial data has been a concern among flotation practitioners due to uncertainties caused by the misidentification of irregular-shaped bubbles and complex clusters (Karn et al., 2015; Ma et al., 2014; Riquelme et al., 2013).
In recent years, machine learning and artificial intelligence (AI) have been considered attractive techniques for automated and accurate measurements of bubble size (Haas et al., 2020; Hessenkemper et al., 2022). However, these technologies present inherent limitations. For example, they require extensive training datasets, which must incorporate as many operating conditions as possible to effectively segment complex clusters and accurately identify non-convex bubbles. Synthetic images have also been used to train machine learning models (Chen et al., 2023; Poletaev et al., 2016); however, these images have been highly idealized and do not represent the industrial conditions typically observed in froth flotation.
Bubble size measurements at industrial scale face significant challenges due to the presence of solids, variable lighting, access limitations for bubble sampling, and uncertainties in the operating conditions. In addition, industrial bubble size distributions are more prone to be heterogeneous than those observed under controlled conditions. This paper illustrates the limitations in D32 estimations at industrial scale when the characterization relies on automated image-processing algorithms whose parameters (thresholds) are defined from a successful performance on ideal images. The need for industrial databases that incorporate representative bubble images and reliable bubble size estimations is briefly discussed.

Experimental procedure at laboratory scale
Bubble size was measured in the laboratory-scale flotation cell shown in Fig. 1, which emulated a slice of a forced-air industrial machine with a 140 × 140 cm cross-section and a width of 15 cm. The air was fed from 24 porous spargers and controlled by a needle valve. The McGill bubble size analyser (MBSA) described by Gomez and Finch (2002) was used for bubble sampling and image recording. The MBSA was initially filled with conditioned water, with the same frother concentration as in the flotation cell. The bubbles were photographed in a 2D plane with a Teledyne Dalsa video camera at a resolution of 0.056 mm/pixel.

Experimental procedure at large scale
A similar procedure was employed at industrial scale. The bubble size measurements were conducted in mechanical flotation cells (self-aerated and forced air) and flotation columns from different concentrators. One hundred and sixty-eight datasets were analysed, with sixty-seven of them corresponding to measurements in flotation columns. The sampling tube of the MBSA was immersed about 15-30 cm below the pulp-froth interface to capture bubbles entering the froth. The chamber was fully filled with process water and a digital video camera (Canon GL2) was used for image acquisition, at a sampling rate of 30 frames per second. The MBSA was completely sealed to avoid possible leaks in the industrial measurements. For further details on the experimental procedure at industrial scale, see Vinnett et al. (2012).

Image processing
For each experimental condition, a subset of the recorded images was randomly chosen, analysing more than 1500 bubbles per test. However, a minimum of 10 images was processed, a limit specially defined for conditions with high gas hold-ups. In laboratory tests operated without frother, all images were analysed. The images were first automatically processed to identify bubbles and estimate their size. The image-processing tool was developed in Matlab (The MathWorks, USA). The algorithm described by Vinnett et al. (2020) was improved and employed in the automatic detection. The new hierarchical approach consisted of the following steps:
(i) Image binarization.
(ii) Bubble detection based on solidity and estimation of bubble axes considering the ellipse that has the same normalized second central moments as those generated by the object.
(iii) Removal of the bubbles detected in the previous step from the binary images.
(iv) Watershed segmentation and object detection based on ellipse fitting, using the approach reported by Fitzgibbon et al. (1996). Step (iii) was repeated, and subsequently, steps (ii) and (iii) were repeated.
(v) Circle detection by Hough Transform for spherical bubbles not detected previously. Step (iii) was repeated, and subsequently, steps (ii) and (iii) were again repeated.
The size of each identified bubble was estimated as an equivalent ellipsoid diameter. Table 2 summarizes the thresholds for the bubble identification and size estimation. These values were obtained from an overall analysis of laboratory images. The detection and estimation thresholds were updated with respect to the solution presented by Vinnett et al. (2020), taking the new hierarchical algorithm into consideration. Although the thresholds were not optimized for each experimental condition, they allowed for an adequate trade-off between successful bubble detection and the number of misidentified bubbles. Similarly, the thresholds for large-scale measurements approached those obtained from laboratory measurements.
Table 2 entry: threshold to define edge pixels of circles, 0.66.
The automated image analysis was complemented by manual processing. Bubbles whose axes were erroneously estimated were first removed from the results and manually processed. Similarly, non-identified bubbles, such as bubbles in clusters and irregular bubbles, were manually estimated. Fig. 2 illustrates an example of an image acquired at laboratory scale that was processed by the semi-automated procedure. Bubbles that are highlighted by their border (or best ellipse) were automatically detected, whereas bubbles highlighted by crosses were manually identified.
Fig. 2. Example of an image that was processed by the semi-automated procedure
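The solidity-based screening of step (ii) and the size estimation can be sketched in Python as follows. This is a minimal illustration, not the Matlab tool used in this study. Objects are represented as polygon contours rather than binary image regions. The threshold SOLIDITY_MIN is a hypothetical value, not one of the thresholds of Table 2. The equivalent diameter is assumed here to be the volume-equivalent diameter of an oblate spheroid, (a²b)^(1/3), with a and b the major and minor axes.

```python
from typing import List, Tuple

Point = Tuple[float, float]

SOLIDITY_MIN = 0.9  # hypothetical threshold, not taken from Table 2


def polygon_area(pts: List[Point]) -> float:
    """Absolute area of a simple polygon (shoelace formula)."""
    total = 0.0
    for i in range(len(pts)):
        x1, y1 = pts[i]
        x2, y2 = pts[(i + 1) % len(pts)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def convex_hull(pts: List[Point]) -> List[Point]:
    """Convex hull by Andrew's monotone chain, counter-clockwise order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def solidity(contour: List[Point]) -> float:
    """Ratio of the object area to the area of its convex hull."""
    hull_area = polygon_area(convex_hull(contour))
    if hull_area == 0.0:
        raise ValueError("degenerate contour with zero hull area")
    return polygon_area(contour) / hull_area


def is_single_bubble(contour: List[Point]) -> bool:
    """Accept convex-like objects; reject likely clusters or irregular shapes."""
    return solidity(contour) >= SOLIDITY_MIN


def equivalent_diameter(major_axis: float, minor_axis: float) -> float:
    """Volume-equivalent diameter of an oblate spheroid (assumed definition)."""
    return (major_axis ** 2 * minor_axis) ** (1.0 / 3.0)


# A convex object versus a non-convex (L-shaped) object, e.g. two touching bubbles.
square = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
l_shape = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0), (2.0, 4.0), (0.0, 4.0)]
print(is_single_bubble(square), is_single_bubble(l_shape))
```

In a hierarchical scheme such as the one described above, objects rejected by this kind of test would be passed to the later segmentation steps instead of being sized directly.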

Results and discussion
Bubble size estimations obtained from the automated algorithm were compared to the results obtained from the semi-automated procedure. The latter was considered as "ground truth". The comparisons were conducted in terms of the Sauter mean diameter of the bubble size samples, as this parameter has proven to be correlated with the flotation rate constant (Gorain et al., 1997). Fig. 3 shows the results from the laboratory-scale data. The automated algorithm was effective in determining D32, even though a percentage of bubbles were not detected and some objects (bubbles or clusters) were misidentified. The coefficient of correlation between the automated and semi-automated estimates for the D32 values was 0.9933. As the detection thresholds in this and most image processing tools are defined from ideal data, a high correlation between the D32 estimations was expected at laboratory scale.
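The comparison between automated and semi-automated D32 values can be sketched as follows. The function names and the D32 pairs are hypothetical, chosen only to illustrate the calculation, and are not results from this study. Along with the correlation coefficient, the sketch reports the mean difference between the two estimates, which is one simple way to quantify a bias.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two paired samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired samples with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def mean_bias(automated: Sequence[float], reference: Sequence[float]) -> float:
    """Mean of (automated - reference); negative values indicate underestimation."""
    if len(automated) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty paired samples")
    return sum(a - r for a, r in zip(automated, reference)) / len(reference)


# Hypothetical D32 pairs in mm (reference = semi-automated procedure).
d32_reference = [1.0, 2.0, 3.0, 4.0, 5.0]
d32_automated = [1.0, 2.0, 3.0, 3.5, 4.0]

r = pearson_r(d32_automated, d32_reference)
bias = mean_bias(d32_automated, d32_reference)
print(f"r = {r:.4f}, mean bias = {bias:.2f} mm")
```

In this illustrative sample the automated values fall below the reference only for the larger D32 values, so the correlation stays high while the mean bias is negative.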
The same comparison was conducted on industrial data consisting of bubble images recorded in mechanical flotation cells and flotation columns. No changes were made to the threshold definition presented in Table 2. Industrial data are typically more heterogeneous, so the automated algorithm was tested on non-ideal bubble images. In addition, industrial images were subject to additional sources of error due to lighting limitations, uncertainties in the gas flowrate and hold-up, and the presence of solids.
Fig. 4 shows the industrial comparison for the D32 estimation, which was classified by plant in Fig. 4(a) (Plant A to G) and by type of flotation machine in Fig. 4(b) (mechanical cells and flotation columns).
Physicochem. Probl. Miner. Process., 59(5), 2023, 174474
The industrial results presented higher variability with respect to the laboratory data, which was critical for D32 > 4.0 mm (D32 from the semi-automated algorithm). Although the estimated bubble sizes may have been influenced and biased by the presence of a few large cap-shaped bubbles, results from Fig. 4(b) indicate that the gas dispersion mechanism plays a role in the D32 reliability. For mechanical cells, the automated algorithm showed robustness in the D32 estimations comparable to that obtained at laboratory scale (Fig. 3), except for a higher variability and some abnormal conditions as discussed by Vinnett et al. (2022b). Gas dispersion in flotation columns is sensitive to sparger limitations, inappropriate air pressures, and lack of maintenance or replacement of the air injection systems. As a result, bubble size in flotation columns was measured in the transition from spherical-ellipsoidal regimes to ellipsoidal and churn-turbulent regimes. Under these conditions, the presence of large and small bubbles in the same population led to a D32 underestimation from the automated approach. As illustrated in Fig. 5(a), 5(g) and 5(j), large irregular and ellipsoidal bubbles tend to collide in the visual field, forming complex clusters and hindering the segmentation by classical algorithms. In addition, large bubbles tend to be observed at the borders of the region of interest, which also complicates their identification. As a result, the Sauter mean diameters are underestimated because the relative presence of large bubbles artificially decreases when the images are automatically processed.
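The effect of the region of interest on large bubbles can be illustrated with a simplified geometric sketch. It assumes circular bubbles whose centres are uniformly distributed over a rectangular region of interest, and it assumes that any bubble cut by the border is discarded. The region size and the bubble population are hypothetical and are not data from this study.

```python
from typing import Sequence


def prob_fully_inside(diameter: float, width: float, height: float) -> float:
    """Probability that a circle with a uniformly placed centre lies fully inside the ROI."""
    if diameter >= width or diameter >= height:
        return 0.0
    return (width - diameter) * (height - diameter) / (width * height)


def sauter(diameters: Sequence[float]) -> float:
    """D32 = sum(d^3) / sum(d^2)."""
    return sum(d ** 3 for d in diameters) / sum(d ** 2 for d in diameters)


def expected_observed_sauter(diameters: Sequence[float], width: float, height: float) -> float:
    """Expected D32 when each bubble is kept with its probability of being fully inside the ROI."""
    weights = [prob_fully_inside(d, width, height) for d in diameters]
    denominator = sum(w * d ** 2 for w, d in zip(weights, diameters))
    if denominator == 0.0:
        raise ValueError("no bubble can be observed completely in this ROI")
    return sum(w * d ** 3 for w, d in zip(weights, diameters)) / denominator


# Hypothetical ROI of 20 x 20 mm and a mixed population of small and large bubbles.
roi_w, roi_h = 20.0, 20.0
population = [1.0] * 90 + [8.0] * 10

true_d32 = sauter(population)
observed_d32 = expected_observed_sauter(population, roi_w, roi_h)
print(f"true D32 = {true_d32:.2f} mm, expected observed D32 = {observed_d32:.2f} mm")
```

Under these assumptions, a large bubble is much less likely than a small one to be photographed complete, so discarding incomplete bubbles lowers the estimated D32 even before clustering is considered.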
Fig. 5. A sequence of automatic bubble identification from images recorded in a flotation column
Results from Figs. 3 to 5 show that defining an automated algorithm to determine bubble size from ideal images leads to significant bias when using the same application at large scale. This is particularly critical when the flotation technologies present limitations in gas dispersion, leading to bubbles in the transition between ellipsoidal and turbulent regimes (D32 > 3-4 mm). The use of neural networks to estimate bubble size is an attractive solution to avoid the biased results presented in Fig. 4. However, current machine learning and artificial intelligence approaches to estimate bubble size have employed laboratory-scale or synthetic images as ground truth. This procedure is certainly useful when the proposed tool will only be applied at laboratory scale; nevertheless, the results of Fig. 3 indicate that estimating bubble size under ideal conditions does not seem to require advanced image processing techniques. On the contrary, the main potential of advanced algorithms is in the analysis of industrial images, for which extensive databases are required. Therefore, the development of new algorithms must consider the use of both laboratory and industrial data to generalize applicability. Otherwise, uncertainties in the D32 estimation comparable to those presented in Fig. 4 are expected when characterizing bubble size at large scale.

Conclusions
Bubble size was estimated at laboratory and industrial scales in terms of the Sauter mean diameters. Both databases included more than 150 experimental conditions. A bubble viewer along with a hierarchical algorithm were employed in the bubble size measurements and estimations. The software performance to automatically detect bubbles from industrial images was assessed, considering that the algorithm parameters (thresholds) were defined from laboratory conditions. A semi-automated D32 estimation allowed for non-biased bubble size estimations, which were used as ground truth. The main findings of this study are as follows:
• The automated algorithm was effective in estimating the Sauter diameter of bubble size populations at laboratory scale, with a correlation coefficient of 0.9933 between the automated and semi-automated approaches.
• The automated D32 estimations were subject to much higher variability at industrial scale for D32 > 4.0 mm. This uncertainty was caused by the presence of large and small bubbles in the same population. The former have a higher probability of being observed in complex clusters or incompletely photographed in the region of interest. As a result, these bubbles are mostly removed from the automated analysis, biasing the D32 estimations.
• Biases in the bubble size estimation at large scale proved to be correlated with the gas dispersion mechanism. Most of the biased results were related to bubble size measurements in flotation columns, which have been designed to generate suitable gas dispersion conditions. However, limitations in the injection systems and their maintenance have commonly led to the presence of large and small bubbles in the same population, with the former distorting the D32 estimations.
Most of the current studies on bubble sizing have focused on the use of more robust and sophisticated algorithms to determine bubble size in flotation. Those studies typically conclude that the proposed solution performs well, having been tested on either laboratory or synthetic data. The results presented here show that an industrial database must be built to generalize the performance of new algorithms, considering the potential of machine learning and artificial intelligence techniques to determine bubble size.
Fig. 1. Two-dimensional flotation cell and installation of the McGill bubble size analyser (Vinnett et al., 2022a)

Table 1. Experimental conditions at laboratory scale

Table 2. Thresholds in the automatic bubble detection and size estimation.