Procesamiento de Señales e Imágenes Digitales.

Permanent URI for this collection: http://98.81.228.127/handle/20.500.12404/5040

3D reconstruction of chronic wounds using a hand-held camcorder and its application in cutaneous leishmaniasis wounds
(Pontificia Universidad Católica del Perú, 2017-03-09) Casas Guido, Eda Leslie Mónica; Castañeda Aphan, Benjamín
Chronic wounds are a major healthcare problem worldwide, mainly affecting the geriatric population and patients with limited mobility. In tropical countries, Cutaneous Leishmaniasis (CL) is also a cause of chronic wounds and is endemic in 75% of Peru. In this context, the assessment of this type of wound is a major challenge because of limited access to specialized medical resources. This work aims to develop a video-based method to compute the 3D point cloud of skin wounds, which could provide accurate metrics for medical assessment regardless of the location of the patient. Recently, CL specialists have used metrics such as volume in clinical assessment with promising results. The acquisition protocol is designed to be user friendly and feasible in remote locations; the video is taken using a commercial hand-held video camera without a rig or special illumination. The algorithm follows the Structure from Motion methodology: FAST feature detector, pyramidal optical flow and Jacob’s method for missing points estimation. The results show good performance in terms of accuracy and repeatability of the point cloud computation (less than 0.6 mm and 0.21 mm, respectively). However, experiments suggest that the volume computation technique does not adapt well to the output of the proposed method and requires a deeper analysis. The method has been implemented entirely with open source libraries.
3D updating of solid models based on local geometrical meshes applied to the reconstruction of ancient monumental structures
(Pontificia Universidad Católica del Perú, 2014-10-14) Zvietcovich Zegarra, José Fernando; Castañeda Aphan, Benjamín; Perucchio, Renato
We introduce a novel methodology for locally updating an existing 3D solid model of a complex monumental structure with the geometric information provided by a 3D mesh (point cloud) extracted from the digital survey of a specific sector of a monument. Solid models are fundamental for engineering analysis and conservation of monumental structures of the cultural heritage. Finite elements analysis (FEA), the most versatile and commonly used tool for the numerical simulation of the static and dynamic response of large structures, requires 3D solids which accurately represent the outside as well as the inside geometry and topology of the domain to be analyzed. However, the structural changes introduced during the lifetime of the monument and the damage caused by anthropogenic and natural factors contribute to producing complex geometrical configurations that may not be generated with the desired accuracy in standard CAD solid modeling software. On the other hand, the development of digital techniques for surveying historical buildings and cultural monuments, such as laser scanning and photogrammetric reconstruction, has made possible the creation of accurate 3D mesh models describing the geometry of those structures for multiple applications in heritage documentation, preservation, and archaeological interpretations. The proposed methodology consists of a series of procedures which utilize image processing, computer vision, and computational geometry algorithms operating on entities defined in the Solid Modeling space and the Mesh space. The operand solid model is defined as the existing solid model to be updated. The 3D mesh model containing new surface information is first aligned to the operand solid model via 3D registration and, subsequently, segmented and converted to a provisional solid model incorporating the features to be added or subtracted. 
Finally, provisional and operand models are combined and data is transferred through regularized Boolean operations performed in a standard CAD environment. We test the procedure on the Main Platform of the Huaca de la Luna, Trujillo, Peru, one of the most important massive earthen structures of the Moche civilization. Solid models are defined in AutoCAD while 3D meshes are recorded with a Faro Focus laser scanner. The results indicate that the proposed methodology is effective at transferring complex geometrical and topological features from the mesh to the solid modeling space. The methodology preserves, as much as possible, the initial accuracy of meshes on the geometry of the resultant solid model which would be highly difficult and time consuming using manual approaches.
A study of new methods and techniques for ultrasonic attenuation estimation
(Pontificia Universidad Católica del Perú, 2017-03-09) Zenteno Valdiviezo, Omar Jonathan; Lavarello Montero, Roberto Janniel
Pathological states of biological tissue are often related to changes in its attenuation. Thus, information about the attenuating properties of tissue is valuable for the physician and could be useful in ultrasonic diagnosis. However, accurate characterization of tissue pathologies using ultrasonic attenuation depends strongly on the accuracy of the algorithm used to obtain the attenuation coefficient estimates. In the present document, we derive a new attenuation estimation method which uses the analytical backscatter coefficient (BSC) diffraction compensation function for single-element transducers proposed by Chen et al. and compare it to a reference phantom attenuation estimation method. The accuracy of the two methods was evaluated. The results showed that both methods can estimate an accurate mean attenuation coefficient with a low mean percentage error (MPE < 6%). However, the coefficient of variation of the estimates remains higher than the desired values (CV > 62%). Moreover, to overcome the limitation on ROI size imposed by the high variability of the estimator, the use of full angular spatial compounding was extended to the estimation of attenuation coefficients and its performance was experimentally evaluated using two physical phantoms. The results suggest that the variance and field of view of attenuation imaging can be significantly improved without sacrificing estimation accuracy. Based on these observations, the analytic diffraction compensation method was applied in an animal model to estimate the mean attenuation value of thyroid lobes. To reduce the variability of the estimates, a spatial compounding approach using three neighboring layers was applied. The results suggest that the mean attenuation value can potentially discriminate a particular pathology in the thyroid from malignant and normal tissues.
The final conclusions highlight the potential of parametric imaging of tissue attenuation by the analytic diffraction compensation method, in conjunction with spatial compounding, as a useful tool for medical detection and diagnosis.
Automatic regularization parameter selection for the total variation mixed noise image restoration framework
(Pontificia Universidad Católica del Perú, 2013-03-27) Rojas Gómez, Renán Alfredo; Rodríguez Valderrama, Paúl Antonio
Image restoration consists of recovering a high quality image estimate based only on observations. This is considered an ill-posed inverse problem, which implies non-unique, unstable solutions. Regularization methods allow the introduction of constraints in such problems and ensure a stable and unique solution. One of these methods is Total Variation, which has been broadly applied in signal processing tasks such as image denoising, image deconvolution, and image inpainting for multiple noise scenarios. Total Variation features a regularization parameter that controls how strongly the solution is regularized, and its value is crucial to the quality of the result. Therefore, an optimal selection of the regularization parameter is required. Furthermore, while classic Total Variation applies its constraint to the entire image, there are multiple scenarios in which this approach is not the most adequate. Such cases benefit from assigning different regularization levels to different image elements. In this work, an optimal regularization parameter selection framework for Total Variation image restoration is proposed. It covers two noise scenarios: impulse noise and impulse over additive Gaussian noise. A broad study of the state of the art, which covers noise estimation algorithms, risk estimation methods, and Total Variation numerical solutions, is included. To approach the optimal parameter estimation problem, several adaptations are proposed to create a locally adaptive regularization which requires no a priori information about the noise level. Quality and performance results, which include the work covered in two recently published articles, show the effectiveness of the proposed regularization parameter selection and a great improvement over the global regularization framework, attaining a high quality reconstruction comparable with state-of-the-art algorithms.
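As a point of orientation, one commonly used form of the Total Variation restoration problem is sketched below. The notation is an assumption made for illustration and is not taken from the thesis: $b$ is the observed image, $A$ the forward (blur or identity) operator, $u$ the estimate, $D_x$ and $D_y$ discrete derivative operators, and $\lambda$ the regularization parameter. The choice $p = 1$ is often associated with impulse noise and $p = 2$ with Gaussian noise.

```latex
\min_{u} \; \frac{1}{p}\,\bigl\| A u - b \bigr\|_p^p
\;+\; \lambda \,\Bigl\| \sqrt{(D_x u)^2 + (D_y u)^2} \Bigr\|_1
```

Under the same assumptions, a spatially adaptive variant would replace the scalar $\lambda$ with a per-pixel weight map multiplied elementwise into the gradient magnitude before the norm is taken.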
Backscatter coefficient estimation using highly focused ultrasound transducers
(Pontificia Universidad Católica del Perú, 2014-05-26) Panizo Ríos, Diego; Lavarello Montero, Roberto Janniel
The backscatter coefficient (BSC) is an intrinsic property that quantifies the amount of energy reflected by a material as a function of the ultrasound wave frequency. BSCs have been proposed for decades for tissue characterization, along with quantitative ultrasound (QUS) parameters derived from BSCs that have been used to construct images that represent how these properties vary spatially. The availability of formulations based on weakly focusing conditions has resulted in widespread use of large focal number transducers for BSC estimation. The use of highly focused transducers offers the possibility of improving the spatial resolution of BSC-based imaging. The model by Chen et al. [1] was developed for estimating BSCs using transducers of arbitrary focal number. However, to date only preliminary experimental validation of this method has been performed. The goals of the present study are to analyze for the first time the accuracy of Chen’s [1] method when estimating BSCs using highly focused transducers through both simulations and experiments, and to analyze the accuracy of the estimation of QUS parameters derived from BSCs (specifically the effective scatterer size (ESD) and concentration (ESC)) applying the Chen et al. [1] model. To achieve these goals, a theoretical model of BSC synthesis based on the method of Chen et al. [1] was derived and used with simulated data. The model considers frequency dependent diffraction patterns, and the scatterers in the synthetic data replicate the properties of solid spheres. In experiments, data obtained using highly focused transducers from a physical phantom containing glass beads was used. This experimental data was appropriately compensated for attenuation and transmission effects. The accuracy of Chen’s method was evaluated by calculating the mean fractional error between the estimated and theoretical BSC curves for both simulations and experiments.
Also, the QUS parameters were estimated and compared with the known true parameters. BSCs and QUS parameter estimates were obtained from regions of interest both at the transducer focus and throughout the transducer focal region. Finally, the sound speed and the transducer focus were varied in appropriate ranges when processing the data for BSC and QUS estimation in order to assess the robustness of the method to uncertainties in these parameters. The results showed that BSCs and QUS parameters can be accurately estimated using highly focused transducers if the appropriate model is used, with regions of interest not restricted to be centered at the focus but extending over the full -6-dB transducer focal region. It was also verified that well estimated parameters, such as the sound speed and transducer focus, are necessary in order to obtain accurate BSC and QUS parameter estimates.
Characterization of healthy skin with high-frequency ultrasound using quantitative ultrasound
(Pontificia Universidad Católica del Perú, 2018-08-20) Saavedra Bazán, Ana Cecilia; Castañeda Aphan, Benjamín
The skin is the largest organ of the body and protects it from the external environment. High-frequency ultrasound (HF-US) has been used to visualize the skin in depth and to diagnose some pathologies in dermatological applications. Quantitative ultrasound (QUS) includes several techniques that provide values of particular physical properties. In this thesis work, three QUS parameters are explained and used to characterize healthy skin through HF-US: attenuation coefficient slope (ACS), backscatter coefficient (BSC) and shear wave speed (SWS). They were estimated with the regularized spectral-log difference (RSLD) method, the reference phantom method, and the crawling wave sonoelastography method, respectively. All three parameters were assessed in phantoms, ex vivo and in vivo skin. In calibrated phantoms, RSLD showed a reduction of up to 93% of the standard deviation relative to the estimation with SLD, and BSC showed agreement with Faran’s theoretical curve. In gelatin-based phantoms, surface acoustic waves (SAWs) were estimated at two interfaces, solid-water and solid-US gel, which allowed corroborating the presence of SAWs and finding an empirical compensation factor when the coupling interface is US gel. A correction factor of 0.97 for SAW-to-shear was found to avoid underestimation in phantoms. Porcine thigh was characterized in the range from 8 to 27 MHz, where the ACS was 4.08 ± 0.43 dB·cm⁻¹·MHz⁻¹ and the BSC was in the range from 10⁻¹ to 10⁰ sr⁻¹·cm⁻¹. The crawling wave sonoelastography method was applied for vibration frequencies between 200 Hz and 800 Hz, where SWS was in the range from 4.6 m/s to 9.1 m/s. In vivo ACS and BSC were assessed in the healthy forearm and thigh, whereas SWS was assessed only in the thigh. The average ACS in the forearm dermis was 2.07 dB·cm⁻¹·MHz⁻¹, which is in close agreement with the literature.
A significant difference (p < 0.05) was found between the ACS in the forearm dermis and the thigh dermis (average ACS of 2.54 dB·cm⁻¹·MHz⁻¹). The BSC of the forearm and thigh dermis were in the range from 10⁻¹ to 10⁰ sr⁻¹·cm⁻¹, and in the range from 10⁻¹ to 10⁰ sr⁻¹·cm⁻¹, respectively. The SWS in the thigh dermis was 2.4 ± 0.38 m/s for a vibration frequency of 200 Hz, with an increasing trend as frequency increases. Results suggest that these QUS parameters have the potential to be used as a tool for in vivo skin characterization and show potential for future application in skin lesions.
Computationally inexpensive parallel parking supervisor based on video processing
(Pontificia Universidad Católica del Perú, 2013-12-05) Espejo Pérez, Caterina María; Rodríguez Valderrama, Paúl Antonio
Parallel parking is, in general, a maneuver of moderate difficulty. Moreover, for inexperienced drivers, it can be a stressful situation that can lead to errors such as parking too far from the sidewalk or damaging another vehicle, resulting in traffic tickets that range from simple parking violations to crash-related violations. In this work, we propose a computationally effective approach to perform collision-free parallel parking. The method calculates the minimum parking space needed and then an efficient path for the parallel parking maneuver. This method is computationally inexpensive in comparison with the current state of the art. Moreover, it could be used by any car because the parameters needed to perform all computations are taken from the specifications of real cars. Preliminary results of this work were summarized in [1], which was presented at the 15th International IEEE Conference on Intelligent Transportation Systems. The simulation and experimental data show the effectiveness of the method. This effectiveness is assessed by comparing the path followed by the driver with the path calculated by the method. Images captured of the vehicle are used to obtain the path followed by the driver during parallel parking. Furthermore, road surface marks were determined (in a parking lot) as a visual aid for drivers performing the parallel parking maneuver. Analysis of the paths shows that the vehicles that properly followed the marks parked correctly.
Corn crops identification using multispectral images from unmanned aircraft systems
(Pontificia Universidad Católica del Perú, 2019-04-08) Trujillano Asato, Fedra Catherine; Racoceanu, Daniel
Climate change and migration of population from rural to urban areas are affecting agricultural production around the world. This study was based in the department of Ancash, Peru, where corn is one of the most important crops of the region. Authorities in this region are interested in finding a method, other than a census, that can constantly monitor corn crop areas. This data is important to evaluate how these two causes will impact food security in Ancash. The first part of the present thesis reviews the current techniques for the recognition of crop areas using remote sensing and multispectral images. The second part explains the methodology developed for this study, considering the data acquisition using Unmanned Aircraft Systems, the preparation of the acquired data and two deep learning model approaches. The first approach is based on binary classification of corn patches using the LeNet model with near infrared images. The second one describes the segmentation of corn areas in different stages using the U-Net model; in this case, five-band images were considered. The third part shows the results of both approaches. From these results it is concluded that training a model with data from different stages and scenarios of two campaigns (2016 and 2017) can achieve 95% accuracy in corn segmentation.
Deep Learning for Semantic Segmentation versus Classification in Computational Pathology: Application to mitosis analysis in Breast Cancer grading
(Pontificia Universidad Católica del Perú, 2019-04-12) Jiménez Garay, Gabriel Alexandro; Racoceanu, Daniel
Existing computational pathology approaches have not yet led to the emergence of effective and efficient computer-aided tools used as a second opinion for pathologists in daily practice. Focusing on the case of computer-based qualification for breast cancer diagnosis, the present work proposes two deep learning architectures to efficiently and effectively detect and classify mitosis in a histopathological tissue sample. The first method consisted of two parts: a preprocessing of the digital histological image and a Convolutional Neural Network (CNN), free of handcrafted features, used for binary classification. Results show that the proposed methodology can achieve 95% accuracy in testing with an F1-score of 94.35%, which is higher than the results from the literature using classical image processing techniques and also higher than the approaches using handcrafted features combined with CNNs. The second approach was an end-to-end methodology using semantic segmentation. Results showed that this algorithm can achieve an accuracy higher than 95% in testing and an average Dice index of 0.6 which is higher than the results from the literature using CNNs (0.9 F1-score). Additionally, due to the semantic properties of the deep learning approach, an end-to-end deep learning framework is viable to perform both tasks: detection and classification of mitosis. The results showed the potential of deep learning in the analysis of Whole Slide Images (WSI) and its integration into computer-aided systems. The extension of this work to whole slide images is also addressed in the last two chapters, as well as some key computational points that are useful when constructing a computer-aided system inspired by the described technology.
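The abstract above reports an F1-score and a Dice index. The following is a minimal sketch of how these two metrics are usually defined, written in plain Python. The function names and the flat 0/1 mask representation are assumptions for illustration; they are not taken from the thesis.

```python
def dice_index(pred, truth):
    """Dice index between two binary masks given as flat sequences of 0/1."""
    # Pixels marked as foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground pixels in each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def f1_score(tp, fp, fn):
    """F1-score from true positive, false positive and false negative counts."""
    denom = 2 * tp + fp + fn
    return 0.0 if denom == 0 else 2.0 * tp / denom
```

For binary masks the two quantities coincide when counts are taken per pixel; they differ in practice mainly in what is counted (pixels for segmentation, detected objects for classification).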
Desarrollo y comparación de diversos mapas de probabilidades en 3D del cáncer de próstata a partir de imágenes de histología
(Pontificia Universidad Católica del Perú, 2013-12-04) Díaz Rojas, Kristians Edgardo; Castañeda Aphan, Benjamín
Understanding the spatial distribution of prostate cancer and how it changes according to prostate specific antigen (PSA) values, Gleason score, and other clinical parameters may help comprehend the disease and increase the overall success rate of biopsies. This work aims to build 3D spatial distributions of prostate cancer and examine the extent and location of cancer as a function of independent clinical parameters. The border of the gland and cancerous regions from whole-mount histopathological images are used to reconstruct 3D models showing the localization of the tumor. This process utilizes color segmentation and interpolation based on mathematical morphological distance. 58 glands are deformed into one prostate atlas using a combination of rigid, affine, and B-spline deformable registration techniques. The spatial distribution is developed by counting the number of occurrences at a given position in 3D space from each registered prostate cancer. Finally, a difference between proportions is used to compare different spatial distributions. Results show that prostate cancer has a significant difference (SD) in the right zone of the prostate between populations with PSA greater and less than 5 ng/ml. Age does not have any impact on the spatial distribution of the disease. Positive and negative capsule-penetrated cases show an SD in the right posterior zone. There is an SD in almost all of the gland between cases with tumors larger and smaller than 10% of the whole prostate. A larger database is needed to improve the statistical validity of the test. Finally, information from whole-mount histopathological images could provide better insight into prostate cancer.
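The counting step and the comparison of proportions described above can be sketched as follows. This is an illustration only: the flat 0/1 voxel lists, the function names, and the choice of a pooled two-proportion z statistic are assumptions, since the abstract does not specify the exact test.

```python
from math import sqrt


def occurrence_map(masks):
    """Voxel-wise fraction of registered glands that contain tumor.

    `masks` is a list of equally sized flat 0/1 lists, one per registered gland.
    """
    n = len(masks)
    size = len(masks[0])
    return [sum(m[i] for m in masks) / n for i in range(size)]


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for the difference between two proportions.

    x1 of n1 cases in group 1 and x2 of n2 cases in group 2 show tumor
    at the voxel (or zone) being compared.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    # A zero standard error means both groups are all-0 or all-1.
    return 0.0 if se == 0 else (p1 - p2) / se
```

In a setting like this, the statistic would be evaluated per voxel or per zone for two populations (for example, split by a PSA threshold), with a correction for multiple comparisons chosen by the analyst.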
Efficient algorithms for convolutional dictionary learning via accelerated proximal gradient
(Pontificia Universidad Católica del Perú, 2019-04-05) Silva Obregón, Gustavo Manuel; Rodríguez Valderrama, Paul Antonio
Convolutional sparse representations and convolutional dictionary learning are mathematical models that represent a whole signal or image as a sum of convolutions between dictionary filters and coefficient maps. Unlike their patch-based counterparts, these convolutional forms are receiving increasing attention in multiple image processing tasks, since they do not present the usual patchwise drawbacks such as redundancy, multiple evaluations and lack of translation invariance. In particular, the convolutional dictionary learning (CDL) problem is addressed as an alternating minimization between coefficient update and dictionary update stages. A large number of algorithms based on FISTA (Fast Iterative Shrinkage-Thresholding Algorithm), ADMM (Alternating Direction Method of Multipliers) and ADMM consensus frameworks have been proposed to efficiently solve the most expensive steps of the CDL problem in the frequency domain. However, the use of the existing methods on large sets of images is computationally restricted by the dictionary update stage. The present thesis report is organized in three parts. In the first part, we introduce the general topic of the CDL problem and the state-of-the-art methods used to deal with each stage. In the second part, we propose our first computationally efficient method to solve the entire CDL problem using the Accelerated Proximal Gradient (APG) framework in both updates. Additionally, a novel update model reminiscent of the Block Gauss-Seidel (BGS) method is incorporated to reduce the number of estimated components during the coefficient update. In the final part, we propose an alternative method to address the dictionary update stage based on an APG consensus approach. This last method draws on particular strategies of the ADMM consensus and our first APG framework to develop a less complex solution decoupled across the training images.
In general, due to the lower number of operations, our first approach is a better serial option, while our last approach has the advantage of an independent and highly parallelizable structure. Finally, in our first set of experimental results, which is composed of serial implementations, we show that our first APG approach provides a significant speedup with respect to the standard methods, by a factor of 1.6 to 5.3. A complementary improvement by a factor of 2 is achieved by using the BGS-like model. On the other hand, we also report that the second APG approach is the fastest method compared to the state-of-the-art consensus algorithm implemented in serial and parallel. Both proposed methods maintain performance comparable to the other ones in terms of reconstruction metrics, such as PSNR, SSIM and sparsity, in denoising and inpainting tasks.
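For readers unfamiliar with the CDL problem, a commonly used statement of it is sketched below. The notation is assumed for illustration and is not taken from the thesis: $s_k$ are the $K$ training images, $d_m$ the $M$ dictionary filters, $x_{k,m}$ the coefficient maps, $*$ denotes convolution, and $\lambda$ is the sparsity weight.

```latex
\min_{\{d_m\},\,\{x_{k,m}\}} \;
\frac{1}{2} \sum_{k=1}^{K} \Bigl\| \sum_{m=1}^{M} d_m * x_{k,m} - s_k \Bigr\|_2^2
\;+\; \lambda \sum_{k=1}^{K} \sum_{m=1}^{M} \bigl\| x_{k,m} \bigr\|_1
\quad \text{s.t.} \quad \| d_m \|_2 \le 1 \;\; \forall m
```

In this form, the coefficient update fixes $\{d_m\}$ and minimizes over $\{x_{k,m}\}$, and the dictionary update does the reverse; the sum over $k$ in the data term is what couples the dictionary update across all training images.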
Evaluation of Elastographic techniques generated by means of external vibration
(Pontificia Universidad Católica del Perú, 2017-12-01) Arroyo Barboza, Johnny Junior; Castañeda Aphan, Benjamín; Salcudean, Tim
Breast cancer is one of the greatest problems of national and international public health, and its incidence among women shows an increasing trend. Nowadays there are several elastographic techniques, which seek to characterize the tissue, that is, to analyze the response produced by the application of a perturbation in the medium in order to describe its mechanical properties. Among the modalities used are ultrasound, nuclear magnetic resonance and optical coherence tomography. In turn, among the types of perturbation used are low frequency mechanical waves, a uniform compression force or acoustic radiation force. In this thesis work, ultrasound was used due to its low cost in comparison to the other modalities. In addition, the type of perturbation selected was external mechanical vibration, as it ensures quantitative results, carries no risk of temperature rise in the analyzed area and allows the results to be repeated. Hence, two elastographic techniques were the axes of the present work: vibro-elastography and normal vibration holography. For the first, a calibrated phantom and a gelatin-based phantom were used in order to characterize and validate the technique over a wide range of excitation frequencies. Subsequently, 18 patients were examined prior to biopsy, obtaining elastograms and contrasting them with the respective biopsy results. The results suggest that the technique is able to identify the presence of benign or malignant lesions, and the estimated elasticity agrees with values reported in the literature. The second technique is proposed in the elastography field for the first time. Based on holography, its experimental scheme is established, and the mathematical expression for shear speed estimation is presented.
Results from simulation and experiments performed on homogeneous and heterogeneous phantoms are presented, and the estimates are compared with previously obtained reference values. The results suggest that the estimates are close to the reference values for all media tested, and that the technique requires further study to mitigate artifact formation.
Evaluation of shear wave speed measurements using crawling waves sonoelastography and single tracking location acoustic radiation force impulse imaging
(Pontificia Universidad Católica del Perú, 2015-07-25) Ormachea Quispe, Juvenal; Castañeda Aphan, Benjamín; Parker, Kevin J.
Many pathological conditions are closely related to an increase in tissue stiffness. For many years, experts performed manual palpation in order to assess elasticity changes; however, this method can only be applied to superficial areas of the human body and provides a crude stiffness estimation. Elastography is a technique that attempts to characterize the elastic properties of tissue in order to provide additional and useful information for clinical diagnosis. For more than twenty years, different research groups have developed various elastography modalities, with a strong interest in quantitative images during the last decade. Recently, comparative studies among different elastographic techniques have been performed in order to better characterize biomaterials, to cross-validate several shear wave elastographic modalities and to study the factors that influence their precision and accuracy. These comparative works may contribute to achieving standardization in quantitative elastography and to its use in commercial equipment for application in human patients. However, the literature comparing quantitative elastography modalities is still limited. This thesis focuses on the comparison between two elastographic techniques: crawling wave sonoelastography (CWS) and single tracking location-acoustic radiation force impulse (STL-ARFI). The comparison covers the estimation of the shear wave speed (SWS), lateral resolution, contrast and contrast-to-noise ratio (CNR) in homogeneous and inhomogeneous phantoms using both techniques. The SWS values obtained with both modalities are validated with mechanical measurements that are considered as ground truth. The SWS results for the three different homogeneous phantoms (10%, 13%, and 16% gelatin concentrations) show good agreement between CWS, STL-ARFI and mechanical measurements as a function of frequency. The maximum accuracy errors obtained with CWS were 2.52%, 1.63% and 2.26%.
For STL-ARFI, the maximum errors were 6.22%, 5.63% and 4.08% for the 10%, 13% and 16% gelatin phantoms, respectively. For lateral resolution, contrast and CNR estimated in the inhomogeneous phantoms, CWS at vibration frequencies higher than 340 Hz presents better results than those obtained with STL-ARFI using distances between the push beams (Δx) greater than 4 mm. However, using these vibration frequencies will not be feasible for in vivo tissues due to attenuation problems. In that sense, for vibration frequencies lower than 300 Hz and Δx between 3 mm and 6 mm, comparable lateral resolution, contrast and CNR were obtained. Finally, the results of this study contribute to the data currently available for comparing elastographic techniques. Moreover, the methodology implemented in this document may be helpful for future standardization of different elastographic modalities.
Multi-scale image inpainting with label selection based on local statistics
(Pontificia Universidad Católica del Perú, 2014-09-09) Paredes Zevallos, Daniel Leoncio; Rodríguez Valderrama, Paúl Antonio
We propose a novel inpainting method that uses a multi-scale approach to speed up the well-known Markov Random Field (MRF) based inpainting method. MRF based inpainting methods are slow when compared with other exemplar-based methods, because their computational complexity is O(|L|^2), where L is the set of labels of the feasible solutions. Our multi-scale approach seeks to reduce the number of feasible labels in L by an appropriate selection of the labels using the information of the previous (low resolution) scale. For the initial label selection we use local statistics; moreover, to compensate for the loss of information in low resolution levels, we use features related to the original image gradient. Our computational results show that our approach is competitive, in terms of reconstruction quality, when compared to the original MRF based inpainting as well as other exemplar-based inpainting algorithms, while being at least one order of magnitude faster than the original MRF based inpainting and competitive with exemplar-based inpainting.
Object detection in videos using principal component pursuit and convolutional neural networks
(Pontificia Universidad Católica del Perú, 2018-05-03) Tejada Gamero, Enrique David; Rodríguez Valderrama, Paul Antonio
Object recognition in videos is one of the main challenges in computer vision. Several methods have been proposed to achieve this task, such as background subtraction, temporal differencing, optical flow, particle filtering among others. Since the introduction of Convolutonal Neural Networks (CNN) for object detection in the Imagenet Large Scale Visual Recognition Competition (ILSVRC), its use for image detection and classification has increased, becoming the state-of-the-art for such task, being Faster R-CNN the preferred model in the latest ILSVRC challenges. Moreover, the Faster R-CNN model, with minimum modifications, has been succesfully used to detect and classify objects (either static or dynamic) in video sequences; in such setup, the frames of the video are input “as is” i.e. without any pre-processing. In this thesis work we propose to use Robust PCA (RPCA, a.k.a. Principal Component Pursuit, PCP), as a video background modeling pre-processing step, before using the Faster R-CNN model, in order to improve the overall performance of detection and classification of, specifically, the moving objects. We hypothesize that such pre-processing step, which segments the moving objects from the background, would reduce the amount of regions to be analyzed in a given frame and thus (i) improve the classification time and (ii) reduce the error in classification for the dynamic objects present in the video. In particular, we use a fully incremental RPCA / PCP algorithm that is suitable for real-time or on-line processing. Furthermore, we present extensive computational results that were carried out in three different platforms: A high-end server with a Tesla K40m GPU, a desktop with a Tesla K10m GPU and the embedded system Jetson TK1. 
Our classification results attain competitive or superior performance in terms of F-measure, achieving an improvement ranging from 3.7% to 97.2%, with a mean improvement of 22%, when the sparse image was used to detect and classify the object with the neural network, while at the same time reducing the classification time on all architectures by a factor ranging between 2% and 25%.
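The low-rank plus sparse split behind RPCA / PCP can be illustrated with a minimal batch sketch based on the widely used inexact augmented Lagrangian iteration. This is not the incremental algorithm used in the thesis; the function names, default parameters and the frames-as-columns layout are assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(x, tau):
    """Element-wise shrinkage, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)


def pcp(D, lam=None, tol=1e-7, max_iter=500):
    """Batch PCP sketch: split D (one vectorized frame per column) into a
    low-rank part L (background) and a sparse part S (moving objects)."""
    m, n = D.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))      # common default weight
    norm_D = np.linalg.norm(D)              # Frobenius norm
    spectral = np.linalg.norm(D, 2)         # largest singular value
    mu, rho = 1.25 / spectral, 1.5          # penalty and its growth rate
    mu_max = mu * 1e7
    Y = D / max(spectral, np.abs(D).max() / lam)   # dual variable
    L = np.zeros_like(D)
    S = np.zeros_like(D)
    for _ in range(max_iter):
        # Low-rank update: singular value thresholding.
        U, sig, Vt = np.linalg.svd(D - S + Y / mu, full_matrices=False)
        L = (U * soft_threshold(sig, 1.0 / mu)) @ Vt
        # Sparse update: element-wise shrinkage.
        S = soft_threshold(D - L + Y / mu, lam / mu)
        residual = D - L - S
        Y = Y + mu * residual
        mu = min(mu * rho, mu_max)
        if np.linalg.norm(residual) <= tol * norm_D:
            break
    return L, S
```

In a detection pipeline of the kind described above, each column of S, reshaped to the frame size, would play the role of the sparse image passed to the detector.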
Regularized spectral log difference technique for ultrasonic attenuation imaging
(Pontificia Universidad Católica del Perú, 2017-07-13) Coila Pacompia, Andres Leonel; Lavarello Montero, Roberto Janniel
The attenuation coefficient slope (ACS) has the potential to be used for tissue characterization and as a diagnostic ultrasound tool, hence complementing B-mode images. The ACS can also be valuable for the estimation of other ultrasound parameters such as the backscatter coefficient. There is a well-known trade-off between the precision of the estimated ACS values and the data block size used in spectral-based techniques such as the spectral log difference (SLD). This trade-off limits the practical usefulness of spectral-based attenuation imaging techniques. In this thesis work, the regularized spectral log difference (RSLD) technique is presented in detail and evaluated with simulations and with experiments on physical phantoms, ex vivo and in vivo. The ACS values obtained with the RSLD technique were compared to the ones obtained with the SLD technique, as well as to the ground truth ACS values obtained with insertion loss techniques. The results showed that the RSLD technique significantly decreased the estimation variance when using small data block sizes (i.e., the standard deviation of the percentage error was reduced by more than an order of magnitude in all cases when using 10 × 10 data blocks) without sacrificing estimation accuracy. Therefore, the RSLD allows for the reconstruction of attenuation coefficient images with an improved trade-off between spatial resolution and estimation precision.
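As a rough illustration of the unregularized SLD idea, the sketch below estimates the ACS of a single data block from power spectra at a proximal and a distal depth, normalized by a reference phantom. It assumes a homogeneous block, round-trip power decay of the form exp(-4·α·f·z), and noise-free spectra; the function and parameter names are hypothetical, and the regularization that defines RSLD is not included.

```python
import numpy as np

NP_TO_DB = 20.0 / np.log(10.0)   # about 8.686 dB per neper


def sld_acs(f_mhz, s_prox, s_dist, r_prox, r_dist, dz_cm, acs_ref_db):
    """Estimate the sample ACS (dB/cm/MHz) for one data block.

    s_* are sample power spectra and r_* are reference-phantom power spectra
    at the proximal and distal windows, which are dz_cm apart.
    """
    # Log difference of the reference-normalized spectra. Under the stated
    # model it is linear in frequency: y(f) = 4 * (a_s - a_r) * dz * f + c.
    y = np.log(s_prox / r_prox) - np.log(s_dist / r_dist)
    slope, _ = np.polyfit(f_mhz, y, 1)
    return acs_ref_db + NP_TO_DB * slope / (4.0 * dz_cm)
```

With small data blocks the spectra are noisy and the fitted slope varies strongly from block to block, which is the precision problem the regularized formulation targets.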
Robust Minimum Variance Beamformer using Phase Aberration Correction Methods
(Pontificia Universidad Católica del Perú, 2017-04-28) Chau Loo Kung, Gustavo Ramón; Lavarello Montero, Roberto Janniel; Dahl, Jeremy J.
The minimum variance (MV) beamformer is an adaptive beamforming method that has the potential to enhance the resolution and contrast of ultrasound images. Although the sensitivity of the MV beamformer to steering vector errors and array calibration errors is well documented in other fields, in ultrasound it has been tested only under gross sound speed errors. Several robust MV beamformers have been proposed, but they have mainly reported robustness only in the presence of sound speed mismatches. Additionally, the impact of phase aberration correction (PAC) methods in mitigating the effects of phase aberration in MV beamformed images has not been observed. Accordingly, this thesis report consists of two parts. In the first part, a more complete analysis of the effects of different types of aberrators on conventional MV beamforming and on a robust MV beamformer from the literature, the Eigenspace-based Minimum Variance (ESMV) beamformer, is carried out, and the impact of three PAC algorithms on the performance of the MV beamformer (MV-PC) is analyzed. The comparison is carried out on Field II simulations and phantom experiments with electronic aberration and tissue aberrators. We conclude that the sensitivity to speed of sound errors and aberration limits the use of the MV beamformer in clinical applications, and that the effect of aberration is stronger than previously reported in the literature. Additionally, it is shown that under moderate and strong aberrating conditions, MV-PC is a preferable option to ESMV. In the second part, we propose a new, locally adaptive phase aberration correction method (LAPAC), able to improve both delay-and-sum (DAS) and MV beamformers, that integrates aberration correction for each point in the image domain into the formulation of the MV beamformer. The new method is tested using fullwave simulations of models of the human abdominal wall, experiments with tissue aberrators, and in vivo carotid images.
The LAPAC method is compared with conventional phase aberration correction with delay-and-sum beamforming (DAS-PC) and with MV-PC. The proposed method showed between 1 and 4 dB higher contrast than DAS-PC and MV-PC in all cases, and LAPAC-MV showed better performance than LAPAC-DAS. We conclude that LAPAC may be a viable option to enhance the ultrasound image quality of both DAS and MV in the presence of clinically relevant aberrating conditions.
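For readers unfamiliar with MV beamforming, the sketch below computes the textbook minimum variance weights w = R⁻¹a / (aᴴR⁻¹a) for one image point. It is a generic illustration rather than the LAPAC or ESMV formulation; the function name, the diagonal loading rule and its default value are assumptions.

```python
import numpy as np


def mv_weights(R, a, loading=1e-2):
    """Minimum variance weights for one image point.

    R is the M x M sample covariance of the delay-aligned channel data and
    a is the steering vector (all ones once the delays have been applied).
    Diagonal loading, scaled by the mean channel power, is a common way to
    keep the inversion stable; the scaling used here is an assumption.
    """
    M = R.shape[0]
    R_loaded = R + loading * (np.trace(R).real / M) * np.eye(M)
    Ri_a = np.linalg.solve(R_loaded, a)      # R^{-1} a without forming R^{-1}
    return Ri_a / (a.conj() @ Ri_a)          # enforces w^H a = 1
```

Because the weights depend on a, an error in the assumed delays (for example from phase aberration) changes the constraint itself, which is one way to see why the MV beamformer is sensitive to aberration.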
Separable dictionary learning for convolutional sparse coding via split updates
(Pontificia Universidad Católica del Perú, 2019-05-16) Quesada Pacora, Jorge Gerardo; Rodríguez Valderrama, Paul Antonio
The increasing ubiquity of Convolutional Sparse Representation techniques for several image processing tasks (such as object recognition and classification, as well as image denoising) has recently sparked interest in the use of separable 2D dictionary filter banks (as alternatives to standard non-separable dictionaries) for efficient Convolutional Sparse Coding (CSC) implementations. However, existing methods approximate a set of K non-separable filters via a linear combination of R (R << K) separable filters, which puts an upper bound on the latter's quality. Furthermore, this implies the need to first learn the whole set of non-separable filters and only then compute the separable set, which is not optimal from a computational perspective. In this context, the purpose of the present work is to propose a method to directly learn a set of K separable dictionary filters from a given image training set by drawing ideas from standard Convolutional Dictionary Learning (CDL) methods. We show that the separable filters obtained by the proposed method match the performance of an equivalent number of non-separable filters. Furthermore, this learning method is shown to be substantially faster than a state-of-the-art non-separable CDL method when either the image training set or the filter set is large. The method and results presented here have been published [1] at the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018). Furthermore, a preliminary approach (mentioned at the end of Chapter 2) was also published at ICASSP 2017 [2]. The document is organized as follows. Chapter 1 introduces the problem of interest and outlines the scope of this work. Chapter 2 provides the reader with a brief summary of the relevant literature on optimization, CDL and previous uses of separable filters. Chapter 3 presents the details of the proposed method and some implementation highlights.
Chapter 4 reports the attained computational results through several simulations. Chapter 5 summarizes the attained results and draws some final conclusions.
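The computational appeal of separable filters can be checked with a short sketch: convolving an image with a rank-1 filter, the outer product of a column vector v and a row vector u, gives the same result as a 1D pass along the columns followed by a 1D pass along the rows. The function names and the test sizes are illustrative assumptions; this is not the learning method of the thesis.

```python
import numpy as np


def conv2d_full(image, kernel):
    """Full 2D linear convolution computed with zero-padded FFTs."""
    shape = (image.shape[0] + kernel.shape[0] - 1,
             image.shape[1] + kernel.shape[1] - 1)
    spectrum = np.fft.fft2(image, shape) * np.fft.fft2(kernel, shape)
    return np.real(np.fft.ifft2(spectrum))


def conv2d_separable(image, v, u):
    """Convolve with the rank-1 filter outer(v, u) using two 1D passes."""
    # Pass 1: convolve every column with v (full mode is np.convolve's default).
    tmp = np.apply_along_axis(np.convolve, 0, image, v)
    # Pass 2: convolve every row of the intermediate result with u.
    return np.apply_along_axis(np.convolve, 1, tmp, u)
```

For a k × k filter applied directly in the spatial domain, the separable form needs on the order of 2k multiplications per pixel instead of k².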
Soft tissue characterization using different quantitative ultrasound modalities
(Pontificia Universidad Católica del Perú, 2019-10-24) Romero Gutierrez, Stefano Enrique; Castañeda Aphan, Benjamín; Lavarello Montero, Roberto Janniel
Quantitative ultrasound has been used in several modalities for different experiments, such as simulated phantoms, physical phantoms, and ex vivo and in vivo tissues. Ultrasound techniques have the potential to complement medical diagnosis. In this work, two quantitative ultrasound techniques are applied to in vivo experiments: crawling waves sonoelastography applied to the biceps brachii, and a regularized power law for the backscattering and attenuation coefficients applied to ovarian tumors. A crawling waves sonoelastography (CWS) method was applied using two mini-shakers making parallel contact (conventional setup) and normal contact with the surface of two phantoms (homogeneous and inhomogeneous), using the phase derivative algorithm to assess the performance of the normal excitation with well-known metrics such as error, coefficient of variation, signal-to-noise ratio and contrast-to-noise ratio. The results suggest that the normal excitation provides comparable stiffness estimation in homogeneous and inhomogeneous phantoms. For the in vivo test, the biceps brachii of healthy volunteers was assessed in two experiments: relaxed versus contracted, and under a range of load weights. The application of the normal setup indicated that a measurement of the relative stiffness of the biceps brachii can be obtained. The results indicated that incremental weight causes an increase in the stiffness of the biceps that follows a linear behavior. A regularized power law (RPL) method was implemented and tested with simulated phantoms using combinations of the data block size and the regularization parameters of the three variables of the backscattering and attenuation coefficients. The results showed that it is possible to provide accurate and precise backscattering and attenuation coefficients with the same algorithm. Additionally, in vivo breast experiments were performed and compared with the literature, obtaining comparable results.
Finally, tumors of patients with suspected ovarian cancer were assessed. The results suggest that the RPL method in general provides reasonable depictions of the reflectivity and attenuation of the interrogated media.