Procesamiento de Señales e Imágenes Digitales (Digital Signal and Image Processing)

Permanent URI for this collection: http://98.81.228.127/handle/20.500.12404/5040

  • Optimal vicinity 2D median filter for fixed-point or floating-point values
    (Pontificia Universidad Católica del Perú, 2024-06-19) Chang Fu, Javier; Carranza De La Cruz, Cesar Alberto
    Median filters are a nonlinear digital technique commonly used to remove white, "salt-and-pepper" noise from digital images. The technique replaces the value of each pixel with the median of the surrounding values. Floating-point implementations find the median with comparison-based sorting. A trivial method of sorting n elements has O(n²) complexity, and the fastest sorts have O(n log n) complexity when computing the median of n elements. However, these algorithms tend to suffer from strong divergence during execution. Other implementations use histogram-based algorithms, which perform best with large filter windows. These algorithms can evaluate median filters in constant time, that is, with O(1) complexity. This work proposes a fast and highly parallelizable median filter algorithm. It is based on divergence-free sorts that run in O(n log² n) and merges that run in O(n), with which groups of pixels can be computed in parallel. The method exploits the redundancy of values among neighboring pixels and finds the optimal processing vicinity that minimizes the average number of operations per pixel. The proposed algorithm (i) can process fixed-point or floating-point images interchangeably, (ii) takes full advantage of the parallelism of multiple architectures, (iii) has been implemented on CPU and GPU, and (iv) achieves a speedup over the state of the art.
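    The basic operation described in this abstract, replacing each pixel with the median of its neighborhood, can be illustrated with a minimal sketch. This is a naive sort-based baseline, not the algorithm proposed in the thesis; the function name `median_filter_2d`, the `radius` parameter, and the edge-replication padding are assumptions made for illustration.

```python
import numpy as np

def median_filter_2d(image, radius=1):
    """Naive sort-based median filter (illustrative baseline only)."""
    image = np.asarray(image)
    k = 2 * radius + 1                            # window side length
    padded = np.pad(image, radius, mode="edge")   # assumed border handling
    out = np.empty_like(image)
    rows, cols = image.shape
    for r in range(rows):
        for c in range(cols):
            window = padded[r:r + k, c:c + k].ravel()
            # Sorting the n = k*k window values costs O(n log n) comparisons.
            out[r, c] = np.sort(window)[window.size // 2]
    return out

# The same code handles integer (fixed-point-like) and floating-point data.
noisy = np.full((5, 5), 10, dtype=np.uint8)
noisy[2, 2] = 255   # isolated "salt" pixel
noisy[0, 4] = 0     # isolated "pepper" pixel
clean = median_filter_2d(noisy, radius=1)
clean_float = median_filter_2d(noisy.astype(np.float32), radius=1)
```

    Every pixel recomputes a full sort here, which is exactly the redundancy between neighboring windows that the abstract says the proposed method exploits.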
  • Joint reconstruction techniques for ultrasonic attenuation imaging
    (Pontificia Universidad Católica del Perú, 2024-05-08) Miranda Zárate, Edmundo Arom; Lavarello Montero, Roberto Janniel
    Quantitative ultrasound (QUS) is a noninvasive imaging modality that numerically characterizes tissues for medical diagnosis. QUS estimators are based on acoustic parameters such as the attenuation coefficient slope (ACS). A previous study proposed denoising the spectral log ratios using single-channel total variation across frequency. The spectral method for estimating the ACS, known as the spectral log difference (SLD), does not incorporate any joint reconstruction strategy to improve the image. This work therefore proposes integrating two joint strategies compatible with the SLD framework. First, a joint regularization approach called total nuclear variation (TNV-SLD) is implemented; it combines geometric information from the ACS and the backscatter coefficient (BSC) component to improve image quality, achieving better results in terms of mean percentage error (MPE) and contrast-to-noise ratio (CNR). The study is then extended to jointly denoise the SLD spectral log ratios across the frequency channels. A joint multifrequency method is proposed to increase the quality of the attenuation images. Two modifications of total variation were considered, based on the Frobenius (TFV) and nuclear (TNV) norms. The metrics were compared against two earlier regularization methods, RSLD and TVSLD, which rely on single-channel total variation, using simulated and experimental phantom data and an ex vivo tissue sample. The results showed better overall performance of the TNV method for both strategies, producing improved ACS maps and extending the trade-off between spatial resolution and estimation variability in terms of CNR with stable bias.
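    The Frobenius-norm and nuclear-norm variants of total variation named in this abstract can be sketched for a generic multichannel image. This assumes one common definition (per-pixel norm of the forward-difference Jacobian, summed over pixels); the thesis's exact discretization and the data it is applied to are not given here, and all function names are hypothetical.

```python
import numpy as np

def jacobian_field(u):
    """Forward-difference Jacobian of a (C, H, W) stack, shape (H, W, C, 2)."""
    u = np.asarray(u, dtype=float)
    dx = np.zeros_like(u)
    dy = np.zeros_like(u)
    dx[:, :, :-1] = u[:, :, 1:] - u[:, :, :-1]   # horizontal differences
    dy[:, :-1, :] = u[:, 1:, :] - u[:, :-1, :]   # vertical differences
    jac = np.stack([dx, dy], axis=-1)            # (C, H, W, 2)
    return np.moveaxis(jac, 0, 2)                # (H, W, C, 2)

def total_nuclear_variation(u):
    """Sum over pixels of the nuclear norm (sum of singular values)."""
    s = np.linalg.svd(jacobian_field(u), compute_uv=False)
    return float(s.sum())

def total_frobenius_variation(u):
    """Sum over pixels of the Frobenius norm of the Jacobian."""
    j = jacobian_field(u)
    return float(np.sqrt((j ** 2).sum(axis=(-2, -1))).sum())

# Two channels sharing the same vertical edge (aligned gradients).
step_x = np.zeros((4, 4)); step_x[:, 2:] = 1.0
aligned = np.stack([step_x, step_x])

# Two channels with edges in different orientations (misaligned gradients).
step_y = np.zeros((4, 4)); step_y[2:, :] = 1.0
misaligned = np.stack([step_x, step_y])

tnv_aligned = total_nuclear_variation(aligned)
tfv_aligned = total_frobenius_variation(aligned)
tnv_mis = total_nuclear_variation(misaligned)
tfv_mis = total_frobenius_variation(misaligned)
```

    With aligned edges the Jacobian has rank one and both penalties agree; with misaligned edges the nuclear norm is larger, so a nuclear-norm penalty favors channels whose edges coincide.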
  • Novel Edge-Preserving Filtering Model Based on the Quadratic Envelope of the l0 Gradient Regularization
    (Pontificia Universidad Católica del Perú, 2023-01-26) Vásquez Ortiz, Eduar Aníbal; Rodríguez Valderrama, Paul Antonio
    In image processing, the l0 gradient regularization (l0-grad) is an inverse problem that penalizes the l0 norm of the reconstructed image's gradient. Current state-of-the-art algorithms for solving this problem are based on the alternating direction method of multipliers (ADMM). However, l0-grad reconstructs images poorly when the noise level is high, producing flat regions with abrupt changes between them that look heavily distorted. This happens because it prioritizes keeping the main edges at the risk of losing important details when the images are too noisy. Furthermore, since ‖∇u‖₀ is a non-continuous and non-convex regularizer, l0-grad cannot be solved directly by methods such as the accelerated proximal gradient (APG). This thesis presents a novel edge-preserving filtering model (Ql0-grad) that uses a relaxed form of the quadratic envelope of the l0 norm of the gradient. This makes it possible to control the level of detail lost during denoising and deblurring. The Ql0-grad model can be seen as a mixture of the Total Variation and l0-grad models. The results for the denoising and deblurring problems show that the model sharpens major edges while strongly attenuating textures. Compared with the l0-grad model, it reconstructed images with flat, texture-free regions that had smooth changes between them, even when the input image was corrupted with a large amount of noise. Furthermore, the average differences between the metrics obtained with Ql0-grad and l0-grad were +0.96 dB SNR (signal-to-noise ratio), +0.96 dB PSNR (peak signal-to-noise ratio) and +0.03 SSIM (structural similarity index measure). An early version of the model was presented in the paper "Fast gradient-based algorithm for a quadratic envelope relaxation of the l0 gradient regularization", published in the indexed international conference proceedings of the XXIII Symposium on Image, Signal Processing and Artificial Vision.
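    To make the two regularizers mentioned in this abstract concrete, the following sketch evaluates the l0 norm of the gradient (a count of nonzero gradient locations) and the isotropic total variation (a sum of gradient magnitudes) on two toy images. It only evaluates the penalties; it is not the Ql0-grad model or any solver from the thesis, and the function names and forward-difference discretization are assumptions.

```python
import numpy as np

def forward_gradient(u):
    """Forward differences with a zero gradient at the last row/column."""
    u = np.asarray(u, dtype=float)
    gx = np.zeros_like(u)
    gy = np.zeros_like(u)
    gx[:, :-1] = u[:, 1:] - u[:, :-1]
    gy[:-1, :] = u[1:, :] - u[:-1, :]
    return gx, gy

def l0_gradient(u, tol=1e-12):
    """Number of pixels whose gradient magnitude is nonzero."""
    gx, gy = forward_gradient(u)
    return int(np.count_nonzero(np.hypot(gx, gy) > tol))

def total_variation(u):
    """Isotropic total variation: sum of gradient magnitudes."""
    gx, gy = forward_gradient(u)
    return float(np.hypot(gx, gy).sum())

# A sharp step and a smooth ramp with the same total rise per row.
step = np.tile(np.array([0.0, 0.0, 1.0, 1.0]), (4, 1))
ramp = np.tile(np.linspace(0.0, 1.0, 4), (4, 1))

l0_step, l0_ramp = l0_gradient(step), l0_gradient(ramp)
tv_step, tv_ramp = total_variation(step), total_variation(ramp)
```

    Total variation cannot distinguish the step from the ramp, while the l0 count strongly prefers the step, which is consistent with the flat regions and abrupt transitions the abstract attributes to l0-grad.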
  • Soft tissue characterization using different quantitative ultrasound modalities
    (Pontificia Universidad Católica del Perú, 2019-10-24) Romero Gutierrez, Stefano Enrique; Castañeda Aphan, Benjamín; Lavarello Montero, Roberto Janniel
    Quantitative ultrasound has been used in several modalities for different experiments, including simulated phantoms, physical phantoms, and ex vivo and in vivo tissues. These ultrasound techniques have the potential to complement medical diagnosis. In this work, two quantitative ultrasound techniques are applied in in vivo experiments: crawling wave sonoelastography applied to the biceps brachii, and a regularized power law for backscatter and attenuation coefficients applied to ovarian tumors. A crawling wave sonoelastography (CWS) method was applied using two mini-shakers making parallel contact (conventional setup) and normal contact with the surface of two phantoms (homogeneous and inhomogeneous). The phase derivative algorithm was used to assess the performance of the normal excitation with well-known metrics such as error, coefficient of variation, signal-to-noise ratio and contrast-to-noise ratio. The results suggest that the normal excitation provides comparable stiffness estimation in homogeneous and inhomogeneous phantoms. For the in vivo test, the biceps brachii of healthy volunteers was assessed in two experiments: relaxed versus contracted, and under a range of load weights. The application of the normal setup indicated that the relative stiffness of the biceps brachii can be measured. The results indicated that increasing the weight increases the stiffness of the biceps, following a linear behavior. A regularized power law (RPL) method was implemented and tested with simulated phantoms using combinations of the data block size and the regularization parameters of the three variables of the backscatter and attenuation coefficients. The results showed that it is possible to provide accurate and precise backscatter and attenuation coefficients with the same algorithm. Additionally, in vivo breast experiments were performed, and the results were comparable with the literature. Finally, tumors of patients with suspected ovarian cancer were assessed. The results suggest that the RPL method in general provides reasonable depictions of the reflectivity and attenuation of the interrogated media.
  • Separable dictionary learning for convolutional sparse coding via split updates
    (Pontificia Universidad Católica del Perú, 2019-05-16) Quesada Pacora, Jorge Gerardo; Rodriguez Valderrama, Paul Antonio
    The increasing ubiquity of Convolutional Sparse Representation techniques for several image processing tasks (such as object recognition and classification, as well as image denoising) has recently sparked interest in the use of separable 2D dictionary filter banks (as alternatives to standard non-separable dictionaries) for efficient Convolutional Sparse Coding (CSC) implementations. However, existing methods approximate a set of K non-separable filters via a linear combination of R (R << K) separable filters, which puts an upper bound on the latter's quality. Furthermore, this implies the need to first learn the whole set of non-separable filters, and only then compute the separable set, which is not optimal from a computational perspective. In this context, the purpose of the present work is to propose a method to directly learn a set of K separable dictionary filters from a given image training set by drawing ideas from standard Convolutional Dictionary Learning (CDL) methods. We show that the separable filters obtained by the proposed method match the performance of an equivalent number of non-separable filters. Furthermore, this learning method is shown to be substantially faster than a state-of-the-art non-separable CDL method when either the image training set or the filter set is large. The method and results presented here have been published [1] at the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2018). Furthermore, a preliminary approach (mentioned at the end of Chapter 2) was also published at ICASSP 2017 [2]. The document is organized as follows. Chapter 1 introduces the problem of interest and outlines the scope of this work. Chapter 2 provides the reader with a brief summary of the relevant literature in optimization, CDL and previous use of separable filters. Chapter 3 presents the details of the proposed method and some implementation highlights. Chapter 4 reports the attained computational results through several simulations. Chapter 5 summarizes the attained results and draws some final conclusions.
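    The computational appeal of separable filters mentioned in this abstract can be illustrated with a minimal sketch: a 2D filter that is the outer product of two 1D filters can be applied as two 1D convolution passes. This shows only the separability property, not the dictionary learning method of the thesis; the function names and the random test data are assumptions.

```python
import numpy as np

def conv2d_full(image, kernel):
    """Direct 'full' 2D convolution, used here as a reference."""
    h, w = image.shape
    kh, kw = kernel.shape
    out = np.zeros((h + kh - 1, w + kw - 1))
    for i in range(kh):
        for j in range(kw):
            # Each kernel tap adds a shifted, scaled copy of the image.
            out[i:i + h, j:j + w] += kernel[i, j] * image
    return out

def conv2d_separable(image, col_filter, row_filter):
    """Same result with two 1D passes: along columns, then along rows."""
    tmp = np.apply_along_axis(np.convolve, 0, image, col_filter)
    return np.apply_along_axis(np.convolve, 1, tmp, row_filter)

rng = np.random.default_rng(0)
image = rng.standard_normal((16, 16))
col_filter = rng.standard_normal(5)
row_filter = rng.standard_normal(5)

kernel = np.outer(col_filter, row_filter)   # rank-one (separable) 2D filter
direct = conv2d_full(image, kernel)
separable = conv2d_separable(image, col_filter, row_filter)
```

    For a k-by-k filter the direct form uses k² multiplications per output pixel, whereas the two-pass form uses about 2k.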
  • Deep Learning for Semantic Segmentation versus Classification in Computational Pathology: Application to mitosis analysis in Breast Cancer grading
    (Pontificia Universidad Católica del Perú, 2019-04-12) Jiménez Garay, Gabriel Alexandro; Racoceanu, Daniel
    Existing computational pathology approaches have not yet enabled the emergence of effective and efficient computer-aided tools that pathologists can use as a second opinion in daily practice. Focusing on the case of computer-based qualification for breast cancer diagnosis, the present article proposes two deep learning architectures to efficiently and effectively detect and classify mitosis in a histopathological tissue sample. The first method consists of two parts: preprocessing of the digital histological image, and a Convolutional Neural Network (CNN) free of handcrafted features used for binary classification. Results show that the proposed methodology can achieve 95% accuracy in testing with an F1-score of 94.35%, which is higher than the results from the literature using classical image processing techniques and also higher than the approaches using handcrafted features combined with CNNs. The second approach is an end-to-end methodology using semantic segmentation. Results show that this algorithm can achieve an accuracy higher than 95% in testing and an average Dice index of 0.6, which is higher than the results from the literature using CNNs (0.9 F1-score). Additionally, due to the semantic properties of the deep learning approach, an end-to-end deep learning framework is viable for performing both tasks: detection and classification of mitosis. The results show the potential of deep learning in the analysis of Whole Slide Images (WSI) and its integration into computer-aided systems. The extension of this work to whole slide images is also addressed in the last two chapters, along with some key computational points that are useful when constructing a computer-aided system inspired by the described technology.
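    This abstract reports both an F1-score and a Dice index. As a minimal sketch (not the thesis's evaluation code), both can be computed from a pair of binary masks as shown below; the function names and toy masks are hypothetical. The thesis may compute the two metrics over different units (for example detected objects versus pixels), which this sketch does not attempt to reproduce.

```python
import numpy as np

def dice_index(pred, target):
    """Dice index: 2 * |A and B| / (|A| + |B|) for binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0  # convention assumed here: two empty masks agree
    return float(2.0 * np.logical_and(pred, target).sum() / total)

def f1_score(pred, target):
    """F1-score from true positives, false positives, false negatives."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = np.logical_and(pred, target).sum()
    fp = np.logical_and(pred, ~target).sum()
    fn = np.logical_and(~pred, target).sum()
    denom = 2 * tp + fp + fn
    if denom == 0:
        return 1.0  # same convention as above
    return float(2.0 * tp / denom)

pred = np.array([[1, 1, 0],
                 [0, 1, 0]])
target = np.array([[1, 0, 0],
                   [0, 1, 1]])

dice = dice_index(pred, target)
f1 = f1_score(pred, target)
```

    When both metrics are evaluated on the same pair of binary masks they give the same value, since |A| + |B| equals 2·TP + FP + FN.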
  • Corn crops identification using multispectral images from unmanned aircraft systems
    (Pontificia Universidad Católica del Perú, 2019-04-08) Trujillano Asato, Fedra Catherine; Racoceanu, Daniel
    Climate change and the migration of population from rural to urban areas are affecting agricultural production around the world. This study is based in the department of Ancash, Peru, where corn is one of the most important crops. Authorities in this region are interested in finding a method, other than a census, that can constantly monitor corn crop areas. These data are important for evaluating how these two factors will impact food security in Ancash. The first part of the present thesis reviews current techniques for the recognition of crop areas using remote sensing and multispectral images. The second part explains the methodology developed for this study, covering data acquisition using Unmanned Aircraft Systems, the preparation of the acquired data, and two deep learning approaches. The first approach is based on binary classification of corn patches using the LeNet model with near-infrared images. The second describes the segmentation of corn areas at different growth stages using the U-Net model; in this case, five-band images were considered. The third part shows the results of both approaches. From these results it is concluded that training a model with data from different stages and scenarios of two campaigns (2016 and 2017) can achieve 95% accuracy in corn segmentation.
  • Efficient algorithms for convolutional dictionary learning via accelerated proximal gradient
    (Pontificia Universidad Católica del Perú, 2019-04-05) Silva Obregón, Gustavo Manuel; Rodríguez Valderrama, Paul Antonio
    Convolutional sparse representations and convolutional dictionary learning are mathematical models that represent a whole signal or image as a sum of convolutions between dictionary filters and coefficient maps. Unlike their patch-based counterparts, these convolutional forms are receiving increasing attention in multiple image processing tasks, since they do not present the usual patch-wise drawbacks such as redundancy, multiple evaluations and lack of translation invariance. In particular, the convolutional dictionary learning (CDL) problem is addressed as an alternating minimization between coefficient update and dictionary update stages. Many algorithms based on the FISTA (Fast Iterative Shrinkage-Thresholding Algorithm), ADMM (Alternating Direction Method of Multipliers) and ADMM consensus frameworks have been proposed to efficiently solve the most expensive steps of the CDL problem in the frequency domain. However, the use of the existing methods on large sets of images is computationally restricted by the dictionary update stage. The present thesis report is organized in three parts. In the first part, we introduce the general topic of the CDL problem and the state-of-the-art methods used to deal with each stage. In the second part, we propose our first computationally efficient method to solve the entire CDL problem using the Accelerated Proximal Gradient (APG) framework in both updates. Additionally, a novel update model reminiscent of the Block Gauss-Seidel (BGS) method is incorporated to reduce the number of estimated components during the coefficient update. In the final part, we propose an alternative method to address the dictionary update stage based on an APG consensus approach. This last method draws on particular strategies of the ADMM consensus and our first APG framework to develop a less complex solution decoupled across the training images. In general, due to its lower number of operations, our first approach is the better serial option, while our last approach has the advantage of an independent and highly parallelizable structure. Finally, in our first set of experimental results, which is composed of serial implementations, we show that our first APG approach provides a significant speedup with respect to the standard methods, by a factor of 1.6 to 5.3. A complementary improvement by a factor of 2 is achieved by using the BGS-like model. We also report that the second APG approach is the fastest method compared with the state-of-the-art consensus algorithm implemented in serial and in parallel. Both proposed methods maintain performance comparable to the other methods in terms of reconstruction metrics, such as PSNR, SSIM and sparsity, in denoising and inpainting tasks.
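    The convolutional model described in this abstract, a signal written as a sum of convolutions between dictionary filters and sparse coefficient maps, can be sketched in one dimension. The sketch also shows soft thresholding, the proximal operator of the l1 norm used inside FISTA/APG-type iterations, assuming the usual l1 sparsity penalty. The names `synthesize` and `soft_threshold` and the toy data are hypothetical, and this is not the thesis's algorithm.

```python
import numpy as np

def synthesize(filters, coef_maps):
    """Reconstruct s = sum_k d_k * x_k using 1D 'full' convolutions."""
    return sum(np.convolve(d, x) for d, x in zip(filters, coef_maps))

def soft_threshold(x, lam):
    """Proximal operator of lam * ||x||_1 (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

# Two length-3 filters and two sparse coefficient maps of length 8.
filters = [np.array([1.0, -1.0, 0.0]), np.array([1.0, 2.0, 1.0])]
x1 = np.zeros(8); x1[2] = 3.0
x2 = np.zeros(8); x2[5] = -1.0

signal = synthesize(filters, [x1, x2])

# Shrinkage zeroes small coefficients and reduces the magnitude of the rest.
shrunk = soft_threshold(np.array([3.0, -0.5, 0.2, -2.0]), 1.0)
```

    Each nonzero coefficient places a scaled copy of its filter at that position, so a few nonzeros in the maps describe the whole signal without splitting it into patches.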
  • Characterization of healthy skin with high-frequency ultrasound using quantitative ultrasound
    (Pontificia Universidad Católica del Perú, 2018-08-20) Saavedra Bazán, Ana Cecilia; Castañeda Aphan, Benjamín
    The skin is the largest organ of the body and protects it from the external environment. High-frequency ultrasound (HF-US) has been used to visualize the skin in depth and to diagnose some pathologies in dermatological applications. Quantitative ultrasound (QUS) includes several techniques that provide values of particular physical properties. In this thesis work, three QUS parameters are explained and used to characterize healthy skin through HF-US: attenuation coefficient slope (ACS), backscatter coefficient (BSC) and shear wave speed (SWS). They were estimated with the regularized spectral-log difference (RSLD) method, the reference phantom method, and the crawling wave sonoelastography method, respectively. All three parameters were assessed in phantoms and in ex vivo and in vivo skin. In calibrated phantoms, RSLD reduced the standard deviation by up to 93% relative to estimation with SLD, and the BSC agreed with Faran's theoretical curve. In gelatin-based phantoms, surface acoustic waves (SAWs) were estimated at two interfaces, solid-water and solid-US gel, which allowed corroborating the presence of SAWs and finding an empirical compensation factor when the coupling interface is US gel. A SAW-to-shear correction factor of 0.97 was found to avoid underestimation in phantoms. Porcine thigh was evaluated in the range from 8 to 27 MHz, where the ACS was 4.08 ± 0.43 dB·cm⁻¹·MHz⁻¹ and the BSC was in the range from 10⁻¹ to 10⁰ sr⁻¹·cm⁻¹. The crawling wave sonoelastography method was applied for vibration frequencies between 200 Hz and 800 Hz, where the SWS was in the range from 4.6 m/s to 9.1 m/s. In vivo ACS and BSC were assessed in the healthy forearm and thigh, whereas SWS was assessed only in the thigh. The average ACS in the forearm dermis was 2.07 dB·cm⁻¹·MHz⁻¹, which is in close agreement with the literature. A significant difference (p < 0.05) was found between the ACS in the forearm dermis and the thigh dermis (average ACS of 2.54 dB·cm⁻¹·MHz⁻¹). The BSC of the forearm dermis and of the thigh dermis were both in the range from 10⁻¹ to 10⁰ sr⁻¹·cm⁻¹. The SWS in the thigh dermis was 2.4 ± 0.38 m/s for a vibration frequency of 200 Hz, with an increasing trend as frequency increases. Results suggest that these QUS parameters have the potential to be used as a tool for in vivo skin characterization and for future application to skin lesions.
  • Object detection in videos using principal component pursuit and convolutional neural networks
    (Pontificia Universidad Católica del Perú, 2018-05-03) Tejada Gamero, Enrique David; Rodríguez Valderrama, Paul Antonio
    Object recognition in videos is one of the main challenges in computer vision. Several methods have been proposed for this task, such as background subtraction, temporal differencing, optical flow and particle filtering, among others. Since the introduction of Convolutional Neural Networks (CNN) for object detection in the ImageNet Large Scale Visual Recognition Competition (ILSVRC), their use for image detection and classification has increased, becoming the state of the art for this task, with Faster R-CNN being the preferred model in the latest ILSVRC challenges. Moreover, the Faster R-CNN model, with minimal modifications, has been successfully used to detect and classify objects (either static or dynamic) in video sequences; in such a setup, the frames of the video are input "as is", i.e. without any pre-processing. In this thesis work we propose to use Robust PCA (RPCA, a.k.a. Principal Component Pursuit, PCP) as a video background modeling pre-processing step before the Faster R-CNN model, in order to improve the overall performance of detection and classification of, specifically, the moving objects. We hypothesize that such a pre-processing step, which segments the moving objects from the background, would reduce the number of regions to be analyzed in a given frame and thus (i) improve the classification time and (ii) reduce the classification error for the dynamic objects present in the video. In particular, we use a fully incremental RPCA / PCP algorithm that is suitable for real-time or online processing. Furthermore, we present extensive computational results obtained on three different platforms: a high-end server with a Tesla K40m GPU, a desktop with a Tesla K10m GPU, and the Jetson TK1 embedded system. Our classification results attain competitive or superior performance in terms of F-measure, with an improvement ranging from 3.7% to 97.2% and a mean improvement of 22% when the sparse image was used to detect and classify the object with the neural network, while at the same time reducing the classification time on all architectures by between 2% and 25%.