Person: Asencio Cortés, Gualberto
Associate Professor (Profesor/a Titular de Universidad)
Universidad Pablo de Olavide
Sports and Computer Science
Computer Languages and Systems
Publication: Predicción de estructuras de proteínas basada en vecinos más cercanos [Protein structure prediction based on nearest neighbors] (2013) Asencio Cortés, Gualberto; Aguilar-Ruiz, Jesús S.
Proteins are the biomolecules with the greatest structural diversity, and they perform many important functions in all living organisms. However, anomalies arise during protein formation that cause or facilitate the development of serious diseases such as cancer or Alzheimer's, which makes it vitally important to design drugs that prevent their disastrous consequences. Drug design of this kind requires structural models of proteins whose sequence is known but whose structure, in most cases, is still unknown. For this reason, predicting the structure of a protein from its amino acid sequence is key to curing this type of disease. This thesis analyzes in depth the state of the art of the problem of predicting the tertiary and quaternary structure of a protein, contributing various aspects and viewpoints on the most current and relevant methods in the literature. It also proposes a new method for predicting distance maps that represent protein structures, using a nearest-neighbor scheme that takes physicochemical properties of amino acids as input. Extensive experiments were carried out, and the results were analyzed from several points of view, highlighting various aspects of interest. Finally, the proposed methodology was applied to two groups of proteins of biological interest, viral proteins and mitochondrial proteins, with very promising results in both cases.
Publication: Earthquake Prediction in California Using Feature Selection Techniques (Springer, Cham, 2021-09-23) Roiz-Pagador, J.; Chacón Maldonado, Andrés Manuel; Ruiz, R.; Asencio Cortés, Gualberto
Predicting the magnitude of earthquakes is of vital importance and, at the same time, extremely complex: each attribute contributes differently to the process, and some even introduce noise. Preprocessing with attribute selection techniques helps to alleviate this drawback. This work demonstrates this through an extensive comparison on 47 years of data from the Northern California Earthquake Data Center, applying a wide range of feature selection algorithms that combine different search strategies (population-based, local, and ranking-based) with evaluators such as correlation, consistency, and distance metrics. Prediction algorithms are then used to compare the results with and without feature selection, showing that the number of attributes can be reduced by 80% while improving on the metrics obtained with the original attribute set, which suggests that attribute selection is quite promising for this type of problem.
Publication: Big data time series forecasting based on pattern sequence similarity and its application to the electricity demand (Elsevier, 2020-07-07) Pérez Chacón, Rubén; Asencio Cortés, Gualberto; Martínez Álvarez, Francisco; Troncoso, Alicia
This work proposes a novel algorithm to forecast big data time series. Based on the well-established Pattern Sequence-based Forecasting algorithm, this new approach makes two major contributions to the literature: first, it improves the accuracy of the original algorithm's predictions, and second, it adapts the algorithm to the big data context, achieving meaningful results in terms of scalability. The algorithm uses the Apache Spark distributed computation framework and is a ready-to-use application with few parameters to adjust.
Physical and cloud clusters were used to carry out the experiments, which consisted of applying the algorithm to real-world data on Uruguay's electricity demand.
Publication: Earthquake prediction in California using regression algorithms and cloud-based big data infrastructure (Elsevier, 2018-04-24) Asencio Cortés, Gualberto; Morales Esteban, Antonio; Shang, Xuei; Martínez-Álvarez, Francisco
Earthquake magnitude prediction is a challenging problem that has been widely studied over the last decades. Statistical, geophysical and machine learning approaches can be found in the literature, with no particularly satisfactory results. In recent years, powerful computational techniques for analyzing big data have emerged, making the analysis of massive datasets possible. These new methods make use of physical resources such as cloud-based architectures. California is known for being one of the regions with the highest seismic activity in the world, and a large amount of data is available. In this work, the use of several regression algorithms combined with ensemble learning is explored in the context of big data (a 1 GB catalog is used) in order to predict earthquake magnitude within the next seven days. The Apache Spark framework, the H2O library in the R language and Amazon cloud infrastructure were used, with very promising results.
Publication: A novel approach to forecast urban surface-level ozone considering heterogeneous locations and limited information (Elsevier, 2018-11-27) Gómez Losada, Álvaro; Asencio Cortés, Gualberto; Martínez Álvarez, Francisco; Riquelme Santos, José Cristobal; Martínez-Álvarez, Francisco
Surface ozone (O3) is considered a hazard to human health, and it affects vegetation, crops and ecosystems. Accurate forecasting of O3 in time and location can help protect citizens from unhealthy exposure when high levels are expected.
Usually, forecasting models use numerous O3 precursors as predictors, which limits their reproducibility to cases where such information is available from data providers. This study introduces a methodology for forecasting hourly O3 concentrations 24 h ahead, based on bagging and ensemble learning and using just two predictors with lagged O3 concentrations. The methodology was applied to ten-year time series (2006–2015) from three major urban areas of Andalusia (Spain). Its forecasting performance was contrasted with an algorithm specifically designed to forecast time series exhibiting temporal patterns. The proposed methodology outperforms the contrast algorithm and yields results comparable to others in the literature. Its use is encouraged for its forecasting performance and wide applicability, and also as a benchmark methodology.
Publication: PHILNet: A novel efficient approach for time series forecasting using deep learning (Elsevier, 2023-03-15) Jiménez Navarro, Manuel Jesús; Martínez Ballesteros, María; Martínez Álvarez, Francisco; Asencio Cortés, Gualberto
Time series are one of the most common data types in industry nowadays. Forecasting the future behavior of a time series can be useful for planning ahead, saving time and resources, and helping to avoid undesired scenarios. Forecasts are made from historical data, owing to the causal nature of time series. Several deep learning algorithms have been presented in this area, in which the input is processed through a series of non-linear functions to produce the output. We present a novel strategy to improve the efficiency of deep learning models in time series forecasting while reaching similar effectiveness. This approach separates the model into levels, starting with the easiest and continuing to the most difficult. The simpler levels deal with smoothed versions of the input, whereas the most sophisticated level deals with the raw data.
This strategy seeks to mimic the human learning process, in which basic tasks are completed first, followed by more precise and sophisticated ones. Our method achieved promising results, obtaining a 35% improvement in mean squared error and a 2.6-fold decrease in training time compared with the best models found in a variety of time series.
Publication: A new deep learning architecture with inductive bias balance for transformer oil temperature forecasting (Springer, 2023-05-28) Jiménez Navarro, Manuel Jesús; Martínez Ballesteros, María; Martínez Álvarez, Francisco; Asencio Cortés, Gualberto
Ensuring the optimal performance of power transformers is a laborious task in which the insulation system plays a vital role in reducing their deterioration. The insulation system uses insulating oil to control temperature, as high temperatures can reduce the lifetime of the transformers and lead to expensive maintenance. Deep learning architectures have demonstrated remarkable results in various fields. However, this improvement often comes at the cost of increased computing resources, which, in turn, increases the carbon footprint and hinders the optimization of architectures. In this study, we introduce a novel deep learning architecture that achieves efficacy comparable to the best existing architectures in transformer oil temperature forecasting while improving efficiency. Effective forecasting can help prevent high temperatures and monitor the future condition of power transformers, thereby reducing unnecessary waste. To balance the inductive bias in our architecture, we propose the Smooth Residual Block, which divides the original problem into multiple subproblems to obtain different representations of the time series, which collaboratively produce the final forecast. We applied our architecture to the Electricity Transformer datasets, which contain insulating oil temperature measurements from two transformers in China.
The results showed a 13% improvement in MSE and a 57% improvement in performance compared to the best current architectures, to the best of our knowledge. Moreover, we analyzed the architecture's behavior to gain an intuitive understanding of the achieved solution.
Publication: A new hybrid method for predicting univariate and multivariate time series based on pattern forecasting (Elsevier, 2021-12-18) Castán Lascorz, Miguel Ángel; Jiménez Herrera, Patricia; Troncoso, Alicia; Asencio Cortés, Gualberto
Time series forecasting has become indispensable for multiple applications and industrial processes. A large number of algorithms have been developed to forecast time series, each suitable depending on the characteristics and patterns to be inferred in each case. In this work, a new algorithm is proposed to predict both univariate and multivariate time series based on a combination of clustering, classification and forecasting techniques. The proposed algorithm first groups windows of time series values with similar patterns by applying a clustering process. Then, a specific forecasting model is built for each pattern and trained only with the time windows corresponding to that pattern. The new algorithm has been designed using a flexible framework that allows the model to be generated using any combination of approaches within multiple machine learning techniques. To evaluate the model, several experiments are carried out using different configurations of the clustering, classification and forecasting methods that the model consists of. The results are analyzed and compared to classical prediction models, such as autoregressive integrated moving average (ARIMA) and Holt-Winters models; to very recent forecasting methods, including deep long short-term memory neural networks; and to well-known methods in the literature, such as k-nearest neighbors, classification and regression trees, and random forest.
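The cluster-then-forecast idea summarized in the hybrid-method abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a plain k-means grouping of fixed-width windows, nearest-centroid assignment as the classification step, and a per-cluster mean of the next value as a stand-in for the per-pattern forecasting model. All function names (`make_windows`, `fit`, `predict`) are hypothetical.

```python
def make_windows(series, width):
    """Return (window, next_value) pairs from a univariate series."""
    return [(series[i:i + width], series[i + width])
            for i in range(len(series) - width)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centroids, window):
    """Index of the centroid closest to the window (the 'classification' step)."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(centroids[c], window))


def fit(series, width, k, iterations=10):
    """Cluster the windows, then build one trivial model per cluster."""
    pairs = make_windows(series, width)
    if not pairs:
        raise ValueError("series must be longer than the window width")
    # Assumption: centroids are initialised from the first k windows.
    centroids = [list(w) for w, _ in pairs[:k]]
    for _ in range(iterations):
        groups = [[] for _ in centroids]
        for w, _ in pairs:
            groups[nearest(centroids, w)].append(w)
        for c, members in enumerate(groups):
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    # Stand-in per-cluster model: the mean of the values that followed
    # the windows assigned to that cluster.
    targets = [[] for _ in centroids]
    for w, nxt in pairs:
        targets[nearest(centroids, w)].append(nxt)
    fallback = sum(nxt for _, nxt in pairs) / len(pairs)
    models = [sum(t) / len(t) if t else fallback for t in targets]
    return centroids, models


def predict(centroids, models, window):
    """Forecast the next value using the model of the window's cluster."""
    return models[nearest(centroids, window)]


series = [1, 2, 3] * 4
centroids, models = fit(series, width=2, k=3)
print(predict(centroids, models, [1, 2]))  # prints 3.0
```

In a fuller version, the per-cluster mean would be replaced by any regression model trained only on the windows of that cluster, which is the flexibility the abstract describes.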
Publication: DIAFAN-TL: An instance weighting-based transfer learning algorithm with application to phenology forecasting (Elsevier, 2022-08-22) Molina Cabanillas, Miguel Ángel; Jiménez Navarro, Manuel Jesús; Arjona Antolín, Ricardo; Martínez-Álvarez, Francisco; Asencio Cortés, Gualberto
The agricultural sector has been, and still is, the most important economic sector in many countries. Due to advances in technology, the amount and variety of available data have been increasing over the years. However, compared to other economic sectors, there is not always enough quality data for one particular domain (crops, plantations, plots) to obtain acceptable forecasting results with machine learning algorithms. In this context, transfer learning can help extract knowledge from different but related domains that have enough data and transfer it to a target domain with scarce data. This process can improve forecasting accuracy compared to training models solely with data from the target domain. In this work, a novel instance weighting-based transfer learning algorithm is proposed and applied to the phenology forecasting problem. A new metric named DIAFAN is proposed to weight samples from different source domains according to their relationship with the target domain, promoting the diversity of the information and avoiding inconsistent samples. Additionally, a set of validation schemes is specifically designed to ensure fair comparisons in terms of data volume with other benchmark transfer learning algorithms. The proposed algorithm, DIAFAN-TL, is tested with a proposed dataset of 16 plots of olive groves from different places, including information fusion from satellite images, meteorological stations and human field sampling of crop phenology. DIAFAN-TL achieves a remarkable improvement with respect to 15 other well-known transfer learning algorithms and three non-transfer learning scenarios.
Finally, several performance analyses according to the different phenological states, prediction horizons and source domains are also performed.
Publication: Pattern sequence-based algorithm for multivariate big data time series forecasting: Application to electricity consumption (Elsevier, 2024-01-22) Pérez Chacón, Rubén; Asencio Cortés, Gualberto; Martínez-Álvarez, Francisco; Troncoso, Alicia
Several interrelated variables typically characterize real-world processes, and a time series cannot be predicted without considering the influence that other time series might have on the target time series. This work proposes MV-bigPSF, a novel algorithm to forecast multivariate big data time series. This new general-purpose approach first performs pattern recognition jointly across all the time series that form the multivariate time series and then predicts the target time series by searching for similarities between pattern sequences. The proposed algorithm is designed to tackle multivariate time series forecasting problems within the context of big data. In particular, the algorithm has been developed with a distributed nature to enhance its efficiency in analyzing and processing large volumes of data. Moreover, the algorithm is straightforward to use, with only two parameters needing adjustment. Another advantage of the MV-bigPSF algorithm is its ability to perform multi-step forecasting, which is particularly useful in many practical applications. To evaluate the algorithm's performance, real-world data on Uruguay's power consumption were used. Specifically, MV-bigPSF was compared with both univariate and multivariate methods. Against the univariate methods, MV-bigPSF improved MAPE by 12.8% compared to the second-best method. In the multivariate comparison, MV-bigPSF improved MAPE by 44.8% with respect to the second most accurate method.
Regarding efficiency, MV-bigPSF ran 1.83 times faster than the second-fastest multivariate method, with both executed in a single-core environment. The proposed algorithm can therefore be a valuable tool for practitioners and researchers working in multivariate time series forecasting, particularly in big data applications.
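Several of the abstracts above report accuracy as MAPE (mean absolute percentage error). As a reading aid, the following minimal sketch shows the standard definition of the metric. The function name `mape` and the sample numbers are illustrative assumptions and are not taken from any of the listed papers. The abstracts do not state whether the reported improvements are relative or absolute differences in MAPE, so no such calculation is attempted here.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same non-zero length and that no
    actual value is zero (the metric is undefined in that case).
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Illustrative values only (hypothetical demand readings and forecasts).
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 180.0, 400.0]
print(round(mape(actual, forecast), 2))  # prints 6.67
```

Because each error is divided by the actual value, MAPE is scale-independent, which makes it convenient for comparing methods on demand or consumption series of different magnitudes.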