Publication:
Extracción y validación de biclusters a partir de bases de datos binarios (Extraction and validation of biclusters from binary databases)

dc.contributor.advisor: Aguilar-Ruiz, Jesús Salvador
dc.contributor.author: Rodríguez Baena, Domingo Savio
dc.date.accessioned: 2019-12-03T10:57:48Z
dc.date.available: 2019-12-03T10:57:48Z
dc.date.issued: 2012
dc.date.submitted: 2012-03-06
dc.description: Doctoral Programme in Software Engineering and Technology [es_ES]
dc.description.abstract: Binary datasets are a compact and simple way to store data about the relationships between a group of objects and their possible properties. In the last few years, several biclustering algorithms have been developed specifically for binary datasets. Approaches based on matrix factorization or divide-and-conquer techniques have been proposed to extract useful biclusters from binary data, and these approaches provide information about the distribution of patterns and intrinsic correlations. We propose a novel approach to extracting biclusters from binary datasets, BiBit. The results of experiments with synthetic data reveal the excellent performance of BiBit and its robustness to the density and size of the input data. BiBit is also applied to a central nervous system embryonic tumor gene expression dataset to test the quality of the results. A novel gene expression pre-processing methodology, based on expression level layers, and the selective search performed by BiBit, based on a very fast bit-pattern processing technique, provide very satisfactory results in quality and computational cost. The power of biclustering in finding genes involved simultaneously in different cancer processes is also shown. Finally, a comparison with Bimax, one of the most cited binary biclustering algorithms, shows that BiBit is faster while providing essentially the same results. In this work we also introduce a software tool, CarGene (Characterization of Genes), that helps scientists validate sets of genes using biological knowledge. The integration of huge databases with search techniques in order to automatically validate results from different sources is a key factor in bioinformatics. Several tools have been developed for analysing gene enrichment in annotation terms. Most of them are Gene Ontology-based, i.e., they analyse gene enrichment in GO annotations.
CarGene uses metabolic pathways stored in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and provides a friendly graphical environment to analyse and compare results generated by different clustering and/or biclustering techniques. CarGene is based on the degree of coherence of the genes in (bi)clusters with respect to the metabolic pathways of organisms stored in KEGG, and provides an estimate of the probability of obtaining the results by chance, including two statistical corrections (Bonferroni, and Westfall and Young). One of the most important features of CarGene is the possibility of simultaneously comparing and statistically analysing the information about many groups of genes, both visually and textually. Furthermore, it includes its own web browser to explore in detail the information extracted from KEGG. [es_ES]
dc.description.sponsorship: Universidad Pablo de Olavide de Sevilla. Departamento de Deporte e Informática [es_ES]
dc.description.version: Postprint [es_ES]
dc.format.mimetype: application/pdf
dc.identifier.uri: http://hdl.handle.net/10433/7298
dc.language.iso: es [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.accessRights: open access [es_ES]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.subject: Binary datasets [es_ES]
dc.subject: Algorithms [es_ES]
dc.subject: Biclusters [es_ES]
dc.title: Extracción y validación de biclusters a partir de bases de datos binarios [es_ES]
dc.type: doctoral thesis [es_ES]
dspace.entity.type: Publication
relation.isAdvisorOfPublication: 5ca8a962-86a4-4465-aad6-508a8e70adc7
relation.isAdvisorOfPublication.latestForDiscovery: 5ca8a962-86a4-4465-aad6-508a8e70adc7
relation.isAuthorOfPublication: fcc78511-f641-4285-9e74-d071e3e3c0e3
relation.isAuthorOfPublication.latestForDiscovery: fcc78511-f641-4285-9e74-d071e3e3c0e3

Files

Original bundle

Name: rodriguez-baena-tesis-11-12.pdf
Size: 8.33 MB
Format: Adobe Portable Document Format
