%0 Thesis
%A Rodríguez Baena, Domingo Savio
%T Extracción y validación de biclusters a partir de bases de datos binarios [Extraction and validation of biclusters from binary datasets]
%D 2012
%U http://hdl.handle.net/10433/7298
%X Binary datasets represent a compact and simple way to store data about the relationships between a group of objects and their possible properties. In the last few years, different biclustering algorithms have been specially developed to be applied to binary datasets. Several approaches based on matrix factorization or divide-and-conquer techniques have been proposed to extract useful biclusters from binary data, and these approaches provide information about the distribution of patterns and intrinsic correlations. We propose a novel approach to extracting biclusters from binary datasets, BiBit. The results obtained from different experiments with synthetic data reveal the excellent performance of BiBit and its robustness to the density and size of the input data. BiBit is also applied to a central nervous system embryonic tumor gene expression dataset to test the quality of the results. A novel gene expression pre-processing methodology, based on expression level layers, and the selective search performed by BiBit, based on a very fast bit-pattern processing technique, provide very satisfactory results in quality and computational cost. The power of biclustering in finding genes involved simultaneously in different cancer processes is also shown. Finally, a comparison with Bimax, one of the most cited binary biclustering algorithms, shows that BiBit is faster while providing essentially the same results. In addition, in this work we introduce a software tool, named CarGene (Characterization of Genes), that helps scientists validate sets of genes using biological knowledge. The integration of huge databases with search techniques in order to automatically validate results from different sources is a key factor in bioinformatics. Several tools have been developed for analysing gene enrichment in terms. Most of them are Gene Ontology-based tools, i.e., they analyse gene enrichment in GO annotations. CarGene uses metabolic pathways stored in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and provides a friendly graphical environment to analyse and compare results generated by different clustering and/or biclustering techniques. CarGene is based on the degree of coherence of genes in (bi)clusters with respect to metabolic pathways of organisms stored in KEGG, and provides an estimate of the probability of obtaining results by chance, including two statistical corrections (Bonferroni, and Westfall and Young). One of the most important features of CarGene is the possibility of simultaneously comparing and statistically analysing the information about many groups of genes in both visual and textual form. Furthermore, it includes its own web browser to explore in detail the information extracted from KEGG.
%K Binary datasets
%K Algorithms
%K Biclusters
%~