Big data in multi-block data analysis: An approach to parallelizing Partial Least Squares Mode B algorithm
dc.contributor | Universitat Ramon Llull. IQS | |
dc.contributor.author | Martinez Ruiz, Alba | |
dc.contributor.author | Montañola i Sales, Cristina | |
dc.date.accessioned | 2024-02-05T20:28:29Z | |
dc.date.available | 2024-02-05T20:28:29Z | |
dc.date.issued | 2019-04-29 | |
dc.identifier.issn | 2405-8440 | ca |
dc.identifier.uri | http://hdl.handle.net/20.500.14342/3857 | |
dc.description.abstract | Partial Least Squares (PLS) Mode B is a multi-block method and a tightly coupled algorithm for estimating structural equation models (SEMs). Describing key aspects of parallel computing, we approach the parallelization of the PLS Mode B algorithm to operate on large distributed data. We show the scalability and performance of the algorithm at a very fine-grained level thanks to the versatility of pbdR, an R-project library for parallel computing. We vary several factors under different data distribution schemes in a supercomputing environment. Shorter elapsed times are obtained for the square-blocking factor 16 × 16 using a grid of processors as square as possible and non-square blocking factors 1000 × 4 and 10000 × 4 using a one-column grid of processors. Depending on the configuration, distributing data across a larger number of cores allows reaching speedups of up to 121 over the CPU implementation. Moreover, we show that SEMs can be estimated with big data sets using current state-of-the-art algorithms for multi-block data analysis. | ca |
dc.format.extent | 29 p. | ca |
dc.language.iso | eng | ca |
dc.publisher | Elsevier | ca |
dc.relation.ispartof | Heliyon | ca |
dc.rights | © The author | ca |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject.other | Computer science | ca |
dc.subject.other | Computational mathematics | ca |
dc.subject.other | Big data | ca |
dc.subject.other | Dades massives | ca |
dc.title | Big data in multi-block data analysis: An approach to parallelizing Partial Least Squares Mode B algorithm | ca |
dc.type | info:eu-repo/semantics/article | ca |
dc.rights.accessLevel | info:eu-repo/semantics/openAccess | |
dc.embargo.terms | none | ca |
dc.subject.udc | 004 | ca |
dc.identifier.doi | https://doi.org/10.1016/j.heliyon.2019.e01451 | ca |
dc.description.version | info:eu-repo/semantics/publishedVersion | ca |