Use this identifier to cite or link to this item:
http://repositorio.ufla.br/jspui/handle/1/49940
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.creator | Ribeiro, Leonardo Andrade | - |
dc.creator | Borges, Felipe Ferreira | - |
dc.creator | Oliveira, Diego | - |
dc.date.accessioned | 2022-05-13T19:55:59Z | - |
dc.date.available | 2022-05-13T19:55:59Z | - |
dc.date.issued | 2021-09 | - |
dc.identifier.citation | RIBEIRO, L. A.; BORGES, F. F.; OLIVEIRA, D. Efficient set similarity join on multi-attribute data using lightweight filters. Journal of Information and Data Management, [S.l.], v. 12, n. 3, p. 226-241, Sept. 2021. | pt_BR |
dc.identifier.uri | https://sol.sbc.org.br/journals/index.php/jidm/article/view/1969 | pt_BR |
dc.identifier.uri | http://repositorio.ufla.br/jspui/handle/1/49940 | - |
dc.description.abstract | We consider the problem of efficiently answering set similarity joins on multi-attribute data. Traditional set similarity join algorithms assume string data represented by a single set and, thus, miss the opportunity to exploit predicates over multiple attributes to reduce the number of similarity computations. In this article, we present a framework to enhance existing algorithms with additional filters for dealing with multi-attribute data. We then instantiate this framework with a lightweight filtering technique based on a simple, yet effective data structure, for which exact and probabilistic implementations are evaluated. In this context, we devise a cost model to identify the best attribute ordering to reduce processing time. Moreover, alternative approaches are also investigated and a new algorithm combining key ideas from previous work is introduced. Finally, we present a thorough experimental evaluation, which demonstrates that our main proposal is efficient and significantly outperforms competing algorithms. | pt_BR |
dc.language | en_US | pt_BR |
dc.publisher | Brazilian Computer Society | pt_BR |
dc.rights | restrictAccess | pt_BR |
dc.source | Journal of Information and Data Management | pt_BR |
dc.subject | Advanced query processing | pt_BR |
dc.subject | Data cleaning | pt_BR |
dc.subject | Data integration | pt_BR |
dc.subject | Multi-attribute data | pt_BR |
dc.subject | Similarity join | pt_BR |
dc.title | Efficient set similarity join on multi-attribute data using lightweight filters | pt_BR |
dc.type | Article | pt_BR |
Appears in collections: | DCC - Articles published in journals |
Files associated with this item:
There are no files associated with this item.
Items in the repository are protected by copyright, with all rights reserved, unless otherwise indicated.