Enriching an authority file of scientific conferences with information extracted from the web

dc.creator: Jesus, Heider Alvarenga de
dc.creator: Pereira, Denilson Alves
dc.date.accessioned: 2018-07-27T11:50:24Z
dc.date.available: 2018-07-27T11:50:24Z
dc.date.issued: 2017
dc.description.abstract: Authority files maintain the variant forms used to refer to the same entity, and they are very useful in digital libraries. However, collecting data and keeping an authority file up to date is not a trivial task. This paper proposes an approach for enriching a publication venue authority file by extracting information on conferences from their web pages. Collecting additional data is important for improving the effectiveness of data disambiguation and information retrieval tools, such as those that measure the quality of a scientific publication based on bibliometrics (e.g., Journal Impact Factor). Most applications use only basic citation metadata, such as authors' names, work titles and publication venue titles. However, data external to the publication, contained in the publication venue web page, can be very useful in the disambiguation task. Our approach includes steps for querying a web search engine, classifying the documents obtained in the result sets and extracting information from the relevant pages. We evaluated two methods for classifying documents, one based on genre and content and one based on content only. The experiments show good results in tracing the history of conference editions, with data such as the URL and year of each edition and the dates on which conference names changed. [pt_BR]
dc.description.provenance: Submitted by André Calsavara (andre.calsavara@biblioteca.ufla.br) on 2018-07-17T12:50:53Z No. of bitstreams: 0 [en]
dc.description.provenance: Approved for entry into archive by André Calsavara (andre.calsavara@biblioteca.ufla.br) on 2018-07-27T11:50:24Z (GMT) No. of bitstreams: 0 [en]
dc.description.provenance: Made available in DSpace on 2018-07-27T11:50:24Z (GMT). No. of bitstreams: 0 Previous issue date: 2017 [en]
dc.identifier.citation: JESUS, H. A. de; PEREIRA, D. A. Enriching an authority file of scientific conferences with information extracted from the web. Journal of Computer Science, [S. l.], v. 13, n. 4, p. 68-77, 2017. [pt_BR]
dc.identifier.uri: https://repositorio.ufla.br/handle/1/29781
dc.identifier.uri: http://thescipub.com/abstract/10.3844/jcssp.2017.68.77 [pt_BR]
dc.language: en_US [pt_BR]
dc.publisher: Science Publications [pt_BR]
dc.rights: restrictAccess [pt_BR]
dc.source: Journal of Computer Science [pt_BR]
dc.subject: Authority file [pt_BR]
dc.subject: Publication venue [pt_BR]
dc.subject: Web search engine [pt_BR]
dc.subject: Information extraction [pt_BR]
dc.subject: Arquivo de autoridade [pt_BR]
dc.subject: Local de publicação [pt_BR]
dc.subject: Motor de busca [pt_BR]
dc.subject: Extração de informações [pt_BR]
dc.title: Enriching an authority file of scientific conferences with information extracted from the web [pt_BR]
dc.type: Artigo [pt_BR]

Files

License bundle

Name: license.txt
Size: 953 B
Format: Item-specific license agreed upon to submission