AUTOMATIC AND INCREMENTAL KEYWORD EXPANSION FROM SOCIAL MEDIA DATA IN DISASTER
When a disaster occurs, social media data is an essential resource for the relevant authorities in decision making, rescue, and replenishment work. Because people have varied ways and habits of expressing themselves about special or day-to-day events, many researchers have studied how to extract commonly used or disaster-related keywords from social media data. However, existing approaches require specific expertise to improve their performance, and some cannot keep their results flexible as conditions change over time. In particular, most studies have focused on academic accomplishment rather than practical application. In this paper, we propose an Automatic Disaster Keyword Expansion (ADKE) framework that collects and analyses tweets in real time to extract localized or general keywords related to disasters. It consists of three modules: collecting tweets in real time via Twitter Stream APIs (TSA), identifying localized information using Named Entity Recognition (NER), and topic modelling with Latent Dirichlet Allocation (LDA). Because it produces localized keywords, the proposed framework can be applied to disaster management systems. Several evaluations on an existing tweet dataset about Tropical Cyclone Oswald (Australia, 2013) showed that the proposed ADKE outperforms the other approaches (i.e., pure NER and pure LDA) without human or expert intervention. We also found that combining named entities and keywords in the queries used to collect tweets via TSA affects the performance of the statistical model of pure LDA.
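The incremental expansion loop described above can be illustrated with a minimal sketch. It is not the ADKE implementation: the NER module is replaced by a crude capitalized-token heuristic, the LDA module is replaced by simple term-frequency ranking, and the Twitter stream is replaced by in-memory batches of strings. All function names, the stopword list, and the parameter k are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "in", "on", "at", "of", "to",
             "is", "are", "and", "for", "near", "from"}


def tokenize(text):
    """Split a tweet into alphabetic tokens, keeping original case."""
    return re.findall(r"[A-Za-z]+", text)


def extract_entities(text):
    """Stand-in for NER: capitalized tokens that are not sentence-initial."""
    tokens = tokenize(text)
    return {t.lower() for t in tokens[1:] if t[0].isupper()}


def top_terms(tweets, k):
    """Stand-in for LDA: the k most frequent non-stopword terms."""
    counts = Counter(
        t.lower()
        for tweet in tweets
        for t in tokenize(tweet)
        if t.lower() not in STOPWORDS
    )
    return [term for term, _ in counts.most_common(k)]


def expand_keywords(seed, batches, k=3):
    """Grow the keyword set batch by batch.

    Each batch is filtered with the current keywords (standing in for the
    streaming query); entities and top terms found in the matching tweets
    are then added to the set used for the next batch.
    """
    keywords = set(seed)
    for batch in batches:
        matched = [
            tweet for tweet in batch
            if keywords & {w.lower() for w in tokenize(tweet)}
        ]
        if not matched:
            continue
        for tweet in matched:
            keywords |= extract_entities(tweet)
        keywords |= set(top_terms(matched, k))
    return keywords


if __name__ == "__main__":
    batches = [
        ["Heavy flood in Bundaberg tonight", "Lovely weather today"],
        ["Bundaberg residents evacuated", "Coffee time"],
    ]
    print(sorted(expand_keywords({"flood"}, batches)))
```

In this toy run the second batch contains no seed keyword, yet its first tweet is still collected because the place name was added to the query set after the first batch. That is the intended effect of feeding named entities back into the collection step.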