Word-distributions, the per-document topic-distribution, or, in the case of CorrLDA2, the per-topic aspect-distribution introduced in Section 3.3, are modeled by the multinomial distribution.
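As a minimal sketch of what modeling these quantities with multinomial distributions means in practice, the snippet below draws a toy document by first sampling a topic from a per-document topic-distribution and then sampling a word from that topic's word-distribution. The topic names, vocabulary, probabilities, and the helper `sample_document` are hypothetical illustrations, not parameters or code from the paper, and the aspect-distribution of CorrLDA2 is left out.

```python
import random


def sample_document(topic_dist, word_dists, length, rng):
    """Draw `length` words: choose a topic from the per-document
    multinomial, then a word from that topic's multinomial."""
    topics = list(topic_dist)
    topic_weights = [topic_dist[t] for t in topics]
    words = []
    for _ in range(length):
        topic = rng.choices(topics, weights=topic_weights)[0]
        vocab = list(word_dists[topic])
        word_weights = [word_dists[topic][w] for w in vocab]
        words.append(rng.choices(vocab, weights=word_weights)[0])
    return words


# Hypothetical toy parameters (each distribution sums to 1).
word_dists = {
    "economy": {"tax": 0.5, "budget": 0.3, "jobs": 0.2},
    "health": {"care": 0.6, "insurance": 0.4},
}
topic_dist = {"economy": 0.7, "health": 0.3}

rng = random.Random(0)  # fixed seed so the draw is repeatable
doc = sample_document(topic_dist, word_dists, length=8, rng=rng)
print(doc)
```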
This is done by leveraging the magnitudes and signs of the feature weights of a linear SVM, allowing us to show quantitatively that the inferred topic-aspect relations form consistently correct topic-viewpoint associations.
We improve the vocabulary partition used in previous related work to represent bimodal data of topical words and opinion words, resulting in more consistently correct topic-viewpoint associations.
Although they are not directly related to the focus of modeling viewpoints in this paper, certain features employed in these models could be useful, such as having separate word-distributions for global topics and local aspects.
Our goal is to learn both the topics and the viewpoints expressed in a corpus of text documents about current events, in a general and completely unsupervised setting.
As a more informed model, we filter the words in the feature set to find the most locationally appropriate terms, in order to reduce noise and computational effort, but above all, in keeping with our hypothesis that the locational signal is present in only part of the texts.
After some initial parameter exploration as shown in Figure 3, we settle on three Gaussian functions as a reasonable model: words with more than three distributional peaks are likely to be of less utility for locating texts.
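To make the idea of a word with up to three distributional peaks concrete, here is a small sketch of a three-component Gaussian mixture over (longitude, latitude). It assumes isotropic components and treats degrees as a flat plane, both simplifications chosen for illustration; the component weights, means, and spreads are hypothetical, and no fitting procedure is shown.

```python
import math


def gaussian_2d(point, mean, sigma):
    """Isotropic 2-D Gaussian density (assumed: same variance on both axes)."""
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    norm = 2.0 * math.pi * sigma * sigma
    return math.exp(-(dx * dx + dy * dy) / (2.0 * sigma * sigma)) / norm


def mixture_density(point, components):
    """Weighted sum of the component densities at `point`."""
    return sum(w * gaussian_2d(point, mean, sigma)
               for w, mean, sigma in components)


# Hypothetical word with three peaks: (weight, (lon, lat), sigma in degrees).
components = [
    (0.5, (-0.1, 51.5), 1.0),
    (0.3, (-74.0, 40.7), 1.0),
    (0.2, (151.2, -33.9), 1.0),
]

near_peak = mixture_density((-0.1, 51.5), components)
far_away = mixture_density((100.0, 0.0), components)
print(near_peak > far_away)
```

A word whose mass is spread over many such peaks gives a flatter density everywhere, which is one way to read the remark that words with more than three peaks are less useful for locating texts.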
In these experiments we compare using a list of known places with a model that aggregates the locational information provided by words (and potentially other linguistic items such as constructions) trained on longitude and latitude, either by letting the words vote for a place or by averaging the information on a word-by-word basis.
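The two aggregation strategies can be contrasted with a short sketch. It assumes each word has already been assigned a single (latitude, longitude) estimate, that a vote means choosing the nearest entry in a small list of candidate places, and that distances are measured naively in squared degrees. The place names, coordinates, and function names are hypothetical and do not come from the experiments described.

```python
from collections import Counter


def nearest_place(coord, places):
    """Name of the place closest to `coord` in squared degrees."""
    return min(
        places,
        key=lambda name: (places[name][0] - coord[0]) ** 2
        + (places[name][1] - coord[1]) ** 2,
    )


def locate_by_vote(words, word_coords, places):
    """Each located word votes for its nearest place; the majority wins."""
    votes = Counter(
        nearest_place(word_coords[w], places) for w in words if w in word_coords
    )
    return votes.most_common(1)[0][0] if votes else None


def locate_by_average(words, word_coords):
    """Average the per-word coordinates into one (lat, lon) estimate."""
    coords = [word_coords[w] for w in words if w in word_coords]
    if not coords:
        return None
    lat = sum(c[0] for c in coords) / len(coords)
    lon = sum(c[1] for c in coords) / len(coords)
    return (lat, lon)


# Hypothetical toy data: (lat, lon) in degrees.
places = {"Stockholm": (59.3, 18.1), "Gothenburg": (57.7, 12.0)}
word_coords = {
    "archipelago": (59.4, 18.5),
    "ferry": (59.0, 18.0),
    "harbour": (57.7, 11.9),
}
text = ["the", "ferry", "left", "the", "archipelago", "harbour"]

voted = locate_by_vote(text, word_coords, places)
averaged = locate_by_average(text, word_coords)
print(voted, averaged)
```

In this toy case the vote returns a named place, while the average returns a point lying between the two places. The plain mean of longitudes would also need special handling near the antimeridian, which the sketch ignores.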
Both tasks, identifying the location of an author and identifying that of a text, have been addressed by recent experiments with various points of departure: knowledge-based approaches that make use of recorded points of interest in a location, modelling the geographic distribution of topics, or using social network analysis to find additional information about the author.
We find that modelling word distributions to account for several locations, and thus several Gaussian distributions per word, defining a filter which picks out words with high placeness based on their local distributional context, and aggregating locational information in a centroid for each text gives the most useful results.