"Draw My Topics": Find Desired Topics fast from large scale of Corpus

2016-02-03
1602.01428 | cs.CL
We develop the "Draw My Topics" toolkit, which provides a fast way to incorporate social scientists' interests into standard topic modelling. Instead of taking a raw corpus with only primitive preprocessing as input, the toolkit uses an algorithm based on the Vector Space Model and conditional entropy to connect what social scientists want to find with the output of unsupervised topic models. It also leaves room for users to adjust the process for the specific corpus they are interested in. We demonstrate the toolkit's use on the Diachronic People's Daily Corpus in Chinese.
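The abstract names two ingredients, a Vector Space Model and conditional entropy, without spelling out how the toolkit combines them. As a rough illustration of those two building blocks only (not the authors' algorithm), the sketch below scores toy documents against user-supplied seed words with cosine similarity and computes the conditional entropy H(topic | word) from joint counts. The seed words, documents, topic labels, and function names are all hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def conditional_entropy(joint):
    """H(Y | X) in bits, given joint counts {(x, y): count}."""
    total = sum(joint.values())
    x_totals = Counter()
    for (x, _), count in joint.items():
        x_totals[x] += count
    h = 0.0
    for (x, _), count in joint.items():
        if count:  # zero-count cells contribute nothing
            h -= (count / total) * math.log2(count / x_totals[x])
    return h


# Hypothetical seed words expressing a user's interest.
seed = Counter(["economy", "reform"])

# Hypothetical toy documents as bags of words.
docs = {
    "d1": Counter("economy reform market reform".split()),
    "d2": Counter("football match score".split()),
}
scores = {name: cosine(seed, vec) for name, vec in docs.items()}

# Hypothetical word/topic co-occurrence counts.
joint = {
    ("reform", "topic_a"): 4, ("reform", "topic_b"): 0,
    ("market", "topic_a"): 2, ("market", "topic_b"): 2,
}
h = conditional_entropy(joint)
```

Here `d1` shares vocabulary with the seed words and scores well above `d2`, which shares none. In the toy counts, "reform" always appears with one topic while "market" is split evenly, so only "market" contributes uncertainty to H(topic | word). How the actual toolkit weights or thresholds such quantities is not stated in the abstract.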

Related Articles

2016-12-20

We present a feature vector formation technique for documents - Sparse Composite Document Vector (SC…

2018-09-04

Neural conversation models tend to generate safe, generic responses for most inputs. This is due to …

2015-07-20
1507.05523 | cs.CL

We analyze three critical components of word embedding training: the model, the corpus, and the trai…

2016-05-06
1605.02019 | cs.CL

Distributed dense word vectors have been shown to be effective at capturing token-level semantic and…

2016-09-27

Inferring topics from the overwhelming amount of short texts becomes a critical but challenging task…

2018-11-07
1811.02820 | cs.IR

In our work, we propose to represent HTM as a set of flat models, or layers, and a set of topical hi…

2016-06-21

We consider incorporating topic information into the sequence-to-sequence framework to generate info…

2014-12-17
1412.5404 | cs.CL

The short text has been the prevalent format for information of Internet in recent decades, especial…

2019-04-11

In traditional Distributional Semantic Models (DSMs) the multiple senses of a polysemous word are co…

2016-04-30
1605.00090 | cs.CL

We consider incorporating topic information into message-response matching to boost responses with r…

2019-04-13
1904.06483 | cs.IR

We introduce Topic Grouper as a complementary approach in the field of probabilistic topic modeling.…

2017-11-12

Topic modeling is one of the most powerful techniques in text mining for data mining, latent data di…

2017-09-19
1709.06365 | cs.CL

Besides the text content, documents and their associated words usually come with rich sets of meta i…

2016-06-02

A popular approach to topic modeling involves extracting co-occurring n-grams of a corpus into seman…

2019-04-20
1904.09442 | cs.CL

The author-specific word usage is a vital feature to let readers perceive the writing style of the a…

2019-04-15

In the Humanities and Social Sciences, there is increasing interest in approaches to information ext…