text mining - Linking related topics (IR)


How can I link terms (keywords, entities) that are related to each other across text documents? An example is Google: when you search for a person, it shows recommendations of other people related to that person.


In the picture, Google has figured out relations such as spouse, presidential candidate, and equal designation.

I am using a frequency count technique: the more often two terms occur in the same document, the higher the chance that they are related. However, this also links unrelated terms such as page marks, verbs, and page references in the text documents.
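A minimal sketch of the frequency count approach described above, assuming whitespace tokenization and counting each pair once per document. The function name `cooccurrence_counts` and the sample documents are made up for illustration and are not from the original setup.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, for every pair of terms, how many documents contain both."""
    counts = Counter()
    for text in documents:
        # Use a set so that a pair is counted at most once per document;
        # sorting gives every pair a canonical (alphabetical) order.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical sample documents.
docs = [
    "obama met clinton in washington",
    "obama and michelle visited chicago",
    "clinton and obama debated",
]
pairs = cooccurrence_counts(docs)
print(pairs[("clinton", "obama")])
print(pairs[("and", "obama")])
```

In this toy data the pair ("and", "obama") scores exactly as high as ("clinton", "obama"), which is the problem described above: raw counts cannot tell a meaningful link from a frequent filler word.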

How should I improve this? Is there another easy and reliable technique?

You should try a few techniques:

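1.) Stop word filtering: it is common in text mining to filter out words that are typically not important because they are too frequent, such as "the", "a", "is", "on". There are predefined dictionaries for this.

2.) TF/IDF: TF/IDF re-weights words based on how well they separate documents. A word that appears in almost every document gets a low weight, while a word concentrated in a few documents gets a high one.

3.) Named entity recognition: for the task at hand it might be sufficient to focus on names. Named entity recognition can extract names from documents.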

4.) Latent Dirichlet allocation: LDA finds concepts (topics) in documents. A concept is a set of words that often appear together.
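The first two items can be sketched with the standard library alone. This is only an illustration under stated assumptions: the stop word set is a tiny hand-made list (real projects use a predefined dictionary), the weighting is the plain tf * log(N / df) variant, and the names `STOP_WORDS`, `tokenize`, and `tf_idf` as well as the sample documents are hypothetical.

```python
import math
from collections import Counter

# Illustrative stop word list only; a real list is much longer.
STOP_WORDS = {"the", "a", "is", "on", "and", "in", "of"}


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(d) for d in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        weights.append({
            # Term frequency times inverse document frequency.
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in term_counts.items()
        })
    return weights


# Hypothetical sample documents.
docs = [
    "the page is on obama and clinton",
    "the page is on obama and michelle",
    "the page is on the weather",
]
weights = tf_idf(docs)
for term, weight in sorted(weights[0].items()):
    print(term, round(weight, 3))
```

Here "page" occurs in every document, so its weight is zero, and "clinton" outweighs "obama" in the first document because it is rarer across the collection. Using such weights instead of raw counts when scoring term pairs should push page marks and similar noise toward the bottom of the ranking.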

