text mining - Linking related topics (IR)


How can I link terms (keywords, entities) that are related to each other across text documents? An example is Google: when you search for a person, it shows recommendations of other people related to that person.

[image: Google's recommendations of people related to the searched person]

In the picture, it has figured out relations such as spouse, presidential candidate, and equal designation.

I am using a frequency-count technique: the more often two terms occur in the same document, the higher the chance that they are related. However, it also links unrelated terms such as page marks, verbs, and page references in the text document.
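The frequency-count approach described above might look roughly like the following minimal sketch. The function name `cooccurrence_counts` and the sample documents are hypothetical, and whitespace tokenization is an assumption; the original code is not shown in the question.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(documents):
    """Count, for every pair of terms, how many documents contain both."""
    counts = Counter()
    for text in documents:
        # A set ensures each pair is counted at most once per document;
        # sorting gives every pair a canonical (a, b) order.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts

# Hypothetical sample documents, for illustration only.
docs = [
    "obama michelle page 12",
    "obama michelle visited",
    "obama romney page 12",
]
pairs = cooccurrence_counts(docs)
```

With this sample, the pair `("michelle", "obama")` is counted twice, but so is `("12", "page")`, which shows how page references end up linked just as strongly as real entities.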

How should I improve this? Is there another easy and reliable technique?

You should try a few techniques:

1.) Stop word filtering: it is common in text mining to filter out words that are typically not important because they are too frequent, e.g. the, a, is, on. There are predefined dictionaries for this.

2.) TF/IDF: TF/IDF re-weights words by how well they separate documents.

3.) Named entity recognition: for the task at hand it might be sufficient to focus on names. Named entity recognition can extract names from documents.

4.) Latent Dirichlet allocation: LDA finds concepts in documents. A concept is a set of words that often appear together.
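As a rough illustration of the first two techniques, here is a minimal standard-library sketch. The stop word list, the function names, and the sample documents are all assumptions made for the example, and the weighting uses the plain idf(t) = log(N / df(t)) variant; libraries often apply smoothed versions. Named entity recognition and LDA need trained models or a dedicated library, so they are not sketched here.

```python
import math
from collections import Counter

# Tiny hand-made stop word list (an assumption for this sketch);
# in practice a predefined dictionary would be used.
STOP_WORDS = {"the", "a", "is", "on", "of", "and", "in", "to"}

def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]

def idf_weights(documents):
    """Return idf(t) = log(N / df(t)) for every term t in the corpus."""
    n_docs = len(documents)
    doc_freq = Counter()
    for text in documents:
        # Count each term once per document.
        doc_freq.update(set(tokenize(text)))
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}

def tfidf(text, idf):
    """TF/IDF weights of one document as a dict: term -> tf * idf."""
    tf = Counter(tokenize(text))
    return {term: count * idf.get(term, 0.0) for term, count in tf.items()}

# Hypothetical sample documents, for illustration only.
docs = [
    "the president is on page 3",
    "the president met the senator on page 7",
    "a senator is on page 9",
]
idf = idf_weights(docs)
weights = tfidf(docs[1], idf)
```

In this sample, "page" occurs in every document, so its weight drops to zero, while the stop words never reach the weighting step at all. Co-occurrence counts restricted to highly weighted terms should therefore contain fewer of the spurious links caused by page references.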

