The Most Common Subject Words In This Forum
Over 8900 thread titles were extracted from the entire development forum. Each title is split into words on spaces. Only alphanumeric characters are kept in each word, and all uppercase letters are converted to lowercase. Each word's count starts at zero on its first occurrence rather than at one, so each count shown to the right of a word is really one higher (+1).
1. to * 718
2. in * 649
3. os * 616
4. and * 614
5. a * 562
6. kernel * 491
7. the * 449
8. with * 442
9. problem * 398
10. help * 391
11. c * 367
12. how * 364
13. memory * 318
14. mode * 301
15. i * 281
16. for * 267
17. question * 236
18. of * 235
19. on * 213
20. bochs * 195
21. my * 191
22. floppy * 188
23. paging * 184
24. driver * 181
25. pmode * 177
26. about * 176
27. is * 175
28. code * 168
29. system * 166
30. grub * 164
31. from * 160
32. what * 159
33. problems * 152
34. an * 141
35. file * 141
36. keyboard * 133
37. not * 130
38. need * 127
39. gcc * 124
40. new * 120
41. do * 118
42. interrupt * 116
43. can * 116
44. questions * 115
45. idt * 113
46. boot * 113
47. error * 112
48. stack * 107
49. multitasking * 107
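For anyone who wants to reproduce this kind of list, here is a minimal Python sketch of the counting steps described at the top (split on spaces, keep only alphanumeric characters, lowercase, count). It is not the program that produced the numbers above. The function name and the sample titles are made up for illustration, and it counts from one, so its totals would come out one higher than the zero-based counts in the list.

```python
import re
from collections import Counter


def count_title_words(titles):
    """Count words across thread titles.

    Each title is split on spaces, non-alphanumeric characters are
    dropped from every word, and the result is lowercased.
    """
    counts = Counter()
    for title in titles:
        for raw in title.split(" "):
            word = re.sub(r"[^0-9A-Za-z]", "", raw).lower()
            if word:  # skip pieces that were only punctuation
                counts[word] += 1
    return counts


# Hypothetical sample titles, not taken from the forum data.
titles = ["Help with paging", "Paging problem in my kernel!", "Kernel help"]
counts = count_title_words(titles)

# Print in the same "rank. word * count" layout as the list above.
for rank, (word, n) in enumerate(counts.most_common(3), start=1):
    print(f"{rank}. {word} * {n}")
```

Using `Counter.most_common` keeps the ranking step to one call; a plain dict plus `sorted` would work just as well.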
- Kevin McGuire
- Member
- Posts: 843
- Joined: Tue Nov 09, 2004 12:00 am
- Location: United States
- Contact:
- AndrewAPrice
- Member
- Posts: 2300
- Joined: Mon Jun 05, 2006 11:00 pm
- Location: USA (and Australia)
forumdown
I will give it a try. It actually seems a little more complicated than you would think at first, but I am confident that it is possible.
I got an initial tool written: a program, forumdown, which downloads an entire subforum and stores the linked-list structures of threads and posts in a local data file that can be loaded later.
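To give a rough idea of what storing threads and posts in a local data file can look like, here is a small Python sketch. It is not forumdown's actual code or file format. The class names, the field names, and the use of JSON are all assumptions, and plain Python lists stand in for the linked lists mentioned above.

```python
import json
import os
import tempfile
from dataclasses import dataclass, field, asdict


@dataclass
class Post:
    author: str  # hypothetical field
    body: str    # hypothetical field


@dataclass
class Thread:
    title: str
    posts: list = field(default_factory=list)  # list of Post


def save_forum(threads, path):
    """Write all threads and their posts to one JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump([asdict(t) for t in threads], f)


def load_forum(path):
    """Rebuild the Thread and Post objects from the JSON file."""
    with open(path, encoding="utf-8") as f:
        raw = json.load(f)
    return [Thread(t["title"], [Post(**p) for p in t["posts"]]) for t in raw]


# Round trip with made-up data.
threads = [Thread("Paging problem", [Post("alice", "My kernel triple faults.")])]
path = os.path.join(tempfile.gettempdir(), "forum_sketch.json")
save_forum(threads, path)
loaded = load_forum(path)
print(loaded[0].title, len(loaded[0].posts))
```

Keeping the download step separate from the saved file means the analysis can be rerun without fetching the forum again.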
I did a little thinking and concluded that I can use a website that provides a dictionary, thesaurus, and encyclopedia. That would allow some degree of spell checking and mapping of equivalent terms, such as IDT and Interrupt Descriptor Table, plus some primitive comprehension of sentences, to get an idea of what people are actually talking about in the posts.
I will try to use this site to provide the English word database, and add a cache to keep lookups from taking an excessive amount of time.
http://www.reference.com/browse/
http://kmcguire.jouleos.galekus.com/dok ... orum_tools
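The caching idea and the IDT / Interrupt Descriptor Table mapping could look roughly like the Python sketch below. Everything in it is hypothetical: `fake_fetch` stands in for a request to the dictionary site (no network access is made, and the site's real interface is not assumed), and the alias table holds only the one example from this post.

```python
class CachedLookup:
    """Wrap a slow word lookup so each word is fetched only once."""

    def __init__(self, fetch):
        self.fetch = fetch  # hypothetical function: word -> definition
        self.cache = {}
        self.misses = 0     # how many times the slow lookup really ran

    def __call__(self, word):
        key = word.lower()
        if key not in self.cache:
            self.misses += 1
            self.cache[key] = self.fetch(key)
        return self.cache[key]


# Hypothetical alias table mapping a long form onto one canonical term.
ALIASES = {"interrupt descriptor table": "idt"}


def canonical(phrase):
    """Return the canonical lowercase term for a word or phrase."""
    key = phrase.lower()
    return ALIASES.get(key, key)


def fake_fetch(word):
    # Stand-in for the real dictionary request.
    return f"definition of {word}"


lookup = CachedLookup(fake_fetch)
lookup("Kernel")
lookup("kernel")  # served from the cache, no second fetch
print(lookup.misses, canonical("Interrupt Descriptor Table"))
```

Lowercasing the key before the cache check means "Kernel" and "kernel" share one entry, which matches how the title words were normalized for the count.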
Let's see if I can get the other part working.