DennisGorelik
Registered Senior Member
G71 said: Only 5 WordIDs (4 real words) per phrase? How about keeping just a single WordID field in the Phrase table and adding an extra unsigned int sequence number field to track the order, so you would have N rows (/WordIDs) per phrase with the same PhraseID and a unique sequence number?
I thought about composing <a href="http://www.dennisgorelik.com/ai/Phrase.htm">phrases</a> out of N words.
But this is wrong for two reasons.
Reason #1:
Long phrases are not typical: they are used very rarely, so they have only weak associations with other concepts.
That's why it makes no sense to remember long phrases.
Humans also don't remember long phrases.
Typical phrases consist of 2-3 words. 4- and 5-word phrases are relatively rare. So I think that phrase length should be restricted to 3 or 4 <a href="http://www.dennisgorelik.com/ai/Word.htm">words</a>.
(Not extended to N words as you proposed).
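To illustrate Reason #1, here is a minimal Python sketch that counts how many word sequences of each length occur more than once in a piece of text. The sample sentence and the helper name ngram_counts are made up for this illustration; a toy sentence is not evidence about real language, it only shows the kind of count I mean.

```python
from collections import Counter

def ngram_counts(words, n):
    """Count every run of n consecutive words (hypothetical helper)."""
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

# Toy input, invented for this example.
text = "the cat sat on the mat and the cat sat on the rug"
words = text.split()

# For each phrase length, count the distinct phrases seen more than once.
repeated = {
    n: sum(1 for count in ngram_counts(words, n).values() if count > 1)
    for n in (2, 3, 4, 5)
}

for n, how_many in repeated.items():
    print(f"{n}-word phrases seen more than once: {how_many}")
```

In this toy sentence the number of repeated phrases falls as the phrase gets longer. On a real corpus the exact numbers would differ, but that is the effect I am relying on.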
Reason #2:
From a database performance point of view, it's better to keep a phrase in one record (PhraseId, WordId1, WordId2, WordId3) instead of spreading it across multiple records:
(
PhraseId, 1, WordId1
PhraseId, 2, WordId2
PhraseId, 3, WordId3
)
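The two layouts can be compared with a small sketch using Python's built-in sqlite3 module. The table names PhraseWide and PhraseWord, the Seq column, and the sample word IDs are assumptions made for this example, not my actual schema. It does not measure performance; it only shows that reading a phrase touches one row in the first layout and one row per word in the second.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Layout A: the whole phrase in one record (fixed number of word slots).
cur.execute("""
    CREATE TABLE PhraseWide (
        PhraseId INTEGER PRIMARY KEY,
        WordId1  INTEGER NOT NULL,
        WordId2  INTEGER NOT NULL,
        WordId3  INTEGER            -- NULL for 2-word phrases
    )
""")

# Layout B: one record per word, ordered by a sequence number.
cur.execute("""
    CREATE TABLE PhraseWord (
        PhraseId INTEGER NOT NULL,
        Seq      INTEGER NOT NULL,
        WordId   INTEGER NOT NULL,
        PRIMARY KEY (PhraseId, Seq)
    )
""")

word_ids = [10, 20, 30]  # made-up word IDs for one 3-word phrase

cur.execute("INSERT INTO PhraseWide VALUES (1, ?, ?, ?)", word_ids)
cur.executemany(
    "INSERT INTO PhraseWord VALUES (1, ?, ?)",
    list(enumerate(word_ids, start=1)),  # (Seq, WordId) pairs
)

# Layout A: a single-row lookup by primary key.
wide = cur.execute(
    "SELECT WordId1, WordId2, WordId3 FROM PhraseWide WHERE PhraseId = 1"
).fetchone()

# Layout B: several rows that must be fetched and put in order.
narrow = [
    row[0]
    for row in cur.execute(
        "SELECT WordId FROM PhraseWord WHERE PhraseId = 1 ORDER BY Seq"
    )
]

wide_rows = cur.execute("SELECT COUNT(*) FROM PhraseWide").fetchone()[0]
narrow_rows = cur.execute("SELECT COUNT(*) FROM PhraseWord").fetchone()[0]
print(wide, wide_rows)      # one record holds the phrase
print(narrow, narrow_rows)  # three records hold the same phrase
```

The price of the one-record layout is a fixed maximum phrase length, which is acceptable here because of Reason #1.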