Abstract:
The degree of noun bias differs between children. This project aimed to
uncover the differences in word knowledge between children
with a weak degree of noun bias and children with a strong degree of noun
bias. Using data from Wordbank, which contains parent-completed
reports of the Communicative Development Inventory (CDI), separate
random forest models and decision trees were created for four
age groups: 16 to 19 months, 20 to 23 months, 24 to 27 months and 28
to 30 months. The random forests were used to see which words were
important for classifying whether a child has a weak or strong degree of noun
bias. It was found that the classification worked quite well, that all of
the important nouns were indicative of a strong degree of noun bias, and that
all of the important verbs were indicative of a weak degree of noun bias. The decision
trees were simple and achieved high accuracy when trained and tested on
the Wordbank data, but did not achieve high accuracy when tested on
CHILDES data, which is collected during interactions with
children. This suggests that there is a relationship between the words a
child knows and the degree of noun bias when the data comes from CDI forms.