McAnulty College and Graduate School of Liberal Arts
Frequency, Gender, Java, Tweets, Twitter
In 2011, Internet users spent almost 23% of their time on social media sites such as Twitter and Facebook. Twitter alone was estimated to have over 200 million active users. Because social media is such a popular online pastime, the posts that users publish make a tremendous amount of information available. This information has the potential to reveal details about social media users, such as the relationship between users' characteristics and what they post. This relationship is an active research topic, and one of the most frequently studied characteristics is the gender of a user. Feature frequency is often included in gender classification, but this thesis shows that for Twitter tweets it either does not contribute significantly to classification or hinders it.
Kroft, A. (2013). The Insignificance of Feature Frequency in Classifying Gender of Twitter Tweets (Master's thesis, Duquesne University). Retrieved from https://dsc.duq.edu/etd/781