The variables used most often during the project:
Variable | Description |
---|---|
tweetid | tweet identification number (unique per tweet) |
userid | user identification number (hashed if the account has fewer than 5k followers) |
user_profile_description | profile description at the time of suspension |
account_language | user-specified account language |
tweet_time | time the tweet was posted, in UTC |
tweet_text | text of the tweet |
hashtags | a list of hashtags used in a tweet |
retweet_userid | for retweets, the userid of the account that authored the original tweet |
user_mentions | a list of userids that were mentioned in the tweet |
likes_count | number of likes the tweet received |
and many more...
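
To illustrate how the columns above might be read, here is a minimal sketch using only the Python standard library. It rests on assumptions that this document does not confirm: the data is a CSV file with a header row matching the variable names, `tweet_time` is formatted as `YYYY-MM-DD HH:MM`, and list-valued fields such as `hashtags` are stored as a bracketed, comma-separated string. The helper names `parse_list_field` and `read_tweets` and the sample rows are hypothetical.

```python
import csv
import io
from datetime import datetime, timezone

# Columns from the table above that this sketch keeps.
COLUMNS = ["tweetid", "userid", "tweet_time", "tweet_text", "hashtags", "likes_count"]


def parse_list_field(raw):
    """Turn a string such as "[a, b]" into ["a", "b"] (assumed encoding)."""
    inner = raw.strip().strip("[]")
    return [item.strip().strip("'\"") for item in inner.split(",") if item.strip()]


def read_tweets(handle, time_format="%Y-%m-%d %H:%M"):
    """Yield one dict per tweet with typed values for the selected columns."""
    for row in csv.DictReader(handle):
        record = {name: row[name] for name in COLUMNS}
        # tweet_time is documented as UTC, so attach that timezone explicitly.
        parsed = datetime.strptime(row["tweet_time"], time_format)
        record["tweet_time"] = parsed.replace(tzinfo=timezone.utc)
        record["hashtags"] = parse_list_field(row["hashtags"])
        # Treat an empty likes_count cell as zero (assumption).
        record["likes_count"] = int(row["likes_count"] or 0)
        yield record


# Tiny in-memory sample standing in for the real file (made-up values).
sample = io.StringIO(
    "tweetid,userid,tweet_time,tweet_text,hashtags,likes_count\n"
    '1,abc,2017-03-01 12:30,hello,"[news, politics]",3\n'
    "2,def,2017-03-02 08:00,hi,[],0\n"
)
tweets = list(read_tweets(sample))
```

If the real files use a different timestamp format or list encoding, only `time_format` and `parse_list_field` should need to change.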
Log-log plot of the like distribution of all the tweets in the dataset.
Log-log plot of the retweet distribution of all the tweets in the dataset.
Log-log plot of the follower distribution of all the accounts in the dataset.
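
Distributions like these could be produced along the following lines. This is a sketch under stated assumptions, not necessarily how the plots above were made: it counts how many items share each value, drops zeros because a logarithmic axis cannot show them, and plots the counts with matplotlib. The function names `count_distribution` and `plot_loglog` and the sample values are hypothetical.

```python
from collections import Counter


def count_distribution(values):
    """Return sorted (value, number of items with that value) pairs."""
    # Zeros are skipped because they cannot be placed on a log axis (assumption
    # about how the plots handle them).
    counts = Counter(v for v in values if v > 0)
    return sorted(counts.items())


def plot_loglog(values, xlabel, ylabel, path):
    """Save a log-log scatter plot of the distribution (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported here so counting works without it

    xs, ys = zip(*count_distribution(values))
    fig, ax = plt.subplots()
    ax.loglog(xs, ys, marker="o", linestyle="none")
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig.savefig(path)
    plt.close(fig)


likes = [0, 1, 1, 2, 1, 10, 2, 0]  # made-up likes_count values
distribution = count_distribution(likes)
```

A call such as `plot_loglog(likes, "likes per tweet", "number of tweets", "likes.png")` would then write the figure to disk; the same function applies to retweet or follower counts.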