Detailed Twitter info in JSON: an undocumented feature

Problem
I wanted to find out, from a script, how many followers I have on Twitter. Here is my (mostly abandoned) Twitter page: https://twitter.com/szathmar . I didn’t want to use the API, since that would mean registering for an API key, so I took the easy way: scrape the necessary data out of the page :) Digging through the HTML source, I found the number of followers, but I also found a hidden treasure!

Solution
The hidden treasure is a long JSON string that contains all kinds of information about a Twitter user:

[Screenshot: extract of the hidden JSON string in the page source]

The screenshot shows only an extract; the JSON string is much longer. Fine, let’s get it!

#!/usr/bin/env python3
# coding: utf-8

import json
import readline  # imported only to enable line editing in input()
import sys

import requests
from bs4 import BeautifulSoup


def main():
    url = input("Full twitter URL: ")
    html = requests.get(url).text
    soup = BeautifulSoup(html, "lxml")

    # The profile data is stored in the value attribute of an
    # <input class="json-data"> tag.
    tag = soup.find('input', {'class': 'json-data'})
    if tag is None:
        sys.exit("Error: no json-data input tag found on the page.")

    d = json.loads(tag['value'])
    print(json.dumps(d, indent=4))

    # followers = d['profile_user']['followers_count']
    # print(followers)

##############################################################################

if __name__ == "__main__":
    main()

If you want the number of followers, for instance, uncomment the two commented-out lines at the end of main().
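
Since this is an undocumented feature, the keys may change or disappear at any time. Here is a minimal sketch of a more defensive lookup, assuming the parsed dictionary has the same `profile_user` / `followers_count` layout as in the script. The helper name `get_followers` and the tiny `sample` dictionary are made up for illustration:

```python
def get_followers(profile):
    """Return the follower count, or None if the expected keys are missing."""
    # 'profile_user' and 'followers_count' are the keys used in the script;
    # .get() avoids a KeyError if the page layout ever changes.
    user = profile.get("profile_user") or {}
    return user.get("followers_count")


# Hypothetical, heavily shortened stand-in for the real parsed JSON.
sample = {"profile_user": {"screen_name": "szathmar", "followers_count": 70}}

print(get_followers(sample))
print(get_followers({}))
```

With the real data you would call `get_followers(d)` right after `json.loads()` and treat a `None` result as "the page no longer looks the way it used to".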

Thank you Twitter! It’s really nice of you to provide all these data in JSON!

Sample
The JSON that I could extract from my page is 743 lines long! Here is an extract of it:

...
"profile_image_url": "http://pbs.twimg.com/profile_images/459783802395430912/vcMT0CGX_normal.png",
"business_profile_state": "none",
"url": null,
"profile_background_image_url_https": "https://abs.twimg.com/images/themes/theme6/bg.gif",
"screen_name": "szathmar",
"is_translator": false,
"friends_count": 123,
"followers_count": 70,
"profile_text_color": "333333",
"profile_link_color": "FF3300",
"translator_type": "none",
"profile_background_color": "709397",
...
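
If you want to try the extraction step offline, without a network request and without installing anything, here is a rough sketch that uses only the standard library's `html.parser` instead of BeautifulSoup. The class `JsonDataParser`, the function `extract_profile_json` and the `SAMPLE_HTML` fragment are my own illustration (the fragment reuses two values from the extract above); the real page is far larger and may be structured differently:

```python
import json
from html.parser import HTMLParser


class JsonDataParser(HTMLParser):
    """Remember the value of the first <input class="json-data"> tag."""

    def __init__(self):
        super().__init__()
        self.value = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if tag == "input" and "json-data" in classes and self.value is None:
            # HTMLParser has already unescaped entities such as &quot;
            self.value = attrs.get("value")


def extract_profile_json(html):
    """Parse the HTML and return the embedded JSON as a dictionary."""
    parser = JsonDataParser()
    parser.feed(html)
    if parser.value is None:
        raise ValueError("no <input class='json-data'> tag found")
    return json.loads(parser.value)


# Made-up fragment for illustration only.
SAMPLE_HTML = (
    '<input type="hidden" class="json-data" '
    'value="{&quot;profile_user&quot;: {&quot;screen_name&quot;: '
    '&quot;szathmar&quot;, &quot;friends_count&quot;: 123}}">'
)

data = extract_profile_json(SAMPLE_HTML)
print(data["profile_user"]["screen_name"])
```

The design is the same as in the script above: find the tag, take its `value` attribute, and hand it to `json.loads()`. Only the HTML parsing library is swapped.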