Explorer API – API to Gather List of Top Token Holders in Ether

etherexplorer-api

I am working on an Ethereum project and would like to gather a list of the top X (20, for example) Ethereum accounts by balance. This would behave similarly to etherscan.io (https://etherscan.io/accounts).

Is there an API that does this? If not, does anyone know of a way to accomplish this? I'm curious how Etherscan accomplished this. I looked at their API and there is no endpoint that behaves like it.

I would like to do this for other ERC20 tokens as well.

Best Answer

I don't know of any APIs that will achieve what you want.

If not, does anyone know of a way to accomplish this?

Here's some fairly dumb Python code that scrapes that page and writes it to a .csv file:

#!/usr/bin/env python3

import csv

import requests
from bs4 import BeautifulSoup

URL = "https://etherscan.io/accounts"

# One session, one request (the original issued a redundant extra GET).
sess = requests.Session()
soup = BeautifulSoup(sess.get(URL).text, 'html.parser')

# newline='' lets the csv module control line endings; UTF-8 copes with
# the special characters that appear elsewhere on the page.
with open('output.csv', 'w', newline='', encoding='utf-8') as f:
    wr = csv.writer(f, quoting=csv.QUOTE_ALL)
    wr.writerow("Rank Address Balance Percentage TxCount".split())

    for tr in soup.find_all('tr'):
        # Rows without <td> cells (e.g. header rows) are skipped.
        cells = [td.get_text(strip=True) for td in tr.find_all('td')]
        if cells:
            # Note: rows from any other table on the page are written
            # too; filter on len(cells) if that matters.
            wr.writerow(cells)

I'm curious how Etherscan accomplished this.

Probably by parsing the state data and creating their own internal representation of it. This would then allow them to manipulate and present it in any way they like.
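To make that idea concrete, here is a minimal sketch of the ranking step only. It assumes you already have an address-to-balance mapping extracted from the chain state; the `balances` dict, the `top_holders` name and the sample addresses are all made up for illustration, and this is not a claim about how Etherscan actually does it.

```python
import heapq

def top_holders(balances, n=20):
    """Return (rank, address, balance, percentage) rows for the n largest balances.

    `balances` is a hypothetical mapping of address -> balance, standing in
    for whatever internal representation an explorer builds from chain state.
    The percentage is relative to the sum of all balances in the mapping.
    """
    total = sum(balances.values())
    largest = heapq.nlargest(n, balances.items(), key=lambda item: item[1])
    return [
        (rank, address, balance, 100.0 * balance / total if total else 0.0)
        for rank, (address, balance) in enumerate(largest, start=1)
    ]

# Made-up addresses and balances, purely for illustration.
sample = {
    "0xaaa": 500,
    "0xbbb": 300,
    "0xccc": 200,
}

for row in top_holders(sample, n=2):
    print(row)
```

`heapq.nlargest` avoids sorting the whole mapping when only a small top X is needed, which matters once the number of accounts gets large.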

I would like to do this for other ERC20 tokens as well.

Here's a slightly more complicated script (that I wrote a while ago) that lists all addresses and balances - in rank order - associated with a given token contract, across multiple Etherscan pages. You can poke around with it to suit your needs. (There's an example contract address currently hard-coded into it.)

#!/usr/bin/env python3

import csv
import time

import requests
from bs4 import BeautifulSoup

RESULTS = "results.csv"
URL = "https://etherscan.io/token/generic-tokenholders2?a=0x6425c6be902d692ae2db752b3c268afadb099d3b&s=0&p="

def getData(sess, page):
    """Fetch one page of token holders and parse the HTML."""
    url = URL + page
    print("Retrieving page", page)
    return BeautifulSoup(sess.get(url).text, 'html.parser')

def getPage(sess, page):
    """Return the data rows on a page, or None when there are none."""
    table = getData(sess, str(int(page))).find('table')
    if table is None:
        return None
    rows = [[td.text.strip() for td in row.find_all('td')] for row in table.find_all('tr')]
    # Drop rows without <td> cells (e.g. the header); a page with no
    # data rows left also ends the loop in main().
    rows = [row for row in rows if row]
    return rows or None

def main():
    sess = requests.Session()

    # newline='' lets the csv module control line endings.
    with open(RESULTS, 'w', newline='', encoding='utf-8') as f:
        wr = csv.writer(f, quoting=csv.QUOTE_ALL)
        wr.writerow("Rank Address Quantity Percentage".split())
        page = 0
        while True:
            page += 1
            data = getPage(sess, page)

            if data is None:
                break

            wr.writerows(data)
            # Pause between requests to be polite to the server.
            time.sleep(1)

if __name__ == "__main__":
    main()
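
If you want to point the script at a different token, one option is to build the URL from the contract address instead of hard-coding it. The sketch below is only an illustration: `token_holders_url` and `ADDRESS_RE` are names I made up, and the URL pattern is simply the one used in the script above, which Etherscan could change at any time.

```python
import re

# URL pattern copied from the script above; this is an assumption about
# Etherscan's page layout, not a documented API.
BASE_URL = "https://etherscan.io/token/generic-tokenholders2"

# An Ethereum address is "0x" followed by 40 hexadecimal characters.
ADDRESS_RE = re.compile(r"^0x[0-9a-fA-F]{40}$")

def token_holders_url(contract_address, page):
    """Build the holders-page URL for a token contract and page number."""
    if not ADDRESS_RE.match(contract_address):
        raise ValueError("not a valid Ethereum address: %r" % contract_address)
    if page < 1:
        raise ValueError("page numbers start at 1")
    return "{}?a={}&s=0&p={}".format(BASE_URL, contract_address, page)

# The example contract address from the script above.
print(token_holders_url("0x6425c6be902d692ae2db752b3c268afadb099d3b", 1))
```

You could then have getData() call this helper with the page number rather than concatenating onto the fixed URL string.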