Published on: Jun 20, 2022 Last edited: Jul 25, 2023

Click above for a list of FAQs

A few of your more frequently asked questions

Are there ways to get ERC20 token balance for a given account address?

Yes. There is. In fact, this is one of TrueBlocks’ most important features. Simply do chifra export --statements <address(es)> or query the API serve with http://localhost:8080/export?addr=<address>&statements.

Note that chifra export has many, many other options which produce similarly-informative data such as --logs, --appearances, --neighbors, --accounting, and so on. See the entire help file with chifra export --help.

How can you reliably detect ERC20 transfers if four-byte function IDs overlap?

Here’s an example.

The purpose of chifra export is to extract all transactions necessary to do 18-decimal-place-accurate accounting for a given address (or addresses).

If we encounter such as case (where there is a conflict in the four-byte or event topic), we pull that transaction as we would any other, but…when we query for the accounting (i.e. we query the smart contract for ERC20 balances), we get either an errored response in the conflicting case. This “mistaken ERC20 transfer” will have no value. The transaction may appear in the list of all transactions, but it will have no effect on the accounting.

In other words, regular old-fashioned, off-chain double-entry accounting will comes to the rescue. This is by design because as you point out, it’s not possible to be perfect using purely on-chain data.

Why would I want to run my own, local Ethereum node?

It’s faster, cheaper, uncensorable, and private.

Faster: You can hit your own RPC endpoints as fast as you could possible want. No rate limiting. It’s surprsing how important this is. It transforms a “difficult-to-use node” into a perfectly fine data server.

Cheaper: Over time, it’s way, way cheaper to run your own infrastructure. There’s only a single initial capital outlay. Plus – you don’t need a huge memory machine. Once the node and our scraper are caught up to the front of the chain, there’s only a single block at every so many seconds. This is easily handled for both the local node and TrueBlocks. More memory is better, of course, but is also not the main bottleneck. (In fact, there are no bottlenecks–they system is mostly waiting for new blocks from the network.)

In the past, large disc space requirements used to be a problem – especially with an archive node such as OpenEthereum (or Geth), each of which required 10-12 TB. Erigon requires only 2-3TB for the exact same data. You need at least a 4TB hard drive, but these are increasing more available.

One could, if they wished, used “node-as-a-service” such as Infura or QuickNode, but the monthly cost is high – up to $250.00 for a base-layer (i.e. no tracing) archive node access with ‘dedicated servers’ going up from there. The former suffers from the ‘rate-limiting’ problem, and the later is probably way more expensive. A local machine quickly pays for itself.

Uncensorable / private: You’re running your own servers inside your own building. If someone is either censoring your data or invading your privacy, you have only yourself to blame. These two aspects of the data access should be your responsibility, not a third-party provider.

Our recommendation is definitely a local machine running Erigon, with TrueBlocks installed on the same machine. An excellent option is dAppNode.

Can I run against a Geth node?

If you’re running against a non-archive Geth node (or any other node), then TrueBlocks will not work very well. After all, TrueBlocks’ entire purpose is to study the transactional histories of an address. Non-archive nodes do not provide any historical transactional data.

If you’re running against an archive Geth node (or other node software that does not support Parity traces), again, things won’t work very well. TrueBlocks requires traces to dig fully into an account’s transactional history. This is not a choice of TrueBlocks, it’s a choice if the node software. Without Parity traces, the node simply can’t keep up with the requirement to produce an accurate index.

If you’re running against node software that is both an archive node and supports Parity tracing in a performant way (such as Erigon and Nethermind), then you’ll run into one more problem. Disc space usage. Geth and Nethermind (and the old OpenEthereum) take up more than 10TB of disc space if you’re running an archive node. Erigon takes up 2TB. Five times less.

Upshot: Erigon is our greatly preferred node software. Geth is basically unsupported by TrueBlocks. Nethermind is possible, but only if you have a very large hard drive.

I have an ABI spec as a JSON file, may I configure TrueBlocks so that chifra uses it for the --articulate option?

The chifra abis routine will try to find the ABI in the local folder by looking for <address>.json, although you may specify the –sol option and feed it the solidity code. Failing that, chifra looks to EtherScan for the ABI. Failing that it falls back to a collection of about 2,800 ‘known’ signatures from EIP standards (ERC20, etc.) and some popular smart contracts ENS, Zephlin, etc.

In scraper mode, what does finalized, ripe, unripe, and staging mean?

Run this command: chifra config. You will see output similar to this:

2022/10/24 07:21:20 Client:       erigon/2022.09.3/linux-amd64/go1.18.2 (archive, tracing)
2022/10/24 07:21:20 TrueBlocks:   GHC-TrueBlocks//0.41.0-beta-20b34d9e0-20221024 (eskey, pinkey)
2022/10/24 07:21:20 RPC Provider: http://localhost:23456 - mainnet (1,1)
2022/10/24 07:21:20 Config Path:  <local path>
2022/10/24 07:21:20 Cache Path:   <local path>
2022/10/24 07:21:20 Index Path:   <local path>
2022/10/24 07:21:20 Progress:     15817943, 15817229, 15817806, 15817941, ts: 15817942

Notice the last line labeled “progess”. What do these numbers mean? They are, in order, latest, finalized, staged, ripe, and timestamp. (unripe is not included.)

Here’s what these numbers mean:

from head
configurablewill be
latestThe latest block on the chain. Same as `eth_getBlockByNumber(’latest’).0 blocks--
finalizedThe last block that has been consilidated into a “chunk”. (i.e. an index portion).dependsyesno
stagingThe latest block “on the stage”. (i.e. awaiting inclusion in the next chunk).dependsyesno
ripeBlocks that have been scraped, but not yet staged.28 blocksyesno
unripeBlocks that are “too close to the head” to be reliable.0 blocksnoyes
timestampThe latest scraped timestamp (used for debugging).n/a-no

For a much better explaination of these numbers (and more generally the scraper), please see the TrueBlocks Spec.

Error message: current file does not sequentially follow previous file

When using chifra scrape indexer you may get the above message. What this means is a completely empty block was returned from the RPC. When I say completely empty, this means there’s not even a miner address. Our scraper thinks that every block must contain at least one address, but on some chains this is not true (for example, on some private chains).

Alternatively, this sometimes happens when you run chifra scrape and chifra init incorrectly, although this last issue should be fixed if you’ve kept up with the latest updates.

You may turn this warning off by editing a per-chain configuration file. Search for the word ‘allow_missing’ on this page: https://trueblocks.io/chifra/configs/.

The file you want to edit is in $CONFIG_FOLDER/config/$CHAIN/blockScrape.toml where $CONFIG_FOLDER is dependent on your operating system (~/.local/share/trueblocks for linux, ~/Library/Application Support/TrueBlocks for Mac) and $CHAIN is the name of the chain you’re scraping.

You can find $CONFIG_FOLDER by typing chifra config --paths.

Create the above file if it doesn’t exist and add the following value (creating the section inside the file if you need to):

allow_missing = true

This error may also manifest itself with the message “A block was not processed.”

The docs say we should use Erigon, may I use Geth (or some other node software) instead?

There’s four reasons we suggest Erigon (the last is a deal-breaker).

  1. Erigon is MUCH faster syncing – two weeks vs many months for archive node
  2. Erigon takes up MUCH less disc space - 2 TB vs. 12 TB for an archive node
  3. Erigon’s RPC is faster
  4. Erigon natively supports the trace_ namespace. Geth supports it but only through a JavaScript emulator – tracing is literally unusable in Geth. TrueBlocks needs tracing.

Item 1, is not that bad – if you have the time to wait.

Item 3, is dependent on which RPC endpoints you use – particularly tracing.

Item 2 matters immensely to us since we are so focused on running on small, decentralized machines and makes all other nodes not viable.

Item 4 is a deal breaker. Without traces, we would have to re-write the internals of our scraper. Plus, without traces, the “quality” or “completeness” of our solution is seriously compromised. We could index just events (like The Graph), but that will never allow you to reconcile (in an accounting sense) which is one of our priorities as well.

Is the Unchained Index a requirement of using this tool? Since TrueBlocks only provides Unchained Index data for Eth mainnet, Sepolia, and Gnosis, how can I run the chifra init for Polygon?

This is a very, very good question. TrueBlocks is not a “service.” By that, I mean that we do not provide you (the user) with anything other than the ability to create and use the Unchained Index yourself. It’s as if we were giving you a hammer as opposed to, say, being carpenter that you can hire to complete a project. We provide Eth mainnet, Sepolia, and Gnosis because we need those chains. If someone else needs a different chain, they need to provide it for themselves. The innovation that TrueBlocks makes is that (if it makes any), is to allow you to provide the index for yourself. Furthermore, with TrueBlocks, the index is shared with other people without doing anything special. Super importantly – other people can share perhaps other data with you. And it flows out from there. TrueBlocks is purposefully designed this way because “decentralization,” which we believe must work by default.

How can I make a list of all tokens I own (and their current or historical) balances?

An article on why this is hard: https://trueblocks.io/blog/how-many-erc20-tokens-do-you-have/

A first draft of an article on how to accomplish this: https://github.com/TrueBlocks/tokenomics/blob/main/explorations/accounting-03/notes.md

Build problems:

When run the make command, I got this error:

/data/github/trueblocks-core/src/libs/utillib/sfos.cpp:21:10: fatal error: filesystem: No such file or directory
 #include <filesystem>
compilation terminated.
libs/utillib/CMakeFiles/util.dir/build.make:902: recipe for target 'libs/utillib/CMakeFiles/util.dir/sfos.cpp.o' failed

Answer: Upgrade Ubuntu to latest version. See this https://stackoverflow.com/questions/39231363/fatal-error-filesystem-no-such-file-or-directory.

Are there any commands to check the consistency of the finalized blocks? That is, for blockchains that don’t have a pinned unchained index.

Answer: chifra chunks index --check does some high level consistency checks (but this does require someone to have published the manifest hash to the smart contract, so maybe that doesn’t work.

There’s also chifra chunks index --check --deep which digs into the files themselves and checks that each address in the file reports true when the associated bloom filter is queried.

As far as answering the question, “Is this the exact result that one should get from building the index X?” I’m not sure that’s even possible. The way “we” handled that issue is by allowing anyone to publish the hash of the manifest to the Unchained Index smart contract and then we compare the results. Another way is to by rebuilding it from scratch against different client software. We’ve done that twice since inception

You may also export “everything” in each chunk with chifra chunks addresses --verbose --fmt json but this is very, very verbose. It produces data per-chunk, so you must combine results for a single address, for example from many reports. Not optimal.

What’s the long term vision for TrueBlocks?


  • 30-year vision: you can’t buy a computer of any type without a blockchain node inside and that blockchain node is so well indexed, anyone can get any portion of the entire history of the world without asking permission.
  • 15-year vision: a special type of node software called an indexing node that does not carry the actual details of the chain, but can build the index and share it for free.
  • 5-year vision: a large number of end-user (probably desktop) application built upon an excellent, complete, automatically-shared, super-fast index.
  • 1-year vision: complete the work we promised to the Ethereum Foundation as described here: Ethereum Foundation Grant - TrueBlocks
  • 1-month vision: get a speaking gig at EthDenver.
  • 1-day vision: finish porting chifra traces to GoLang.

What is your policy on new features?


  • New features are “sort of” on hold for now as we port the entire C++ code base to GoLang. Once that’s completed, we will focus on improving speed by taking advantage of GoLang’s natural concurrency. Throughout our development, as new features are requested/suggested, if the feature can be added relatively easily to the GoLang code, we may add them. If the suggested feature needs to be added to the C++ code, it probably won’t be added.

Future Questions

Future work: explain using `chifra export –statements | jq “.data[]” …

Future work: An example of historical holdings using Dune (for comparison): https://dune.xyz/0xrusowsky/Olympus-Wallet-History).

Future work: List token swaps, transfers, and fees paid by a list of wallet addresses (to help with the taxman)

Edit this page on GitHub