Table of contents
- What is Blockchain Analytics?
- Importance of Blockchain Analytics in Understanding and Optimizing Blockchain Systems
- How is Data Created on the Blockchain?
- Data Sources
- APIs
- Advantage
- Disadvantages
- Node Software
- Advantages
- Disadvantage
- Data Providers
- Advantages
- Disadvantage
- Techniques for extracting and cleaning blockchain data for analysis
- Analyzing blockchain data: code-based implementations
- Standard data analysis techniques possible for blockchain adoption
- Analysis of blockchain data
- Conclusion
Data analytics has been steadily making its way into the blockchain world, and a growing number of organizations see the value in bringing the two together. The blockchain industry is crowded with players whose goals and values vary, but nearly all of them can agree on the usefulness of data analytics. In this article, we will look at blockchain analytics, the types of blockchain data, data sources, and how the data is collected and preprocessed for analysis, and then do some analytics ourselves.
What is Blockchain Analytics?
Blockchain analytics combines blockchain technology and data analysis to gain insights from blockchain data. It involves collecting, processing, and analyzing data from blockchain platforms, such as Bitcoin, Ethereum, or Near, to understand the behavior of blockchain participants and transactions, identify patterns and trends, and predict future activities. The insights gained from blockchain analytics can be used to improve the performance and security of blockchain systems, to inform business decisions in industries such as finance and supply chain management, and to monitor the movement of funds in DeFi projects.
Importance of Blockchain Analytics in Understanding and Optimizing Blockchain Systems
The decentralized nature of blockchain technology and the large amount of data generated by transactions make traditional data analysis methods inadequate for understanding the behavior of blockchain participants. The following are some of the benefits of Blockchain analysis:
- It identifies patterns and trends in blockchain data. For example, analyzing transaction data can show how funds move between addresses and how smart contracts are being executed, and this information can be used to improve performance.
- It can inform business decisions in industries that rely on blockchain technology, such as DeFi. By analyzing market trends on decentralized exchanges (DEXs) and lending platforms, organizations can identify which assets are currently in high demand and adjust their investment strategies.
- Advanced techniques such as predictive modeling, anomaly detection, and natural language processing (NLP) can help improve the security of blockchain systems by identifying and preventing fraudulent transactions.
Now, let's look at how data is generated on a blockchain, the types of data, and their sources.
How is Data Created on the Blockchain?
Data is generated through transactions that transfer cryptocurrencies and NFTs and that execute smart contracts.
The image below shows a typical blockchain transaction:
The data generated by blockchain transactions can be divided into several categories:
- Metadata: This includes information such as the timestamp, transaction fee, and transaction size.
- Network data: This includes information about the nodes on the network and the relationships between them, such as the number of active nodes and the number of transactions they process.
- Contract data: This includes information about the execution of smart contracts, such as the code and the variables used.
- Token data: This includes information about the issuance, transfer, and ownership of tokens, such as the total supply, the number of holders, and the transaction history.
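As a rough illustration of how these categories might sit together in one record, here is a minimal Python sketch. The class and field names are hypothetical and chosen only for this example; they do not come from any particular blockchain or data provider.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class TransactionMetadata:
    timestamp: datetime   # when the transaction was included in a block
    fee: float            # transaction fee, in the chain's native unit
    size_bytes: int       # serialized size of the transaction


@dataclass
class TokenTransfer:
    token: str            # token symbol or contract address
    sender: str
    receiver: str
    amount: float


@dataclass
class TransactionRecord:
    tx_hash: str
    metadata: TransactionMetadata
    # Token data: zero or more transfers triggered by this transaction.
    token_transfers: List[TokenTransfer] = field(default_factory=list)
    # Contract data: the contract that was executed, if any.
    contract_called: Optional[str] = None


record = TransactionRecord(
    tx_hash="0xabc",  # placeholder hash, not a real transaction
    metadata=TransactionMetadata(
        timestamp=datetime(2023, 1, 1, tzinfo=timezone.utc),
        fee=0.002,
        size_bytes=110,
    ),
)
print(record)
```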
Data Sources
Blockchain data sources are the various methods by which data can be collected from blockchain networks. Because a blockchain is decentralized, collecting its data differs from querying a single, centrally managed database. Several data sources have been adapted for collecting blockchain data, each with advantages and limitations. Let's take a look at some:
APIs
Application programming interfaces (APIs) allow developers to access data from a blockchain network. Bitcoin and Ethereum nodes, for example, expose APIs that enable developers to retrieve transaction data, block data, and other information.
Advantage
- They are well-documented and easy to use.
Disadvantages
- There are limits on how much data can be requested at once and on how frequently requests can be made.
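To make the API trade-off concrete, the following Python sketch builds request payloads for Ethereum's JSON-RPC method eth_getBlockByNumber and spaces them out with a simple throttle. The rate limit of five requests per second and the block numbers are assumptions for illustration; real limits depend on the node or provider. No network call is made here.

```python
import json
import time


def build_block_request(block_number: int, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 payload for the eth_getBlockByNumber method."""
    return {
        "jsonrpc": "2.0",
        "method": "eth_getBlockByNumber",
        # Block numbers are hex-encoded; True asks for full transaction objects.
        "params": [hex(block_number), True],
        "id": request_id,
    }


class Throttle:
    """Enforce a minimum delay between calls to stay under a request-rate limit."""

    def __init__(self, max_per_second: float):
        self.min_interval = 1.0 / max_per_second
        self._last_call = None

    def wait(self) -> None:
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


throttle = Throttle(max_per_second=5)  # hypothetical limit
payloads = []
for number in range(17_000_000, 17_000_003):  # arbitrary example block numbers
    throttle.wait()
    payloads.append(build_block_request(number, request_id=number))
    # A real client would POST each payload to the node or provider endpoint here.

print(json.dumps(payloads[0]))
```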
Node Software
Running a full node of a blockchain network allows access to all data stored on the network, including every transaction and block ever mined.
Advantages
- Access to all data on the network
Disadvantage
- It requires a high amount of resources and technical knowledge.
Data Providers
Some organizations provide blockchain data as a service. They collect, process, and store data from various blockchain networks and make it available to customers. Examples are Flipside and Dune.
Advantages
- Availability of a large amount of historical and current data.
Disadvantage
- It can be pretty costly.
Techniques for extracting and cleaning blockchain data for analysis
In this article, we will be working with the data provider Flipside. Flipside provides blockchain data and works with data curators to engineer the data for analysis. Raw on-chain data is collected from nodes and moved to ingestion, where an ETL (extract, transform, load) tool cleans it. The data then goes into curation, where it is arranged into the appropriate databases, schemas, and tables.
The image below shows a typical example of how data is extracted and cleaned for analysis. To learn more about how data is extracted, check out Flipside Chainwalkers.
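The cleaning step can be sketched in a few lines of Python. This is not Flipside's pipeline, only an illustration of the kind of work an ETL stage does: it assumes raw transactions shaped like Ethereum JSON-RPC output (hex-encoded numbers, mixed-case addresses) and converts them to typed, deduplicated rows. The output column names tx_hash, block_number, and eth_value are hypothetical.

```python
from datetime import datetime, timezone

WEI_PER_ETH = 10**18


def clean_transaction(raw: dict, block_timestamp_hex: str) -> dict:
    """Convert one raw JSON-RPC style transaction into analysis-friendly types."""
    to_addr = raw.get("to")  # None for contract-creation transactions
    return {
        "tx_hash": raw["hash"].lower(),
        "block_number": int(raw["blockNumber"], 16),
        "block_timestamp": datetime.fromtimestamp(
            int(block_timestamp_hex, 16), tz=timezone.utc
        ),
        "from_address": raw["from"].lower(),
        "to_address": to_addr.lower() if to_addr else None,
        "eth_value": int(raw["value"], 16) / WEI_PER_ETH,
    }


def clean_batch(raw_txs, block_timestamp_hex):
    """Clean a batch of transactions, dropping duplicates by transaction hash."""
    seen = set()
    cleaned = []
    for raw in raw_txs:
        row = clean_transaction(raw, block_timestamp_hex)
        if row["tx_hash"] not in seen:
            seen.add(row["tx_hash"])
            cleaned.append(row)
    return cleaned


# Made-up sample data: the same transaction delivered twice.
raw_tx = {
    "hash": "0xABC123",
    "blockNumber": "0x10",
    "from": "0x" + "A" * 40,
    "to": "0x" + "B" * 40,
    "value": "0xde0b6b3a7640000",  # 10**18 wei, i.e. 1 ETH
}
rows = clean_batch([raw_tx, dict(raw_tx)], block_timestamp_hex="0x63b0cd00")
print(rows)
```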
Analyzing blockchain data: code-based implementations
Structured Query Language (SQL) is the language used to analyze blockchain data on platforms like Flipside and Dune. It is used to query and extract data from databases, and to join, filter, aggregate, and transform that data to make it suitable for analysis.
On Flipside, we can use SQL to run queries on their blockchain data to extract relevant information such as transactions, token balances, and smart contract data. Let's take a look at some examples.
Before we do some analysis, let us walk through the Flipside platform.
The above image shows what Flipside looks like; we can see the database (e.g., Ethereum), schema (e.g., core), and table (e.g., dim_contracts). Also, the tables have different types, which are:
- fact: Describes observations or events such as blocks, transactions (tx), and swaps.
- ez: Aggregated models built by data curators that are more user-friendly.
- dim: Describes entities, the things you analyze, such as labels, decimals, and tags.
For more information, check out Flipside docs.
Let's do some simple queries using Basic SQL Syntax.
Let's examine the fact_transactions table:
The query starts with a SELECT statement, which specifies which columns to retrieve from the table. In this case, the asterisk (*) is used, which means that all columns in the table will be selected. The FROM clause specifies the table from which the query should retrieve data; in this case, the table is called fact_transactions and lives in the core schema of the ethereum database, written as ethereum.core.fact_transactions. Finally, the LIMIT clause caps the number of rows the query returns at 10, so only ten rows from the table will be retrieved.
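The query described above can be reconstructed from the prose, and its behavior can be tried locally. The Python sketch below keeps the Flipside-style query as a reference string and runs an equivalent query against an in-memory SQLite table standing in for fact_transactions. The stand-in table, its made-up rows, and the tx_hash and eth_value columns are assumptions for illustration; SQLite also does not use the database.schema.table naming, so the qualifiers are dropped.

```python
import sqlite3

# The query as described in the text (reconstructed; shown for reference only).
FLIPSIDE_QUERY = """
SELECT *
FROM ethereum.core.fact_transactions
LIMIT 10
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_transactions ("
    "block_timestamp TEXT, tx_hash TEXT, from_address TEXT, "
    "to_address TEXT, eth_value REAL)"
)

# 25 made-up rows so that LIMIT has something to cut off.
sample_rows = [
    (f"2023-01-01 00:00:{i:02d}", f"0xhash{i}", f"0xsender{i}", f"0xreceiver{i}", i * 0.1)
    for i in range(25)
]
conn.executemany("INSERT INTO fact_transactions VALUES (?, ?, ?, ?, ?)", sample_rows)

# Same shape of query, minus the database and schema qualifiers.
cursor = conn.execute("SELECT * FROM fact_transactions LIMIT 10")
columns = [description[0] for description in cursor.description]
result = cursor.fetchall()

print(columns)
print(len(result))
```

Selecting every column is convenient for a first look at a table, but on a table as large as a chain's full transaction history it is usually better to name only the columns you need.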
We can also examine the transactions that were performed by a specific address.
Let's look at the Ethereum wallet address 0x05E793cE0C6027323Ac150F6d45C2344d28B6019 and find out when it conducted its last transaction and to whom it was sent. For this query, we are looking for the date and the receiver's address.
The SELECT statement specifies the columns to retrieve, in this case block_timestamp and to_address. The FROM clause specifies the table from which to retrieve the data, in this case the fact_transactions table in the ethereum.core schema.
The WHERE clause filters the rows on a condition, in this case that from_address equals the Ethereum address, which is passed as a string and converted to lowercase with the LOWER function, presumably so that it matches the lowercase form in which addresses are stored. The query then uses an ORDER BY clause to sort the result set by the block_timestamp column in descending order and a LIMIT clause to limit the number of rows returned to 1, which leaves only the most recent transaction.
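Reconstructed from the description above, the query and its logic can be sketched as follows. The Flipside-style SQL is kept as a reference string, and an equivalent query runs against an in-memory SQLite table with made-up rows. The receiver addresses and timestamps are invented for this example and say nothing about the real wallet's history.

```python
import sqlite3

ADDRESS = "0x05E793cE0C6027323Ac150F6d45C2344d28B6019"

# The query as described in the text (reconstructed; shown for reference only).
FLIPSIDE_QUERY = """
SELECT block_timestamp, to_address
FROM ethereum.core.fact_transactions
WHERE from_address = LOWER('0x05E793cE0C6027323Ac150F6d45C2344d28B6019')
ORDER BY block_timestamp DESC
LIMIT 1
"""

# Clearly fake placeholder addresses for the stand-in data.
RECEIVER_A = "0x" + "a" * 40
RECEIVER_B = "0x" + "b" * 40
OTHER_SENDER = "0x" + "c" * 40

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE fact_transactions ("
    "block_timestamp TEXT, from_address TEXT, to_address TEXT)"
)
conn.executemany(
    "INSERT INTO fact_transactions VALUES (?, ?, ?)",
    [
        # Addresses are stored in lowercase, which is why LOWER() is needed.
        ("2022-11-03 10:15:00", ADDRESS.lower(), RECEIVER_A),
        ("2023-01-20 08:30:00", ADDRESS.lower(), RECEIVER_B),
        ("2023-02-01 12:00:00", OTHER_SENDER, RECEIVER_A),
    ],
)

# ISO-formatted timestamps sort correctly as text, so DESC gives newest first.
latest = conn.execute(
    "SELECT block_timestamp, to_address FROM fact_transactions "
    "WHERE from_address = LOWER(?) "
    "ORDER BY block_timestamp DESC LIMIT 1",
    (ADDRESS,),
).fetchone()

print(latest)
```

Without the LOWER call, the mixed-case address would match no rows in this stand-in table, because string comparison in the WHERE clause is case-sensitive here.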
We have looked at some basic SQL syntax we can use to perform data analysis. Let's look at standard data analysis techniques possible for blockchain adoption.
Standard data analysis techniques possible for blockchain adoption
Several data analysis techniques can be used to gain valuable insights into blockchain adoption, which include:
- Descriptive Statistics: This provides a summary of the main features of a dataset, such as mean, median, and standard deviation. These statistics can help us understand the distribution of data and identify outliers.
- Data Visualization: Line charts, bar charts, and heat maps can be used to represent data visually and identify patterns and trends. This can be useful for understanding the usage of blockchain networks over time and identifying the wallet addresses and smart contracts that are most active.
- Correlation and Regression Analysis: These can be used to identify relationships, such as the one between the number of transactions on a blockchain network and the price of the underlying cryptocurrency.
- Natural Language Processing (NLP): This can extract insights from unstructured data, such as social media posts or news articles. This can be useful for understanding how a blockchain network or project is being discussed in public.
These are just a few examples of typical data analysis techniques. The specific methods used will depend on the data available and the purpose of the question being addressed.
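As a small illustration of the first and third techniques, the Python sketch below computes descriptive statistics for a series of daily transaction counts, flags outliers, and measures the correlation between transaction counts and price. All numbers are made up, and the two-standard-deviation outlier threshold is an arbitrary choice for this example.

```python
import math
import statistics

# Made-up daily figures for one week.
daily_tx_count = [1000, 1100, 1050, 1200, 5000, 1150, 1080]
daily_price = [10.0, 10.4, 10.1, 10.9, 14.2, 10.8, 10.3]

mean = statistics.mean(daily_tx_count)
median = statistics.median(daily_tx_count)
stdev = statistics.stdev(daily_tx_count)  # sample standard deviation

# Flag days that sit more than two standard deviations from the mean.
outliers = [x for x in daily_tx_count if abs(x - mean) > 2 * stdev]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


print("mean:", round(mean, 1), "median:", median, "stdev:", round(stdev, 1))
print("outliers:", outliers)
print("tx/price correlation:", round(pearson(daily_tx_count, daily_price), 3))
```

Note how the single busy day pulls the mean well above the median, which is one reason to look at both before drawing conclusions.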
Analysis of blockchain data
Blockchain analysis is used to understand, monitor, and gain insights into the behavior of blockchain participants and transactions. Several kinds of analysis can be performed, for example to inform improvements in scalability and transaction speed. Here are a few examples:
Transaction Analysis
This is the process of studying the details of individual transactions on a blockchain network. It includes examining information such as assets transferred, the transaction fees paid, and the number of inputs and outputs. This can be used to identify behavior patterns, such as specific transaction types commonly associated with money laundering activities.
Smart Contract Analysis
This process involves studying the code and execution of smart contracts. It can be used to identify the usage of specific types of smart contracts, such as those behind NFTs or decentralized finance (DeFi) applications.
Network Analysis
This involves studying the relationships between different entities in a network, such as the relationships between addresses on a blockchain network. It can be used to cluster addresses that are likely controlled by the same entity.
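One way to cluster addresses, sketched below in Python, is the common-input heuristic used on chains with multi-input transactions such as Bitcoin: addresses that appear together as inputs of the same transaction are assumed to belong to the same entity. The heuristic is an assumption brought in for this example, not something the techniques above prescribe, and it can produce false links. The single-letter addresses are placeholders.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure for merging addresses into clusters."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving: point item at its grandparent while walking up.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_addresses(transactions):
    """Group addresses that were used together as inputs of any transaction.

    `transactions` is an iterable of lists, each holding the input addresses
    of one transaction.
    """
    uf = UnionFind()
    for input_addresses in transactions:
        if not input_addresses:
            continue
        first = input_addresses[0]
        uf.find(first)  # register addresses that only ever appear alone
        for other in input_addresses[1:]:
            uf.union(first, other)

    clusters = defaultdict(set)
    for address in list(uf.parent):
        clusters[uf.find(address)].add(address)
    return list(clusters.values())


# Placeholder data: A and B co-spend, B and C co-spend, D is alone, E and F co-spend.
example = [["A", "B"], ["B", "C"], ["D"], ["E", "F"]]
clusters = cluster_addresses(example)
print(sorted(sorted(cluster) for cluster in clusters))
```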
Token Analysis
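This process involves studying the issuance, transfer, and ownership of tokens on a blockchain network. It can be used to identify how specific types of tokens are used, for example by tracking a token's total supply and number of holders over its transaction history.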
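A minimal Python sketch of token analysis follows. It replays a list of transfers to derive balances, the number of holders, and the total supply. It assumes the common ERC-20 style convention that minting appears as a transfer from the zero address and burning as a transfer to it; the names and amounts are made up.

```python
from collections import defaultdict

# Assumed convention: transfers from this address are mints, transfers to it are burns.
ZERO_ADDRESS = "0x" + "0" * 40


def summarize_token(transfers):
    """Replay (sender, receiver, amount) transfers and summarize ownership."""
    balances = defaultdict(int)
    for sender, receiver, amount in transfers:
        if sender != ZERO_ADDRESS:
            balances[sender] -= amount
        if receiver != ZERO_ADDRESS:
            balances[receiver] += amount

    # Only addresses with a positive balance count as holders.
    holders = {address: bal for address, bal in balances.items() if bal > 0}
    return {
        "total_supply": sum(holders.values()),
        "holder_count": len(holders),
        "balances": holders,
    }


transfers = [
    (ZERO_ADDRESS, "alice", 100),  # mint 100 to alice
    ("alice", "bob", 40),
    ("bob", ZERO_ADDRESS, 10),     # bob burns 10
    ("alice", "carol", 60),
]
summary = summarize_token(transfers)
print(summary)
```

The supply figure is only correct if the transfer list covers the token's full history, which is one reason complete historical data matters for this kind of analysis.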
Market Trend Analysis
This involves the study of the price and trading volume of different assets on various markets or decentralized exchanges (DEX). It can be used to compare the volume of assets and swaps.
In our next article, we will walk you through a market analysis using Flipside and create a visualization.
Conclusion
In this article, you have learned what blockchain analytics is, the benefits of using analytics to improve blockchain systems, how data is created on the blockchain, where blockchain data can be obtained, and the techniques data providers use to extract and clean it. We also did some data analysis using basic SQL syntax and, finally, looked at the kinds of analysis that can yield valuable insights to inform business decisions and support the development of new blockchain-based products and services.
It is important to note that analyzing blockchain data requires a good understanding of the underlying technology, the specific blockchain network being investigated, and the particular use case. With the rise of blockchain networks and the growing amount of data they generate, blockchain analytics is becoming increasingly important. As blockchain technology continues to mature, it will be crucial for blockchain enthusiasts and data professionals to develop the skills and knowledge necessary to analyze blockchain data effectively. This guide provided a comprehensive introduction to the key concepts and techniques of blockchain analytics, and I hope it will be a valuable resource for those interested in exploring this exciting field.