Quick Start

Seshat helps you build and productionize predictive features for blockchain applications. It streamlines preprocessing and cleaning data, creating training and test data, and eventually deploying your model for inference with up-to-date data.

In this document, we help you get up and running with Seshat quickly. The main goal of this tutorial is to build a vector representation for blockchain addresses (i.e., users). For a comprehensive tutorial that goes from the required raw data all the way to model inference, see our use case on SocialFi Token Recommendation.

System Requirements

This project requires Python 3.9 or higher. Make sure you have Python installed on your system before proceeding with the installation. You can download the latest version of Python from the official Python website.
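If you are unsure which interpreter your environment uses, a small check like the following can confirm it before you install anything. This is a minimal sketch using only the standard library; the helper name `python_is_supported` is made up for illustration and is not part of Seshat.

```python
import sys

# Minimum version stated in the requirements above.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON):
    """Return True if the (major, minor) part of version_info meets the minimum."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    if python_is_supported():
        print("Python version is supported.")
    else:
        print("Python 3.9 or higher is required.")
```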

Install Seshat

Use the following command to install the latest version of the Seshat SDK. For early beta access, please contact us at info@seshatlabs.xyz.

pip install sdk-seshat-python

If you want to use Flipside:

pip install sdk-seshat-python[flipside_support]

And for postgres support:

pip install sdk-seshat-python[postgres_support]

Required Data

With Seshat, you can use both online and offline data. Online data can be obtained from data providers such as Flipside.

For this quick start tutorial, download a snapshot of token transfer transaction data from this link. This data shows which token each address sent or received, and in what amount.
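Before feeding the snapshot into a pipeline, it can help to look at its column names and first few rows. The sketch below does this with the standard library only. The helper `preview_csv` is hypothetical, and the sketch makes no assumption about the actual column names in the file.

```python
import csv
from itertools import islice


def preview_csv(path, n_rows=5):
    """Return (column_names, first n_rows rows as dicts) for a CSV file."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        rows = list(islice(reader, n_rows))
        # fieldnames is read from the header line of the file
        return reader.fieldnames, rows


# Example usage, assuming the snapshot was saved under the path
# used later in this tutorial:
#
#   columns, rows = preview_csv("data/token_transfer_19000000_19005100.csv")
#   print(columns)
#   for row in rows:
#       print(row)
```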

Running the First Pipeline

In this part, we show you how to get a vector representation for each address in the dataset. These vectors can then be used for downstream tasks, such as recommendation. We also show you how to run a data preprocessing pipeline with Seshat. For the preprocessing, we run only one pipeline, which removes addresses with few transactions. Seshat offers plenty of other data cleaning and preprocessing pipelines.

In Seshat SDK, the entire ML operation is handled by FeatureView. Within the FeatureView, we define the data source. Different pipelines are defined inside the Pipeline object. Here, we only have two pipelines: LowTransactionTrimmer and TokenPivotVectorizer. After running the pipeline, we store the vectors of addresses in address_vectors.

from seshat.feature_view.base import FeatureView
from seshat.source.local import LocalSource
from seshat.transformer.pipeline import Pipeline
from seshat.transformer.trimmer.base import LowTransactionTrimmer
from seshat.transformer.vectorizer.pivot import TokenPivotVectorizer


class AddressVectorsView(FeatureView):
    name = "Vectorizing blockchain addresses using token transfer transactions"
    # Data source for this view: the downloaded token transfer snapshot
    offline_source = LocalSource(path="data/token_transfer_19000000_19005100.csv")

    # The two pipelines: trim low-transaction addresses, then vectorize
    offline_pipeline = Pipeline(
        [
            LowTransactionTrimmer(min_transaction_num=20),
            TokenPivotVectorizer()
        ]
    )


# Run the view and store the resulting address vectors
vector_view = AddressVectorsView()
view = vector_view()
address_vectors = view.data["vector"].to_raw()
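To build intuition for the two pipelines above, the following plain-Python sketch shows one way trimming and pivot-style vectorization could work conceptually. It does not use Seshat and is not its implementation. The field names (`from_address`, `to_address`, `token`), the choice to count only sent transfers, and the use of per-token transfer counts as vector entries are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Hypothetical transfer records; the field names are assumptions,
# not the schema of the downloaded snapshot.
transfers = [
    {"from_address": "0xA", "to_address": "0xB", "token": "T1"},
    {"from_address": "0xA", "to_address": "0xC", "token": "T1"},
    {"from_address": "0xA", "to_address": "0xB", "token": "T2"},
    {"from_address": "0xB", "to_address": "0xA", "token": "T2"},
    {"from_address": "0xB", "to_address": "0xC", "token": "T3"},
    {"from_address": "0xC", "to_address": "0xA", "token": "T1"},
]


def trim_low_activity(records, min_transaction_num):
    """Keep records whose sender appears in at least min_transaction_num records."""
    counts = Counter(r["from_address"] for r in records)
    return [r for r in records if counts[r["from_address"]] >= min_transaction_num]


def pivot_vectors(records):
    """Build one vector per sender: the number of transfers per token."""
    tokens = sorted({r["token"] for r in records})
    index = {token: i for i, token in enumerate(tokens)}
    vectors = defaultdict(lambda: [0] * len(tokens))
    for r in records:
        vectors[r["from_address"]][index[r["token"]]] += 1
    return tokens, dict(vectors)


# Drop senders with fewer than 2 transfers, then vectorize the rest.
kept = trim_low_activity(transfers, min_transaction_num=2)
tokens, vectors = pivot_vectors(kept)
print(tokens)   # column order of the vectors
print(vectors)  # one count vector per remaining address
```

In this toy data, address `0xC` has a single transfer and is dropped by the trimming step, so it gets no vector. Seshat's own transformers may count transactions or weight tokens differently.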