Skip to main content

SQL Database Source

Fetching data from a SQL database is very common. You can use a database URL and your custom query to fetch data from it. Using this source is very useful for inference, and we will discuss it further in this section.

DB URL

To access the database, you must provide a URL for your database.

from seshat.source.database import SQLDBSource

URL = "postgresql://postgres:postgres@localhost:5432/your_database_name"
source = SQLDBSource(url=URL)

Define Query

To define a query to fetch data from the database, you have several options. You can directly write the raw query:

URL = "postgresql://postgres:postgres@localhost/your_database_name"
query = "SELECT * FROM address_info;"

source = SQLDBSource(url=URL, query=query)

Or you can define a function with no args and kwargs that returns the raw query when called:

URL = "postgresql://postgres:postgres@localhost/your_database_name"
def get_query():
return "SELECT * FROM address_info;"

source = SQLDBSource(url=URL, query_fn=get_query)

Another way is to pass table_name and a filters dictionary if you want to filter the result. By default, passing table_name generates a select statement, and if filters are also passed, a where clause will be generated and added to the query.

URL = "postgresql://postgres:postgres@localhost:5432/your_database_name"

source = SQLDBSource(
url=URL,
table_name="address_info",
filters={
"address": {"val": "0xB3bBDe2E19a3f713772467b4164180Eb14471fBa", "op": "="}
},
)

source.get_query()
>>> 'SELECT * FROM address_info WHERE address = 0xB3bBDe2E19a3f713772467b4164180Eb14471fBa'

Note that you can add filters to the query even if you directly pass the raw query to the source or use a query function.

URL = "postgresql://postgres:postgres@localhost:5432/your_database_name"

source = SQLDBSource(
url=URL,
query="SELECT * FROM address_info",
filters={
"address": {"val": "0xB3bBDe2E19a3f713772467b4164180Eb14471fBa", "op": "="}
},
)

source.get_query()
>>> 'SELECT * FROM address_info WHERE address = 0xB3bBDe2E19a3f713772467b4164180Eb14471fBa'

Fetching

After you define the query and database URL for the source, you can get the result by calling it. Like any other source, you can pass the schema to the source, and the result after fetching will be passed to the schema.

source = SQLDBSource(
url=URL,
query="SELECT * FROM address_info",
)
sf = source()
sf.data
>>>
address ... amount_avg
0 0xcceef331edabd5e7ce6519f4f7a22b7739152e9a ... 2.176124e+12
1 0x1a67c5a7eee3cfc4a3cfa2b5f9677d751f39919f ... 2.166698e+08
2 0x3fc91a3afd70395cd496c647d5a6cc9d4b2b7fad ... 1.231771e+13
3 0x6b75d8af000000e20b7a7ddf000ba900b4009a80 ... 3.753357e+13
4 0xdc6b2c2dc744fd06c4f074bde90c99332d0fda51 ... 1.691497e+06
... ... ...
6972 0x9043238929c377977c48ff5b9c81a16ffe995185 ... 1.912416e-01
6973 0xe1ae3bbc16fbc9237e29b9fdb78a933eacd39461 ... 6.605779e-02
6974 0xd61d95ac586eb35d9280f6b054d4a0d58bc820c1 ... 2.201770e-01
6975 0xe922f98f156ab3b4e4b7fb3801d85e2d9e35a6f4 ... 1.293662e-01
6976 0x734ec340250d3268e7d7104aedaa426686345504 ... 1.423449e+02
[6977 rows x 7 columns]