PyVerdict Documentation¶
pyverdict
is a Python interface for VerdictDB.
Install PyVerdict¶
The easiest way to get pyverdict
is installing from PyPI. You can also compile from source code for developing purpose.
Note: Prerequisites
pyverdict
requires miniconda for Python 3.7,
which can be installed for local users (i.e., without sudo access).
Install from PyPI¶
pyverdict
is distributed with PyPI. Use the following command for installation.
pip install pyverdict
pip install pyverdict --upgrade
Note: Dependencies
pyverdict
ships with a latest VerdictDB jar in it, so no separate installation is necessary.
pyverdict
ships with the JDBC drivers for MySQL, PostgreSQL, Redshift, Impala and Presto currently. More database supports will be added in the future.
Compile from source code¶
Get the newest version of VerdictDB from our repo.
git clone git@github.com:mozafari/verdictdb.git cd verdictdb
pyverdict
root directory and install. You may need extra configurations to compile the VerdictDB jar.
cd pyverdict python setup.py install
Connect to Databases¶
pyverdict.{CONNECT_METHOD}(host, user, password, port, ...)
pyverdict
uses different methods to connect to different databases, and returns a pyverdict.VerdictContext
object. See Connecting to Databases for more details about CONNECT_METHOD
.
import pyverdict verdict_conn = pyverdict.mysql(host, user, password, port)
Make Queries¶
pyverdict.VerdictContext@sql(query)
pyverdict
provides a simple api to make queries. It returns a pandas DataFrame object which contains the query result.
df = verdict_conn.sql('SHOW SCHEMAS')
Close Connection¶
pyverdict.VerdictContext@close()
pyverdict
will close all connections automatically when the python process exits. You can also call this method to close a connection manually.
verdict_conn.close()
Data Type Conversion Rules¶
Here lists the conversion rules pyverdict
uses for different databases. Refer to the pyverdict
source code for more details.
MySQL¶
'bit' => boolean, 'tinyint' => int, 'bool' => boolean, 'smallint' => int, 'medimumInteger' => int, 'int' => int, 'integer => int, 'bigint' => int, 'decimal' => decimal.Decimal, 'dec' => decimal.Decimal, 'real' => float, 'double' => float, 'doubleprecision' => float, 'date' => JavaObject => datetime.date, 'datetime' => JavaObject => datetime.datetime, 'timestamp' => JavaObject => datetime.datetime, 'time' => JavaObject => datetime.timedelta, 'year(2)' => JavaObject => datetime.date, 'year(4)' => JavaObject => datetime.date, 'char' => str, 'varchar' => str, 'binary' => bytes, 'varbinary' => bytes, 'tinyblob' => bytes, 'tinytext' => str, 'blob' => bytes, 'text' => str, 'mediumBlob' => bytes, 'medimumText' => str, 'longBlob' => bytes, 'longText' => str, 'enumCol' => str, 'setCol' => str
PostgreSQL:¶
'bigint' => int, 'bigserial' => int, 'bit' => bool => str, 'varbit' => JavaObject => str, 'boolean' => bool, 'box' => JavaObject => str, 'bytea' => bytes => memoryview, 'char' => str, 'varchar' => str, 'cidr' => JavaObject => str, 'circle' => JavaObject => str, 'date' => JavaObject => datetime.date, 'float8' => float, 'inet' => JavaObject => str, 'integer' => int, 'json' => JavaObject => dict, 'line' => JavaObject => str, 'lseg' => JavaObject => str, 'macaddr' => JavaObject => str, 'macaddr8' => JavaObject => str, 'money' => float => str, 'numeric' => decimal.Decimal, 'path' => JavaObject => str, 'point' => JavaObject => str, 'polygon' => JavaObject => str, 'real' => float, 'smallint' => int, 'smallserial' => int, 'serial' => int, 'text' => str, 'time' => JavaObject => datetime.time, 'timestamp' => JavaObject => datetime.datetime, 'uuid' => JavaObject => str, 'xml' => JavaObject => str, 'bit' => JavaObject => str, 'int8' => int, 'bool' => bool, 'character' => str, 'character' => str, 'int' => int, 'int4' => int, 'double' => float, 'decimal' => decimal.Decimal, 'float' => float, 'int2' => int, 'serial2' => int, 'serial4' => int, 'timetz' => JavaObject => datetime.time, 'timestamptz' => JavaObject => datetime.datetime, 'serial8' => int
Redshift:¶
BIGINT => int, BOOLEAN => bool, BOOL => bool, BPCHAR => str, CHAR => str, CHARACTER => str, CHARACTER VARYING => str, DATE => JavaObject => datetime.datetime, DECIMAL => decimal.Decimal, DOUBLE PRECISION => float, FLOAT => float, FLOAT4 => float, FLOAT8 => float, INT => int, INT2 => int, INT4 => int, INTEGER => int, NCHAR => str, NUMERIC => decimal.Decimal, NVARCHAR => str, REAL => float, SMALLINT => int, TEXT => str, TIMESTAMP => JavaObject => datetime.datetime, TIMESTAMP WITHOUT TIME ZONE => JavaObject => datetime.datetime, TIMESTAMPTZ => JavaObject => datetime.datetime, TIMESTAMP WITH TIME ZONE => JavaObject => datetime.datetime, VARCHAR => str
Impala¶
BIGINT => int, BOOLEAN => bool, CHAR => str, DECIMAL => decimal.Decimal, DOUBLE => float, FLOAT => float, REAL => float, SMALLINT => int, STRING => str, TIMESTAMP => datetime.datetime, TINYINT => int, VARCHAR => str
Presto¶
'tinyint' => int 'boolean' => int 'smallint' => int 'integer' => int 'bigint' => int 'decimal' => float 'real' => float 'double' => float 'date' => str 'timestamp' => str 'char' => str 'varchar' => str