(持续更新)Databend SDK 接入最佳实践&常见问题

本文档包含 Databend 支持的所有 SDK 接入的最佳实践以及在使用过程中可能会出现的问题以及对应的解决方案。希望可以成为用户接入 Databend 的一把金钥匙,打开通向 Databend 的大门。

Databend JDBC

创建 JDBC Connection

1
2
3
4
5
private Connection createConnection()
throws SQLException {
String url = "jdbc:databend://localhost:8000";
return DriverManager.getConnection(url, "databend", "databend"); // user, password
}

建表

1
2
Connection c = createConnection();
c.createStatement().execute("create table test_basic_driver.table1(i int)");

单条插入数据

1
2
3
Connection c = createConnection();
c.createStatement().execute("create table test_basic_driver.table1(i int, j varchar)");
c.createStatement().execute("insert into test_basic_driver.table1 values(1,'j')");

单条插入无法发挥 databend 的性能,写入性能较差,只能作为测试使用。推荐使用批量插入(Batch Insert)

批量插入

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
public void BatchInsert() throws SQLException {
Connection c = createConnection();
c.setAutoCommit(false);

c.createStatement().execute("create table test_basic_driver.test_prepare_statement(i int, j varchar)");

PreparedStatement ps = c.prepareStatement("insert into test_prepare_statement values");
ps.setInt(1, 1);
ps.setString(2, "a");
ps.addBatch();
ps.setInt(1, 2);
ps.setString(2, "b");
ps.addBatch();
System.out.println("execute batch insert");
int[] ans = ps.executeBatch();
ps.close();
Statement statement = c.createStatement();

System.out.println("execute select");
statement.execute("SELECT * from test_prepare_statement");
ResultSet r = statement.getResultSet();
while (r.next()) {
System.out.println(r.getInt(1));
System.out.println(r.getString(2));
}
statement.close();
c.close();
}

在实际使用中,推荐增大 Batch size 到 10w~100w 之间

连接参数

https://github.com/datafuselabs/databend-jdbc/blob/main/docs/Connection.md

常见问题

Q: Upload to stream failed 报错

A: 检查 Client 到 OSS 的网络情况

Q: Spring boot 等项目 Slf4j provider not found

A: 这可能是引入的 slf4j 包产生了冲突,检查 pom 中 slf4j 版本是否一致

Q: 如何将 NULL 写入表

A: ps.setNull(index, Types.NULL)

Q: Spring boot JDBCTemplate getParameterType not implement

A: getParameterType 在 databend JDBC 中目前还没有实现,已在开发计划中

Golang SDK

创建 sql.db client

1
2
3
4
5
6
7
8
9
10
11
12
import (
"database/sql"
_ "github.com/datafuselabs/databend-go"
)
func main() {
db, err := sql.Open("databend", dsn)
if err != nil {
return err
}
db.Ping()
db.Close()
}

执行 SQL

1
2
3
4
5
6
7
8
9
10
conn, err := sql.Open("databend", dsn)
if err != nil {
fmt.Println(err)
}
conn.Exec(`DROP TABLE IF EXISTS data`)
_, err = conn.Exec(` CREATE TABLE IF NOT EXISTS data( Col1 TINYINT, Col2 VARCHAR )`)
if err != nil {
fmt.Println(err)
}
_, err = conn.Exec("INSERT INTO data VALUES (1, 'test-1')")

批量插入

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
func main() {
conn, err := sql.Open("databend", "http://databend:databend@localhost:8000/default?sslmode=disable")
tx, err := conn.Begin()
if err != nil {
fmt.Println(err)
}
batch, err := tx.Prepare(fmt.Sprintf("INSERT INTO %s VALUES", "test"))
for i := 0; i < 10; i++ {
_, err = batch.Exec(
"1234",
"2345",
"3.1415",
"test",
"test2",
"[4, 5, 6]",
"[1, 2, 3]",
"2021-01-01",
"2021-01-01 00:00:00",
)
}
err = tx.Commit()
}

请求单行/多行 (Querying Row/s)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
func main() {
// create table data (col1 uint8, col2 string);
// insert into data values(1,'col2');conn, err := sql.Open("databend", "http://databend:databend@localhost:8000/default?sslmode=disable")
if err != nil {
fmt.Println(err)
}
row := conn.QueryRow("SELECT * FROM data")
var (
col1 uint8col2 string
)
if err := row.Scan(&col1, &col2); err != nil {
fmt.Println(err)
}
fmt.Println(col2)
}

Python SDK

databend-py

创建 client
1
c = Client.from_url("http://user:password@host:port/db?secure=false")

Databend py DSN 中支持的参数可以参考: https://github.com/datafuselabs/databend-py/blob/main/docs/connection.md

建表&查询
1
2
3
4
5
c = Client.from_url(databend_url)
c.execute('CREATE TABLE if not exists test (x Int32,y VARCHAR)')
t, r = c.execute('INSERT INTO test VALUES', [(3, 'aa'),(4,'bb')],with_column_types=True)
# execute 返回值有两个,第一个是(column_name, column_type):[('a', 'Int32'), ('b', 'String')], 只有当 with_column_types=True 的时候才返回,默认为 False;
# 第二个返回值是数据集: [(1, 'a'), (2, 'b')]
批量插入
1
2
3
4
5
6
7
8
c = Client.from_url(databend_url)
c.execute('DROP TABLE IF EXISTS test')
c.execute('CREATE TABLE if not exists test (x Int32,y VARCHAR)')
c.execute('DESC test')
_, r1 = c.execute('INSERT INTO test VALUES', [(3, 'aa'), (4, 'bb')])
assertEqual(r1, 2)
_, ss = c.execute('select * from test')
assertEqual(ss, [(3, 'aa'), (4, 'bb')])
Upload data to stage

databend py 可以直接将 python slice 数据以 csv 格式导入到 databend stage

1
2
3
4
from databend_py import Client
client = Client.from_url(databend_url)
stage_path = client.upload_to_stage('@~', "upload.csv", [(1, 'a'), (1, 'b')])
# stage_path is @~/upload.csv
Upload file to stage

Databend py 也可以直接将文件上传到 stage

1
2
3
4
5
from databend_py import Client
client = Client.from_url(self.databend_url)
with open("upload.csv", "rb") as f:
stage_path = client.upload_to_stage('@~', "upload.csv", f)
print(stage_path)

Databend sqlalchemy

创建 sqlalchemy connect 并查询
1
2
3
4
5
6
7
8
9
from sqlalchemy import create_engine, text
from sqlalchemy.engine.base import Connection, Engine

engine = create_engine(
f"databend://{username}:{password}@{host_port_name}/{database_name}?sslmode=disable"
)
connection = engine.connect()
result = connection.execute(text("SELECT 1"))
print(result.fetchall())

databend-sqlalchemy 在版本<v0.4.0 使用[databend-py](https://github.com/datafuselabs/databend-py)时作为内部 Driver,>= v0.4.0 时使用 databend driver python binding作为内部驱动程序。两者之间的唯一区别是,DSN中提供的连接参数不同。使用相应的版本时应该参考相应驱动程序提供的连接参数。

Bendsql

Exec

1
2
3
4
5
6
7
8
9
10
11
12
13
14
use databend_driver::Client;

let dsn = "databend://root:@localhost:8000/default?sslmode=disable".to_string();
let client = Client::new(dsn);
let conn = client.get_conn().await.unwrap();

let sql_create = "CREATE TABLE books (
title VARCHAR,
author VARCHAR,
date Date
);";
conn.exec(sql_create).await.unwrap();
let sql_insert = "INSERT INTO books VALUES ('The Little Prince', 'Antoine de Saint-Exupéry', '1943-04-06');";
conn.exec(sql_insert).await.unwrap();

请求单行数据

1
2
3
let row = conn.query_row("SELECT * FROM books;").await.unwrap();
let (title,author,date): (String,String,i32) = row.unwrap().try_into().unwrap();
println!("{} {} {}", title, author, date);

请求多行

1
2
3
4
5
let mut rows = conn.query_iter("SELECT * FROM books;").await.unwrap();
while let Some(row) = rows.next().await {
let (title,author,date): (String,String,chrono::NaiveDate) = row.unwrap().try_into().unwrap();
println!("{} {} {}", title, author, date);
}

DSN 参数

参考:https://github.com/datafuselabs/bendsql/blob/main/README.md#dsn


-------------The End-------------

本文标题:(持续更新)Databend SDK 接入最佳实践&常见问题

文章作者:cloud sjhan

发布时间:2024年01月06日 - 17:01

最后更新:2024年01月06日 - 20:01

原始链接:https://cloudsjhan.github.io/2024/01/06/Databend-SDK-接入最佳实践-常见问题/

许可协议: 署名-非商业性使用-禁止演绎 4.0 国际 转载请保留原文链接及作者。

cloud sjhan wechat
subscribe to my blog by scanning my public wechat account
坚持原创技术分享,您的支持将鼓励我继续创作!
0%
;