unitxt.sql_utils module

class unitxt.sql_utils.DatabaseConnector(db_config: SQLDatabase)

Bases: ABC

Abstract base class for database connectors.

abstract execute_query(query: str) -> Any

Abstract method to execute a query against the database.

abstract get_table_schema() -> str

Abstract method to get database schema.
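The abstract-base-class pattern described above can be sketched with the standard library alone. The names `SketchConnector` and `EchoConnector` are hypothetical stand-ins, not unitxt classes, and a plain dict stands in for `SQLDatabase`; this only illustrates how the two abstract methods constrain subclasses.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchConnector(ABC):
    """Hypothetical stand-in for an abstract connector (not the unitxt class)."""

    def __init__(self, db_config: dict):
        # Assumption: the config is treated as an opaque object here.
        self.db_config = db_config

    @abstractmethod
    def get_table_schema(self) -> str:
        """Return a textual description of the database schema."""

    @abstractmethod
    def execute_query(self, query: str) -> Any:
        """Run a query and return its result."""


class EchoConnector(SketchConnector):
    """Toy subclass that implements both abstract methods."""

    def get_table_schema(self) -> str:
        return "CREATE TABLE t (x INTEGER);"

    def execute_query(self, query: str) -> Any:
        # No database behind this toy class; it just echoes the query.
        return [("echo", query)]
```

Instantiating `SketchConnector` directly raises `TypeError`, because both methods are abstract; only a subclass that overrides both, such as `EchoConnector`, can be created.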

class unitxt.sql_utils.InMemoryDatabaseConnector(db_config: SQLDatabase)

Bases: DatabaseConnector

Database connector for mocking databases with in-memory data structures.

execute_query(query: str) -> Any

Simulates executing a query against the mock database.

get_table_schema(select_tables: List[str] | None = None) -> str

Generates a mock schema from the tables structure.
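One way to mock a database from in-memory data is to load the tables into a throwaway SQLite database and query that. The sketch below is an assumption about how such a connector could work, not a description of the unitxt implementation; the `TABLES` layout and the helper names `mock_schema` and `run_on_mock` are hypothetical.

```python
import sqlite3
from typing import Any, Dict, List

# Hypothetical layout: table name -> column names and row values.
TABLES: Dict[str, Dict[str, list]] = {
    "users": {"columns": ["id", "name"], "rows": [[1, "Ada"], [2, "Lin"]]},
}


def mock_schema(tables: Dict[str, Dict[str, list]]) -> str:
    """Build one CREATE TABLE line per table from the column names."""
    return "\n".join(
        f"CREATE TABLE {name} ({', '.join(spec['columns'])});"
        for name, spec in tables.items()
    )


def run_on_mock(tables: Dict[str, Dict[str, list]], query: str) -> List[Any]:
    """Load the tables into an in-memory SQLite database and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        for name, spec in tables.items():
            columns = ", ".join(spec["columns"])
            conn.execute(f"CREATE TABLE {name} ({columns})")
            placeholders = ", ".join("?" for _ in spec["columns"])
            conn.executemany(
                f"INSERT INTO {name} VALUES ({placeholders})", spec["rows"]
            )
        return conn.execute(query).fetchall()
    finally:
        conn.close()
```

Table and column names are interpolated directly into the SQL here, which is acceptable only for trusted test fixtures.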

class unitxt.sql_utils.LocalSQLiteConnector(db_config: SQLDatabase)

Bases: DatabaseConnector

Database connector for SQLite databases.

download_database(db_id)

Downloads the database from Hugging Face if needed.

execute_query(query: str) -> Any

Executes a query against the SQLite database.

get_db_file_path(db_id)

Gets the local path of a downloaded database file.

get_table_schema() -> str

Extracts schema from an SQLite database.
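SQLite keeps the original `CREATE` statements in its `sqlite_master` catalog table, so a schema string can be read back with one query. The helper name `sqlite_schema` is hypothetical, and the output format is an assumption for illustration; the connector's actual format may differ.

```python
import sqlite3


def sqlite_schema(conn: sqlite3.Connection) -> str:
    """Return the CREATE TABLE statements of all tables, ordered by name."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master "
        "WHERE type = 'table' AND sql IS NOT NULL "
        "ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] + ";" for row in rows)


# Demo on an in-memory database so the example needs no file on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
schema = sqlite_schema(conn)
conn.close()
```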

class unitxt.sql_utils.RemoteDatabaseConnector(db_config: SQLDatabase)

Bases: DatabaseConnector

Database connector for remote databases accessed via HTTP.

execute_query(query: str) -> Any

Executes a query against the remote database, with retries for certain exceptions.

get_table_schema() -> str

Retrieves the schema of a database.

unitxt.sql_utils.collect_clause(statement, clause_keyword)

Parse SQL statement and collect clauses.

unitxt.sql_utils.execute_query_local(db_path: str, query: str) -> Any

Executes a query against the SQLite database.

unitxt.sql_utils.execute_query_remote(api_url: str, database_id: str, api_key: str, query: str, retryable_exceptions: tuple = (requests.exceptions.ConnectionError, requests.exceptions.ReadTimeout), max_retries: int = 3, retry_delay: int = 5, timeout: int = 30) -> Tuple[Optional[dict], str]

Executes a query against the remote database, with retries for certain exceptions.
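The retry behaviour implied by `retryable_exceptions`, `max_retries`, and `retry_delay` can be sketched without any HTTP library. The helper `call_with_retries` is hypothetical, and the choice to return a `(result, error_message)` pair is an assumption made to mirror the two-element return type in the signature above; the real function's semantics may differ.

```python
import time
from typing import Callable, Optional, Tuple


def call_with_retries(
    send: Callable[[], dict],
    retryable_exceptions: tuple = (ConnectionError, TimeoutError),
    max_retries: int = 3,
    retry_delay: float = 5.0,
) -> Tuple[Optional[dict], str]:
    """Call send() up to max_retries times, retrying only on listed exceptions."""
    last_error = ""
    for attempt in range(1, max_retries + 1):
        try:
            return send(), ""
        except retryable_exceptions as exc:
            last_error = f"attempt {attempt} failed: {exc}"
            # Sleep between attempts, but not after the final one.
            if attempt < max_retries:
                time.sleep(retry_delay)
    return None, last_error
```

Exceptions outside `retryable_exceptions` propagate immediately in this sketch, so a malformed request is not retried.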

unitxt.sql_utils.extract_select_columns(statement)

Parse SQL using sqlparse and extract columns.

unitxt.sql_utils.extract_select_info(sql: str)

Parse SQL using sqlparse and return a dict of extracted columns and clauses.

unitxt.sql_utils.get_db_connector(db_type: str)

Creates and returns the appropriate DatabaseConnector instance based on db_type.

unitxt.sql_utils.is_sqlglot_parsable(sql: str, db_type='sqlite') -> bool

Returns True if sqlglot does not encounter any error, False otherwise.

unitxt.sql_utils.is_sqlparse_parsable(sql: str) -> bool

Returns True if sqlparse does not encounter any error, False otherwise.

unitxt.sql_utils.sql_exact_match(sql1: str, sql2: str) -> bool

Return True if two SQL strings match after very basic normalization.
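What "very basic normalization" could look like is sketched below. The exact rules the library applies are not stated here, so the steps chosen (collapse whitespace, drop a trailing semicolon, lowercase) are assumptions, and `normalize_sql` and `naive_exact_match` are hypothetical names.

```python
import re


def normalize_sql(sql: str) -> str:
    """Collapse whitespace, drop a trailing semicolon, and lowercase."""
    collapsed = re.sub(r"\s+", " ", sql).strip()
    return collapsed.rstrip(";").strip().lower()


def naive_exact_match(sql1: str, sql2: str) -> bool:
    """Compare two SQL strings after the normalization above."""
    return normalize_sql(sql1) == normalize_sql(sql2)
```

Lowercasing the whole string also lowercases string literals, so `WHERE name = 'Ada'` and `WHERE name = 'ADA'` would compare equal under this sketch; a stricter comparison needs a real parser.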

unitxt.sql_utils.sqlglot_optimized_equivalence(expected: str, generated: str) -> int

Checks whether two SQL queries are equivalent by parsing them with SQLGlot, without executing them.

unitxt.sql_utils.sqlglot_parsed_queries_equivalent(sql1: str, sql2: str, dialect: str = '') -> bool

unitxt.sql_utils.sqlparse_queries_equivalent(sql1: str, sql2: str) -> bool

Return True if both SQL queries are naively considered equivalent.

unitxt.sql_utils.strip_alias(col: str) -> str

Remove any AS alias from a column.
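Removing a trailing `AS` alias can be approximated with a regular expression. This is a sketch under the assumption that the alias is a bare identifier or a quoted name at the end of the column expression; `drop_alias` is a hypothetical name, and the library's own handling may differ.

```python
import re

# Match " AS alias" only at the end of the string, where alias is a bare
# identifier, a double-quoted name, or a backtick-quoted name.
_ALIAS = re.compile(r'\s+AS\s+(?:\w+|"[^"]+"|`[^`]+`)\s*$', re.IGNORECASE)


def drop_alias(col: str) -> str:
    """Return the column expression without a trailing AS alias."""
    return _ALIAS.sub("", col).strip()
```

Anchoring the pattern at the end of the string keeps an inner `AS`, as in `CAST(x AS INT)`, intact, because the closing parenthesis prevents a match there.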