datapyground.sql.planner
Manages creation of a query plan from a parsed SQL query.
The SQLQueryPlanner class is responsible for creating a compute engine
query plan from the AST of the parsed SQL query.
Example:
>>> import pyarrow as pa
>>> sales_table = pa.table({"Product": ["Videogame", "Laptop", "Videogame"], "Quantity": [2, 1, 3], "Price": [50, 1000, 60]})
>>>
>>> from datapyground.sql.parser import Parser
>>> from datapyground.sql.planner import SQLQueryPlanner
>>> sql = "SELECT Product, Quantity, Price, Quantity*Price AS Total FROM sales WHERE Product='Videogame' OR Product='Laptop'"
>>> query = Parser(sql).parse()
>>> planner = SQLQueryPlanner(query, catalog={"sales": sales_table})
>>> str(planner.plan())
"ProjectNode(select=[], project={'Product': ColumnRef(sales.Product), 'Quantity': ColumnRef(sales.Quantity), 'Price': ColumnRef(sales.Price), 'Total': pyarrow.compute.multiply(ColumnRef(sales.Quantity),ColumnRef(sales.Price))}, child=FilterNode(filter=pyarrow.compute.or_(pyarrow.compute.equal(ColumnRef(sales.Product),Literal(<pyarrow.StringScalar: 'Videogame'>)),pyarrow.compute.equal(ColumnRef(sales.Product),Literal(<pyarrow.StringScalar: 'Laptop'>))), child=ProjectNode(select=[], project={'sales.Product': ColumnRef(Product), 'sales.Quantity': ColumnRef(Quantity), 'sales.Price': ColumnRef(Price)}, child=PyArrowTableDataSource(columns=['Product', 'Quantity', 'Price'], rows=3))))"
Classes
- class datapyground.sql.planner.SQLQueryPlanner(query: dict, catalog: dict[str, str | Table] | None = None)
Create a compute engine query plan from a parsed SQL query.
- Parameters:
query – The parsed SQL query AST, as returned by datapyground.sql.Parser.
catalog – An optional dictionary mapping table names to file paths or pyarrow tables. If not provided, the planner guesses the tables from files in the current directory.
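The catalog lookup with a fallback to the current directory can be pictured with the sketch below. It is only an illustration: `resolve_table`, `SUPPORTED_SUFFIXES`, and the `search_dir` argument are hypothetical names that are not part of datapyground, and the file types it probes are an assumption rather than the planner's actual rules.

```python
from pathlib import Path

# Hypothetical helper, NOT part of datapyground: one possible way to resolve
# a table name against an optional catalog, then fall back to guessing.
SUPPORTED_SUFFIXES = (".csv", ".parquet")  # assumed file types


def resolve_table(name, catalog=None, search_dir="."):
    """Return the catalog entry for ``name``, or a guessed file path."""
    if catalog is not None and name in catalog:
        # The entry may be a file path (str) or an in-memory table object.
        return catalog[name]
    # No catalog entry: look for "<name><suffix>" inside search_dir.
    for suffix in SUPPORTED_SUFFIXES:
        candidate = Path(search_dir) / f"{name}{suffix}"
        if candidate.is_file():
            return str(candidate)
    raise LookupError(f"no table named {name!r} in catalog or {search_dir!r}")
```

Passing an explicit catalog, as the example at the top of this page does, avoids any dependence on the working directory.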
- plan() → QueryPlanNode
Generate a query plan from the parsed SQL query.
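The plan string in the example at the top of this page is easier to read as a chain of nodes linked through `child=`. The sketch below mirrors that nesting with a stand-in `Node` dataclass (hypothetical, not the real `QueryPlanNode` classes) and lists the nodes from the data source upward, assuming rows flow from the innermost child to the outermost node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Stand-in for a plan node; only the name and the child link are kept."""
    name: str
    child: Optional["Node"] = None


def data_flow_order(root):
    """Walk the child links from the root and return node names leaf-first."""
    chain = []
    node = root
    while node is not None:
        chain.append(node.name)
        node = node.child
    return list(reversed(chain))


# Same nesting as the plan printed in the example: the outer projection wraps
# the filter, which wraps the column-qualifying projection, which wraps the
# table data source.
plan = Node(
    "ProjectNode",
    Node("FilterNode", Node("ProjectNode", Node("PyArrowTableDataSource"))),
)
print(" -> ".join(data_flow_order(plan)))
```

Read this way, the example plan loads the table, renames its columns to table-qualified names such as `sales.Product`, applies the `WHERE` filter, and finally projects the selected columns together with the computed `Total`.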