DataFrameTransformer
laktory.models.dataframe.DataFrameTransformer
Bases: BaseModel, PipelineChild
A chain of transformations to be applied to a DataFrame. Transformations can be SQL- or DataFrame API-based.
Examples:
import polars as pl
import laktory as lk
df0 = pl.DataFrame(
    {
        "id": ["a", "b", "c"],
        "x1": [1, 2, 3],
    }
)
node0 = lk.models.DataFrameMethod(
    func_name="with_columns",
    func_kwargs={
        "y1": "x1",
    },
)
node1 = lk.models.DataFrameExpr(expr="select id, x1, y1 from {df}")
transformer = lk.models.DataFrameTransformer(nodes=[node0, node1])
df = transformer.execute(df0).collect()
print(df)
'''
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| | id | x1 | y1 | |
| |----|----|----| |
| | a | 1 | 1 | |
| | b | 2 | 2 | |
| | c | 3 | 3 | |
└──────────────────┘
'''
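To make the idea of a chain concrete, here is a minimal, library-free sketch of sequential application. It is not laktory's implementation: the `Rows` type, the two step functions, and `run_chain` are hypothetical names introduced only for illustration, with a list of dicts standing in for a DataFrame.

```python
from functools import reduce
from typing import Callable

# Hypothetical stand-ins: a "frame" is a list of row dicts, a step maps rows to rows.
Rows = list[dict]
Step = Callable[[Rows], Rows]


def add_y1(rows: Rows) -> Rows:
    # Plays the role of the `with_columns` node above: copy x1 into a new column y1.
    return [{**row, "y1": row["x1"]} for row in rows]


def select_columns(rows: Rows) -> Rows:
    # Plays the role of the SQL node above: keep id, x1 and y1, in that order.
    return [{key: row[key] for key in ("id", "x1", "y1")} for row in rows]


def run_chain(rows: Rows, steps: list[Step]) -> Rows:
    # Feed the output of each step into the next one, in list order.
    return reduce(lambda current, step: step(current), steps, rows)


rows0 = [{"id": "a", "x1": 1}, {"id": "b", "x1": 2}, {"id": "c", "x1": 3}]
result = run_chain(rows0, [add_y1, select_columns])
print(result)
```

Order matters in this sketch: `select_columns` would raise a `KeyError` if it ran before `add_y1`, because `y1` would not exist yet.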
| PARAMETER | DESCRIPTION |
|---|---|
| nodes | List of transformations |
| METHOD | DESCRIPTION |
|---|---|
| execute | Execute transformation nodes on provided DataFrame |
| ATTRIBUTE | DESCRIPTION |
|---|---|
| data_sources | Get all sources feeding the Transformer |
| upstream_node_names | Pipeline node names required to apply transformer |
data_sources
property
Get all sources feeding the Transformer
upstream_node_names
property
Pipeline node names required to apply transformer
execute(df, named_dfs=None)
Execute transformation nodes on provided DataFrame df
| PARAMETER | DESCRIPTION |
|---|---|
| df | Input dataframe |
| named_dfs | Other DataFrame(s) to be passed to the method. DEFAULT: None |
| RETURNS | DESCRIPTION |
|---|---|
| Output dataframe | |
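As a rough illustration of the call shape `execute(df, named_dfs=None)`, the sketch below passes a primary input plus an optional mapping of named auxiliary inputs through each step. This is an assumption-laden stand-in, not laktory's code: the `execute` function here, the `Step` signature, the `join_labels` step, and the `"labels"` key are all hypothetical, and lists of dicts stand in for DataFrames.

```python
from typing import Callable, Optional

Rows = list[dict]
# Hypothetical step signature: primary rows plus a dict of named auxiliary inputs.
Step = Callable[[Rows, dict[str, Rows]], Rows]


def execute(rows: Rows, steps: list[Step], named: Optional[dict[str, Rows]] = None) -> Rows:
    # Default to an empty mapping so every step can look names up safely.
    named = named or {}
    for step in steps:
        rows = step(rows, named)
    return rows


def join_labels(rows: Rows, named: dict[str, Rows]) -> Rows:
    # Left-join the auxiliary "labels" input onto the primary rows by id.
    lookup = {r["id"]: r["label"] for r in named.get("labels", [])}
    return [{**row, "label": lookup.get(row["id"])} for row in rows]


rows0 = [{"id": "a", "x1": 1}, {"id": "b", "x1": 2}]
labels = [{"id": "a", "label": "alpha"}]

with_labels = execute(rows0, [join_labels], named={"labels": labels})
without_labels = execute(rows0, [join_labels])  # no auxiliary inputs supplied
print(with_labels)
print(without_labels)
```

Defaulting the mapping to `None` and normalizing it inside the function avoids a shared mutable default argument, which is the usual Python idiom for optional dict parameters.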
Source code in laktory/models/dataframe/dataframetransformer.py