Discover data lineage from sub-queries and CTEs
SQL’s flexibility allows multiple approaches to achieve the same result. Below, three variations of a SQL statement update the PARENT_TABLE column in the CCF table based on values from the PARENT_TABLE column in the CCF_BAK table.
- Using UPDATE with a single sub-query.
- Using UPDATE with two sub-queries.
- Using a CTE (Common Table Expression).
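SQLFlow can accurately map the data lineage for all three approaches: in each case it resolves the update to a column-level dependency from dwh_user.jv.CCF_BAK.PARENT_TABLE to dwh_user.jv.CCF.PARENT_TABLE.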
-- sql server
UPDATE s
SET s.PARENT_TABLE = u.PARENT_TABLE
FROM dwh_user.jv.CCF s
JOIN (
    SELECT TOP 1 *
    FROM dwh_user.jv.CCF_BAK u
    WHERE u.PRODUCT_NAME = 'Business loan'
) u ON 1 = 1
WHERE s.PRODUCT_NAME = 'Business loan'
Data lineage of the first SQL statement, in XML format, as generated by SQLFlow:
<relationship id="10" type="fdd" effectType="update" processId="5" processType="sstupdate">
    <target id="19" column="PARENT_TABLE" parent_id="4" parent_name="dwh_user.jv.CCF" parent_alias="s"/>
    <source id="13" column="PARENT_TABLE" parent_id="11" parent_name="dwh_user.jv.CCF_BAK" parent_alias="u"/>
</relationship>
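To read this fragment programmatically, the relationship can be reduced to a single source-to-target column edge. The following is a minimal sketch in Python using only the standard library. The helper name `lineage_edges` is hypothetical and not part of SQLFlow, and the sketch assumes the fragment is available as a string copied from the output above.

```python
import xml.etree.ElementTree as ET

# Lineage fragment copied from the SQLFlow output for the first statement.
FRAGMENT = """
<relationship id="10" type="fdd" effectType="update" processId="5" processType="sstupdate">
<target id="19" column="PARENT_TABLE" parent_id="4" parent_name="dwh_user.jv.CCF" parent_alias="s"/>
<source id="13" column="PARENT_TABLE" parent_id="11" parent_name="dwh_user.jv.CCF_BAK" parent_alias="u"/>
</relationship>
"""


def lineage_edges(xml_text):
    """Return (source, target) pairs as fully qualified column names.

    Hypothetical helper for illustration; it only reads the attributes
    that appear in the fragment above (parent_name and column).
    """
    rel = ET.fromstring(xml_text.strip())
    target = rel.find("target")
    target_col = f'{target.get("parent_name")}.{target.get("column")}'
    return [
        (f'{src.get("parent_name")}.{src.get("column")}', target_col)
        for src in rel.findall("source")
    ]


for source, target in lineage_edges(FRAGMENT):
    print(f"{source} -> {target}")
```

The sub-query alias `u` appears only as `parent_alias`; the `parent_name` attribute already points at the underlying table, which is why the edge can be built without resolving the alias.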
-- sql server
UPDATE s
SET s.PARENT_TABLE = u.PARENT_TABLE
FROM (
    SELECT *
    FROM dwh_user.jv.CCF s
    WHERE s.PRODUCT_NAME = 'Business loan'
) s
JOIN (
    SELECT TOP 1 *
    FROM dwh_user.jv.CCF_BAK u
    WHERE u.PRODUCT_NAME = 'Business loan'
) u ON 1 = 1
Data lineage of the second SQL statement, in XML format, as generated by SQLFlow:
<relationship id="12" type="fdd" effectType="update" processId="5" processType="sstupdate">
    <target id="24" column="PARENT_TABLE" parent_id="4" parent_name="dwh_user.jv.CCF" parent_alias="s"/>
    <source id="18" column="PARENT_TABLE" parent_id="16" parent_name="dwh_user.jv.CCF_BAK" parent_alias="u"/>
</relationship>
-- sql server
WITH
s AS (
    SELECT PARENT_TABLE
    FROM dwh_user.jv.CCF s
    WHERE s.PRODUCT_NAME = 'Business loan'
),
u AS (
    SELECT TOP 1 PARENT_TABLE
    FROM dwh_user.jv.CCF_BAK u
    WHERE u.PRODUCT_NAME = 'Business loan'
)
UPDATE s
SET s.PARENT_TABLE = u.PARENT_TABLE
FROM s
JOIN u ON 1 = 1
Data lineage of the third SQL statement, in XML format, as generated by SQLFlow:
<relationship id="19" type="fdd" effectType="update" processId="5" processType="sstupdate">
    <target id="10" column="PARENT_TABLE" parent_id="4" parent_name="dwh_user.jv.CCF" parent_alias="s"/>
    <source id="24" column="PARENT_TABLE" parent_id="23" parent_name="dwh_user.jv.CCF_BAK" parent_alias="u"/>
</relationship>
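The three relationships differ only in their internal ids; the tables and columns they name are the same. The sketch below checks this in Python with the standard library, under the assumption that the three fragments are pasted in as strings. The names `RELATIONSHIPS`, `edge`, and `distinct_edges` are hypothetical and exist only for this illustration.

```python
import xml.etree.ElementTree as ET

# The three lineage fragments from this page, keyed by the SQL variant used.
RELATIONSHIPS = {
    "single sub-query": """
<relationship id="10" type="fdd" effectType="update" processId="5" processType="sstupdate">
<target id="19" column="PARENT_TABLE" parent_id="4" parent_name="dwh_user.jv.CCF" parent_alias="s"/>
<source id="13" column="PARENT_TABLE" parent_id="11" parent_name="dwh_user.jv.CCF_BAK" parent_alias="u"/>
</relationship>""",
    "two sub-queries": """
<relationship id="12" type="fdd" effectType="update" processId="5" processType="sstupdate">
<target id="24" column="PARENT_TABLE" parent_id="4" parent_name="dwh_user.jv.CCF" parent_alias="s"/>
<source id="18" column="PARENT_TABLE" parent_id="16" parent_name="dwh_user.jv.CCF_BAK" parent_alias="u"/>
</relationship>""",
    "CTE": """
<relationship id="19" type="fdd" effectType="update" processId="5" processType="sstupdate">
<target id="10" column="PARENT_TABLE" parent_id="4" parent_name="dwh_user.jv.CCF" parent_alias="s"/>
<source id="24" column="PARENT_TABLE" parent_id="23" parent_name="dwh_user.jv.CCF_BAK" parent_alias="u"/>
</relationship>""",
}


def edge(xml_text):
    """Reduce one relationship to (source column, target column, effect type)."""
    rel = ET.fromstring(xml_text.strip())
    src = rel.find("source")
    tgt = rel.find("target")
    return (
        f'{src.get("parent_name")}.{src.get("column")}',
        f'{tgt.get("parent_name")}.{tgt.get("column")}',
        rel.get("effectType"),
    )


def distinct_edges(relationships):
    """Collect the set of edges, ignoring the per-statement ids."""
    return {edge(xml_text) for xml_text in relationships.values()}


for variant, xml_text in RELATIONSHIPS.items():
    source, target, effect = edge(xml_text)
    print(f"{variant}: {source} -> {target} ({effect})")

# A single distinct edge means all three variants yield the same lineage.
print(len(distinct_edges(RELATIONSHIPS)))
```

Comparing on `parent_name` and `column` rather than on `id` or `parent_id` is a deliberate choice here, since the ids in the fragments above change from one statement to the next.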