How to use Spark SQL to run a recursive query


I'm trying to use Spark SQL to recursively query a hierarchical dataset and identify the root parent of all the nested children.

I've tried using a self-join, but it only works for one level.

Any ideas or pointers?



You can use a GraphX-based solution to perform recursive (parent/child or hierarchical) queries. Many databases provide this functionality through recursive common table expressions (CTEs) or the CONNECT BY SQL clause.

See this article for more information: <a href="https://www.qubole.com/blog/processing-hierarchical-data-using-spark-graphx-pregel-api/" rel="nofollow">https://www.qubole.com/blog/processing-hierarchical-data-using-spark-graphx-pregel-api/</a>
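
As a rough illustration of the GraphX approach, here is a minimal sketch using the Pregel API. It is not the code from the linked article. The sample rows, the (id, parentId) layout, and the names RootFinder and NoMsg are all assumptions made up for this example, and it assumes the hierarchy is a forest (each node has at most one parent, and there are no cycles). Each vertex starts with itself as its root, and every superstep pushes the parent's current root down to its children until nothing changes.

```scala
import org.apache.spark.graphx.{Edge, EdgeDirection, Graph}
import org.apache.spark.sql.SparkSession

// Hypothetical example object; the name and the data are for illustration only.
object RootFinder {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("root-finder")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Assumed input shape: (id, parentId). A root has no parent.
    val rows = Seq[(Long, Option[Long])](
      (1L, None), (2L, Some(1L)), (3L, Some(2L)), (4L, Some(2L)),
      (5L, None), (6L, Some(5L))
    )

    // Vertex attribute = best-known root so far; every node starts as its own root.
    val vertices = sc.parallelize(rows.map { case (id, _) => (id, id) })
    // One edge per parent -> child link.
    val edges = sc.parallelize(rows.collect { case (id, Some(p)) => Edge(p, id, ()) })
    val graph = Graph(vertices, edges)

    // Sentinel for the initial superstep, where no real message exists yet.
    // Assumes -1 is never a real id.
    val NoMsg = -1L

    val resolved = graph.pregel(NoMsg, activeDirection = EdgeDirection.Out)(
      // Adopt the root received from the parent, if any.
      (_, currentRoot, msg) => if (msg == NoMsg) currentRoot else msg,
      // Push the parent's root to the child while the two still disagree.
      triplet =>
        if (triplet.srcAttr != triplet.dstAttr) Iterator((triplet.dstId, triplet.srcAttr))
        else Iterator.empty,
      // A node has at most one parent, so at most one message arrives; keep it.
      (a, _) => a
    )

    resolved.vertices.collect().sortBy(_._1).foreach { case (id, root) =>
      println(s"$id -> $root")
    }

    spark.stop()
  }
}
```

With the sample rows above, nodes 1 through 4 should resolve to root 1, and nodes 5 and 6 to root 5. The number of supersteps grows with the depth of the hierarchy. If the data might contain cycles, pass a maxIterations value to pregel so the loop is guaranteed to stop.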


