How to use Spark SQL to run a recursive query

Question:

I'm trying to use Spark SQL to recursively query a hierarchical dataset and identify the root parent of all the nested children.

I've tried using a self-join, but it only works for one level.

Any ideas or pointers?

Thanks

Answer1:

You can use a GraphX-based solution to perform recursive queries (parent/child or hierarchical queries). Many databases provide this functionality through recursive common table expressions (CTEs) or the CONNECT BY SQL clause.

See this article for more information: https://www.qubole.com/blog/processing-hierarchical-data-using-spark-graphx-pregel-api/
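
To make the idea concrete, here is a minimal sketch of the logic behind resolving every node to its root. It is plain Python, not Spark or GraphX code, and the function name `find_roots` and the sample data are made up for illustration. Each pass of the loop does roughly what one more self-join on the parent column would do, and the loop repeats until nothing changes, which is why a single self-join only resolves one level.

```python
def find_roots(parent_of):
    """Map every node to the root of its hierarchy.

    parent_of: dict of node -> parent, with None as the parent of a root.
    (Hypothetical helper for illustration; not part of any Spark API.)
    """
    # Start with each node pointing at its parent, or at itself if it is a root.
    ancestor = {n: (p if p is not None else n) for n, p in parent_of.items()}

    # The pass count is capped so that cyclic input still terminates.
    for _ in range(len(ancestor)):
        # One pass, comparable to one self-join: replace each node's
        # current ancestor with that ancestor's own ancestor.
        nxt = {n: ancestor.get(a, a) for n, a in ancestor.items()}
        if nxt == ancestor:  # fixed point reached, every node is resolved
            break
        ancestor = nxt
    return ancestor


# Hypothetical sample hierarchy: a <- b <- c <- d, and x <- y
sample = {"a": None, "b": "a", "c": "b", "d": "c", "x": None, "y": "x"}
roots = find_roots(sample)
```

In this sample, `roots["d"]` resolves to `"a"` and `roots["y"]` to `"x"`. In Spark the same loop could presumably be written with DataFrame joins, but a graph-oriented approach such as the one in the linked article is likely a better fit for deep hierarchies.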
