Scala DataFrame: Explode an array

I am using the Spark libraries in Scala. I have created a DataFrame using:

import org.apache.spark.sql.types._

val searchArr = Array(
  StructField("user", StructType(Array(
    StructField("q1", ArrayType(IntegerType, true), true),
    StructField("q2", ArrayType(IntegerType, true), true)
  )))
)

val searchSt = new StructType(searchArr)    

val searchData = sqlContext.jsonFile(searchPath, searchSt)

I now want to explode the field what.q1, which should contain an array of integers, but the documentation for explode is limited.

So far I have tried a few things without much luck:

val searchSplit = searchData.explode("q1", "rb")(q1 => q1.getList[Int](0).toArray())

Any ideas/examples of how to use explode on an array?


Have you tried a UDF on the field "what"? Something like this could be useful:

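import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.functions.{col, udf}
// Returns the first element of the struct's first field as a string.
val explode = udf {
  (aStr: GenericRowWithSchema) =>
    aStr match {
      case null => ""
      case _    => aStr.getList[Any](0).get(0).toString
    }
}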

val newDF = df.withColumn("newColumn", explode(col("what")))


  • getList(0) returns the "q1" field
  • get(0) returns the first element of "q1"
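
Since the UDF above keeps only the first element of "q1", it does not produce one row per element, which is what the question seems to be after. A different technique that may fit better is the built-in explode function in org.apache.spark.sql.functions (not the DataFrame.explode method from the question). The following is only a minimal sketch: the helper name explodeQ1 and the output column name q1_element are made up for illustration, and it assumes the DataFrame has a struct column "what" with an array field "q1". The functions object is imported under the alias F so the call does not clash with the val named explode defined above.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.{functions => F}

// Hypothetical helper: returns one row per element of what.q1.
// Assumes df has a struct column "what" containing an array field "q1".
// Rows whose q1 is null or empty produce no output rows.
def explodeQ1(df: DataFrame): DataFrame =
  df.select(F.explode(F.col("what.q1")).as("q1_element"))
```

With the question's data this would be called as explodeQ1(searchData), provided the struct column really is named "what".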

I'm not sure, but you could try using getAs[T](fieldName: String) instead of getList(index: Int).
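
To illustrate the getAs idea, here is a rough sketch of the same UDF that looks the field up by name instead of by position. It is untested against the question's data; the names firstQ1 and withFirstQ1 are invented for this example, and it assumes the struct column is called "what" and that "q1" is an array of IntegerType without null elements. The struct is typed as Row here, which is the general interface a struct value is passed as.

```scala
import org.apache.spark.sql.{DataFrame, Row}
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical UDF: first element of what.q1 as a string,
// or "" when the struct or the array is null, or the array is empty.
// Assumes q1 holds non-null Int values (IntegerType in the schema).
val firstQ1 = udf { (what: Row) =>
  Option(what)
    .flatMap(r => Option(r.getAs[Seq[Int]]("q1")))
    .flatMap(_.headOption)
    .map(_.toString)
    .getOrElse("")
}

// Hypothetical helper that adds the result as "newColumn".
def withFirstQ1(df: DataFrame): DataFrame =
  df.withColumn("newColumn", firstQ1(col("what")))
```

Looking the field up by name keeps the UDF working if the order of fields inside the struct changes, at the cost of failing at runtime if the field is renamed.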

