scala - Does a Spark Dataset method serialize the computation itself?

I have a Dataset with multiple columns. A function needs to be invoked on each row to compute a result from the data within that row, so I used a case class with a method and created a Dataset from it. For example,

case class testCase(x: Double, a1: Array[Double], a2: Array[Double]) {
    var someInt = 0
    def myMethod1(): Unit = {...}    // uses x, a1 and a2
    def myMethod2(): Unit = {...}    // uses x, a1 and a2
    def result(): Int = someInt      // return the computed value
}

It is called from main() as:

val res = myDS.map(_.result()).toDF("result")

The problem I am facing is that, while the code produces correct results, this statement does not run concurrently, unlike the other parts of the program. Irrespective of the number of executors, cores, and repartitioning, only one instance of the method seems to run at a time!
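For reference, here is a quick sketch of how I sanity-check the parallelism (myDS is the Dataset shown above):

    // Quick check: if the Dataset has only one partition, the map will
    // effectively run serially regardless of executors and cores.
    println(s"partitions = ${myDS.rdd.getNumPartitions}")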

Any hints as to what I should look at would be appreciated.


1 Reply


The testCase case class should not be mutable: if you concurrently modify the state of an object, your program will be non-deterministic because of that. With the little information you have given, what looks wrong is this snippet:

var someInt = 0

That value is probably being modified concurrently by several tasks, and I am pretty sure you don't want that.

Can you explain what you are trying to do?
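
In the meantime, here is a minimal sketch of an immutable alternative, assuming the result can be computed purely from the row's fields (the body below is just a placeholder for your real computation):

    case class testCase(x: Double, a1: Array[Double], a2: Array[Double]) {
      // No mutable state: the result is derived from the fields alone,
      // so each Spark task can evaluate it independently and safely.
      def result(): Int =
        (x * (a1.sum + a2.sum)).toInt   // placeholder for the real logic
    }

    val res = myDS.map(_.result()).toDF("result")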


