I need to write my Spark dataset to an Oracle database table. I am using the dataset write method with append mode, but I get an AnalysisException
when the Spark job is triggered on the cluster using the spark2-submit command.
I have read the JSON file, flattened it, and stored the result in a dataset named abcDataset.
Spark Version - 2
Oracle Database
JDBC Driver - oracle.jdbc.driver.OracleDriver
Programming Language - Java
Dataset<Row> abcDataset = dataframe.select(col("abc"), ...); // "..." stands for the other columns
Properties dbProperties = new Properties();
InputStream is = SparkReader.class.getClassLoader().getResourceAsStream("dbProperties.yaml");
dbProperties.load(is);
String jdbcUrl = dbProperties.getProperty("jdbcUrl");
dbProperties.put("driver","oracle.jdbc.driver.OracleDriver");
String where = "USER123.PERSON";
abcDataset.write()
    .format("org.apache.spark.sql.execution.datasources.jdbc.DefaultSource")
    .option("driver", "oracle.jdbc.driver.OracleDriver")
    .mode("append")
    .jdbc(jdbcUrl, where, dbProperties);
I expected the data to be written to the database, but I get the error below:
org.apache.spark.sql.AnalysisException: Multiple sources found for jdbc (org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider, org.apache.spark.sql.execution.datasources.jdbc.DefaultSource), please specify the fully qualified class name.;
at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:670)
Do I need to set an additional property in the spark2-submit command, since I am running this on a cluster, or is some step missing?
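One way to narrow this down, offered as a diagnostic sketch rather than a confirmed fix: as far as I understand, Spark resolves the short name jdbc through DataSourceRegister service files on the classpath, and the two class names in the message look like they come from different Spark releases. That would suggest two spark-sql jars are visible at once, for example one bundled into the application jar and one provided by the cluster. The helper below uses only the JDK and lists every classpath location that carries such a service file, together with the providers it names. The class name DataSourceRegistrations and the method find are hypothetical names of my own, not part of Spark.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper (not part of Spark): lists where data source
// registrations come from on the current classpath.
public class DataSourceRegistrations {

    // Service file in which DataSourceRegister implementations are listed.
    static final String SERVICE_FILE =
            "META-INF/services/org.apache.spark.sql.sources.DataSourceRegister";

    // Maps each classpath location of the resource to the provider class
    // names it contains (blank lines and '#' comment lines are skipped).
    static Map<String, List<String>> find(ClassLoader loader, String resource) {
        Map<String, List<String>> found = new LinkedHashMap<>();
        try {
            Enumeration<URL> urls = loader.getResources(resource);
            while (urls.hasMoreElements()) {
                URL url = urls.nextElement();
                List<String> providers = new ArrayList<>();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String trimmed = line.trim();
                        if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                            providers.add(trimmed);
                        }
                    }
                }
                found.put(url.toString(), providers);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // One line per (jar location, provider class) pair.
        find(loader, SERVICE_FILE).forEach((location, providers) ->
                providers.forEach(p -> System.out.println(location + " -> " + p)));
    }
}
```

Run it with the same classpath the job sees on the cluster. If JdbcRelationProvider and DefaultSource are reported from two different jars, a common remedy is to mark the Spark dependencies as provided in the build so they are not packaged into the application jar; whether that applies to this cluster is an assumption on my part. Separately, jdbc(...) already selects the JDBC source by itself, so the explicit format(...) call in the snippet above is probably not needed.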