Databricks Spark DataFrame API question analysis
posted on 02 Sep 2020
Which of the following operations can be used to create a new DataFrame with a new column and all previously existing columns from an existing DataFrame?
A. DataFrame.withColumn()
B. DataFrame.drop()
C. DataFrame.withColumnRenamed()
D. DataFrame.head()
E. DataFrame.filter()
Which of the following code blocks returns a DataFrame with a new column aSquared and all previously existing columns from DataFrame df?
A. df.withColumn("aSquared", col("a") * col("a"))
B. df.withColumnRenamed("aSquared", col("a") * col("a"))
C. df.select("aSquared")
D. df.withColumn(col("a") * col("a"), "aSquared")
E. df.withColumnRenamed("aSquared", col("a") * col("a"))
The code block shown below contains an error. The code block is intended to return a DataFrame with a new column aSquared and all previously existing columns from DataFrame df. Identify the error. Code block:
df.withColumn(col("a") * col("a"), "aSquared")
A. The arguments to df.withColumn() are provided in reverse order. "aSquared" should be first, and col("a") * col("a") should be second.
B. The df.withColumn() operation does not create new columns. The df.newColumn() operation should be used instead.
C. The argument "aSquared" must be wrapped in the col() function because it is a column name.
D. The withColumn() operation is not a DataFrame method. It should be called on its own with the first argument being df.
E. The df.withColumn() operation does not create new columns. The df.withColumnsRenamed() operation should be used instead.
The code block shown below should return a DataFrame with a new column aSquared and all previously existing columns from DataFrame df. Choose the response that correctly fills in the numbered blanks within the code block to complete this task.
Code block:
df._1_(_2_, _3_)
A.
1. withColumn
2. "aSquared"
3. col("a") * col("a")
B.
1. withColumnRenamed
2. "aSquared"
3. col("a") * col("a")
C.
1. withColumn
2. col("aSquared")
3. col("a") * col("a")
D.
1. withColumn
2. "aSquared"
3. "a" * "a"
E.
1. withColumnRenamed
2. "aSquared"
3. "a" * "a"
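When comparing the options above, it may help to check what Python itself does with "a" * "a", since that expression is evaluated before any DataFrame method receives it. The sketch below uses plain Python only and no Spark; the variable names are illustrative.

```python
# Plain Python, no Spark involved: multiplying two str objects is not defined,
# so the expression fails before it could be passed to any method.
try:
    product = "a" * "a"
    outcome = "no error"
except TypeError as exc:
    outcome = "TypeError"
    message = str(exc)

print(outcome)
```

By contrast, col("a") * col("a") multiplies two Column objects, which define the * operator and produce a new column expression.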
In what order should the lines of code below be run to return a DataFrame with a new column aSquared and all previously existing columns from DataFrame df?
1. df
2. .withColumn("aSquared", "a" * "a")
3. .withColumn("aSquared", col("a") * col("a"))
4. DataFrame
5. .withColumn(col("aSquared"), col("a") * col("a"))
A. 1,3
B. 1,2
C. 1,5
D. 4,2
E. 4,3