バージョンに関係なくRDD、 、map、および に変換して戻すことができDataFrameます。
df = spark.createDataFrame(
    [(0, 1, 23, 4, 8, 9, 5, "b1"), (1, 2, 43, 8, 10, 20, 43, "e1")], 
    ("id", "a1", "b1", "c1", "d1", "e1", "f1", "ref")
)
df.rdd.map(lambda row: row + (row[row.ref], )).toDF(df.columns + ["out"])
+---+---+---+---+---+---+---+---+---+
| id| a1| b1| c1| d1| e1| f1|ref|out|
+---+---+---+---+---+---+---+---+---+
|  0|  1| 23|  4|  8|  9|  5| b1| 23|
|  1|  2| 43|  8| 10| 20| 43| e1| 20|
+---+---+---+---+---+---+---+---+---+
スキーマを保持することもできます
from pyspark.sql.types import LongType, StructField
spark.createDataFrame(
    df.rdd.map(lambda row: row + (row[row.ref], )), 
    df.schema.add(StructField("out", LongType())))
DataFrames複雑Columnsな. 1.6 では:
from pyspark.sql.functions import array, col, udf
from pyspark.sql.types import  LongType, MapType, StringType
data_cols = [x for x in df.columns if x not in {"id", "ref"}]
# Literal map from column name to index
name_to_index = udf(
    lambda: {x: i for i, x in enumerate(data_cols)},
    MapType(StringType(), LongType())
)()
# Array of data
data_array = array(*[col(c) for c in data_cols])
df.withColumn("out", data_array[name_to_index[col("ref")]])
+---+---+---+---+---+---+---+---+---+
| id| a1| b1| c1| d1| e1| f1|ref|out|
+---+---+---+---+---+---+---+---+---+
|  0|  1| 23|  4|  8|  9|  5| b1| 23|
|  1|  2| 43|  8| 10| 20| 43| e1| 20|
+---+---+---+---+---+---+---+---+---+
2.x では、中間オブジェクトをスキップできます。
from pyspark.sql.functions import create_map, lit, col
from itertools import chain
# Map from column name to column value
name_to_value = create_map(*chain.from_iterable(
    (lit(c), col(c)) for c in data_cols
))
df.withColumn("out", name_to_value[col("ref")])
+---+---+---+---+---+---+---+---+---+
| id| a1| b1| c1| d1| e1| f1|ref|out|
+---+---+---+---+---+---+---+---+---+
|  0|  1| 23|  4|  8|  9|  5| b1| 23|
|  1|  2| 43|  8| 10| 20| 43| e1| 20|
+---+---+---+---+---+---+---+---+---+
最後に、次を使用できますwhen。
from pyspark.sql.functions import col, lit, when
from functools import reduce
out = reduce(
    lambda acc, x: when(col("ref") == x, col(x)).otherwise(acc), 
    data_cols,
    lit(None)
)
+---+---+---+---+---+---+---+---+---+
| id| a1| b1| c1| d1| e1| f1|ref|out|
+---+---+---+---+---+---+---+---+---+
|  0|  1| 23|  4|  8|  9|  5| b1| 23|
|  1|  2| 43|  8| 10| 20| 43| e1| 20|
+---+---+---+---+---+---+---+---+---+