
Intro

This section presents several use cases to illustrate how each function may be used.

Each time a method is used in a use case, its full documentation is available in a collapsible section. For instance, if a use case uses the method spark_frame.functions.nullable, you will see the following at the end of the section:

nullable

nullable(col: Column) -> Column

Make a pyspark.sql.Column nullable. This is especially useful for literals, which are always non-nullable by default.

Examples:

>>> from pyspark.sql import SparkSession
>>> spark = SparkSession.builder.appName("doctest").getOrCreate()
>>> from pyspark.sql import functions as f
>>> df = spark.sql('''SELECT 1 as a''').withColumn("b", f.lit("2"))
>>> df.printSchema()
root
 |-- a: integer (nullable = false)
 |-- b: string (nullable = false)

>>> res = df.withColumn('a', nullable(f.col('a'))).withColumn('b', nullable(f.col('b')))
>>> res.printSchema()
root
 |-- a: integer (nullable = true)
 |-- b: string (nullable = true)
Source code in spark_frame/functions.py
def nullable(col: Column) -> Column:
    """Make a `pyspark.sql.Column` nullable.
    This is especially useful for literals, which are always non-nullable by default.

    Examples:
        >>> from pyspark.sql import SparkSession
        >>> spark = SparkSession.builder.appName("doctest").getOrCreate()
        >>> from pyspark.sql import functions as f
        >>> df = spark.sql('''SELECT 1 as a''').withColumn("b", f.lit("2"))
        >>> df.printSchema()
        root
         |-- a: integer (nullable = false)
         |-- b: string (nullable = false)
        <BLANKLINE>
        >>> res = df.withColumn('a', nullable(f.col('a'))).withColumn('b', nullable(f.col('b')))
        >>> res.printSchema()
        root
         |-- a: integer (nullable = true)
         |-- b: string (nullable = true)
        <BLANKLINE>
    """
    return f.when(~col.isNull(), col)

You can also find a comprehensive list of all methods in the reference.