Writing and Reading a Text File

Saving and loading a list of strings to a .txt file


Writing a List of Strings To a Text File (With New Line Character)


def writeFile(filePath: String, listObject: List[String]): Unit = {
    import java.io._
    // Append a newline to each element so each one lands on its own line
    val lines = listObject.map(r => r + "\n")
    val file = new File(filePath)
    val bw = new BufferedWriter(new FileWriter(file))
    for (line <- lines) {
        bw.write(line)
    }
    bw.close()
}

To use it, do writeFile("path/mytextfile.txt", myList)

This function appends a newline character to each element of your list; otherwise the list would be saved as one long string in the text file, with no quotes or separators between elements.

Reading a List of Strings From a Text File

def readFile(filePath: String): List[String] = {
    spark.sparkContext.textFile(filePath).collect.toList
}

To use it, do val myList = readFile("path/mytextfile.txt")

☞ String formatting is covered on the "Spark Scala Fundamentals" page of this book.

Saving Without a New Line Character

By default, a list is saved as one long string with no newline characters. As you saw above, we had to add the newline character manually to keep each element on its own line. Sometimes, though, we do want to save one long string, as we did when we extracted a DataFrame's schema and saved it as JSON. Find all the details on the "Schema: Extracting, Reading, Writing to a Text File" page of this book.
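As a sketch of that case, the same BufferedWriter approach works; the only change is that nothing is appended between elements. The function name writeLongString is illustrative, not from this book:

```scala
// Sketch: save a list as one long string, with no newline characters.
// writeLongString is a hypothetical helper name for illustration.
def writeLongString(filePath: String, listObject: List[String]): Unit = {
    import java.io._
    val bw = new BufferedWriter(new FileWriter(new File(filePath)))
    // mkString joins the elements with no separator, producing one long string
    bw.write(listObject.mkString)
    bw.close()
}
```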

Reading One Long String, No New Line Character

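The original page links out for this; a minimal sketch using scala.io.Source (assuming the file fits comfortably in driver memory, and with readLongString as an illustrative name) is:

```scala
// Sketch: read an entire text file into one String.
// readLongString is a hypothetical helper name for illustration.
def readLongString(filePath: String): String = {
    val source = scala.io.Source.fromFile(filePath)
    try source.mkString          // concatenate the whole file into one String
    finally source.close()       // always release the file handle
}
```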

☞ If Reading a Text File Errored Out

Sometimes, when you are on a cluster, trying to read a text file using .collect() may fail with a Hadoop-related error from the compiler, saying,

Solve it by reading the text file through the Hadoop FileSystem API instead, like so
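A hedged sketch of that workaround, using the Hadoop FileSystem API available on most Spark clusters. The function name readFileHadoop is an assumption, and spark is assumed to be a live SparkSession:

```scala
// Sketch: read a text file through the Hadoop FileSystem API
// instead of sparkContext.textFile(...).collect().
def readFileHadoop(filePath: String): List[String] = {
    import org.apache.hadoop.fs.{FileSystem, Path}
    import scala.io.Source
    // Reuse Spark's Hadoop configuration so the cluster's defaultFS is used
    val fs = FileSystem.get(spark.sparkContext.hadoopConfiguration)
    val stream = fs.open(new Path(filePath))
    try Source.fromInputStream(stream).getLines().toList
    finally stream.close()
}
```

To use it, do val myList = readFileHadoop("path/mytextfile.txt")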

Flatten and Read JSONs

Converting a Row to an RDD, in case you need it.
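The snippet the page refers to is not shown; a minimal sketch of that conversion (assuming a live SparkSession named spark, with rowToRDD as an illustrative name) might look like:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.rdd.RDD

// Sketch: turn one Row's values into an RDD, one element per column.
// rowToRDD is a hypothetical helper name for illustration.
def rowToRDD(row: Row): RDD[Any] = {
    // row.toSeq exposes the Row's values as a Seq, which parallelize accepts
    spark.sparkContext.parallelize(row.toSeq)
}
```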
