python - How to split a line and map it into pairs?
I am trying to convert a file into (key, value) pairs. Each line holds a key followed by its tab-separated values (for example, the key 1 followed by a, b, c and d), and I want to map the lines to

    (1, a), (1, b), (1, c), (1, d), (3, e), (4, f), (4, g)

I was trying to do this in Python / Scala, but could not come up with a solution other than reading the file line by line and using a loop. The code in Scala:

    val fileRDD = sc.textFile("input.txt")
    val map = fileRDD.filter(_.split("\t").length > 1).map { line =>
      val fields = line.split("\t")
      var i = 1
      while (i < fields.length) {
        (fields(0), fields(1))
        i = i + 1
      }
    }

Here is a Scala version without loops:

    def source = scala.io.Source fromFile "input.txt"
    def lines = source.getLines
    def tokens = lines map (_ split "\t")
    def pairs = tokens flatMap (line => line.tail map (line.head -> _))
    pairs.toList

The definition of source should be clear: I'm getting a source to read from. lines defines an Iterator[String] over the lines of the file. Note that after using it you need to close source, which I do not do here.

Next, tokens takes each line and splits it. It is going to be an Iterator[Array[String]], where each element of the array is a word and each element of the iterator is the array of words for that line.
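
Since the question also mentions Python, the reading and splitting steps can be sketched there too. This is only a rough equivalent, not a translation of the Scala code: tokenize and read_tokens are hypothetical helper names, the sample lines are assumed from the expected output in the question, and the result is a list rather than a lazy iterator. The with block closes the file, which the Scala snippet leaves open.

```python
def tokenize(lines):
    """Split each non-empty line on tabs, giving one list of words per line."""
    return [line.rstrip("\n").split("\t") for line in lines if line.strip()]


def read_tokens(path):
    """Read a file and tokenize it; the with block closes the file afterwards."""
    with open(path, encoding="utf-8") as source:
        return tokenize(source)


# Assumed sample input: a key, then tab-separated values, one record per line.
sample = ["1\ta\tb\tc\td\n", "3\te\n", "4\tf\tg\n"]
print(tokenize(sample))
# [['1', 'a', 'b', 'c', 'd'], ['3', 'e'], ['4', 'f', 'g']]
```

For a real file, read_tokens("input.txt") would return the same structure.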
To better understand pairs, it helps to rewrite it as a for comprehension. Note that this translates into exactly the same commands as the original definition:

    def pairs = for { line <- tokens; letter <- line.tail } yield line.head -> letter

So, for each line in tokens, and for each letter in the tail of that line, I return a pair of the head of the line and the letter. Remember that each line is an array of the words in that line: the first word is line.head, and line.tail holds all the other words. So letter is going to be every word but the first, which is what you asked for.

Going back to the original definition, I use a flatMap and a map. Since flatMap flattens the result, pairs is an Iterator[(String, String)]. If I had used map instead of flatMap, the type would have been Iterator[Array[(String, String)]].
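
The difference between the flattened and the nested result can also be seen in a short Python sketch. It is an illustration under assumptions, not part of the Scala answer: the tokens list is hypothetical sample data already split on tabs, a nested comprehension stands in for the for comprehension, and itertools.chain.from_iterable plays the role of the flattening step.

```python
from itertools import chain

# Hypothetical sample: each inner list is one line already split on tabs.
tokens = [["1", "a", "b", "c", "d"], ["3", "e"], ["4", "f", "g"]]

# Nested comprehension: for each line, pair its first word with every other word.
pairs = [(line[0], letter) for line in tokens for letter in line[1:]]

# The "map only" shape: one list of pairs per line, not flattened.
nested = [[(line[0], letter) for letter in line[1:]] for line in tokens]

# Flattening the nested lists gives back the same pairs.
flattened = list(chain.from_iterable(nested))

print(pairs)
print(nested)
```

Here pairs is a flat list of (key, value) tuples, while nested keeps one inner list per input line, which mirrors the Iterator[Array[(String, String)]] type mentioned in the answer.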