f = open("c:\\testfile.txt")
for line in f.readlines():
    ';'.join(line.split(','))

Now say you wanted to do the same thing, but only to the first 5 lines of the file. The impulse for a Python newbie would be to write the following code:
f = open("c:\\testfile.txt")
counter = 0
for line in f.readlines():
    if counter < 5:
        ';'.join(line.split(','))
    counter += 1

A savvier Python newbie would take advantage of the beautiful enumerate() function and write the following code:
f = open("c:\\testfile.txt")
for counter, line in enumerate(f.readlines()):
    if counter < 5:
        ';'.join(line.split(','))

The problem with this code is that it has two concepts intermingled: one to read only 5 lines, and the other for the actual process of changing delimiters. If we wanted to keep the concepts separate, we would have to be able to write something like this:
f = open("c:\\testfile.txt")
for line in f.readfirst5lines():
    ';'.join(line.split(','))

Here the process of stopping after 5 lines is encapsulated by the readfirst5lines() method of the file object, and the loop body only changes the delimiters, as before. Code like this can actually be made to work! The reason we can write it is Python's generators feature. readfirst5lines will look like this:
import itertools

class myfile(file):
    def readfirst5lines(self):
        # Yield one line at a time and stop after 5 lines.
        for i in itertools.count():
            if i == 5:
                break
            line = self.readline()
            yield line

This is of course more lines of code, but the concept is abstracted away nicely. We are separating the conditions for processing the file from the actual processing. What's more, this method can be changed slightly to take the number of lines as an argument. So if you want to read the first 2, 3, 5 or however many lines, the method will look like this:
import itertools

class myfile(file):
    def readfirstfewlines(self, n):
        # Yield one line at a time and stop after n lines.
        for i in itertools.count():
            if i == n:
                break
            line = self.readline()
            yield line

And you can use it like this:
f = myfile("c:\\testfile.txt")
for line in f.readfirstfewlines(3):
    ';'.join(line.split(','))

This will process the first 3 lines.
(Note that we used the 'myfile' constructor instead of the 'open' function to open the file, since we need an object of type 'myfile' and not 'file'. There are other ways to downcast in Python, but this is probably the simplest way to do it in this case.)
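For readers on Python 3, where the built-in 'file' type no longer exists, the same first-n-lines idea can be sketched with itertools.islice instead of a subclass. This is a swapped-in technique, not the approach used in this post; the helper names first_lines and change_delimiter and the sample data are made up for illustration, and io.StringIO stands in for a real file.

```python
import io
import itertools


def first_lines(lines, n):
    """Return an iterator over at most the first n items of an iterable."""
    # islice stops after n items and reads nothing further from the source.
    return itertools.islice(lines, n)


def change_delimiter(line):
    """Replace comma delimiters with semicolons, as in the loops above."""
    return ';'.join(line.split(','))


# Hypothetical sample data standing in for the test file.
sample = io.StringIO("a,b\nc,d\ne,f\ng,h\n")
converted = [change_delimiter(line) for line in first_lines(sample, 3)]
print(converted)
```

As with readfirstfewlines, the loop body here only deals with delimiters, and the decision about how many lines to read lives elsewhere.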
Python generators go a long way toward making code more elegant and keeping separate concepts encapsulated. I will blog about the mechanics of how generators work and their other uses as and when I learn more about them. For Rubyists, this concept of generators is loosely similar to "blocks".
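As a small taste of those mechanics, here is a minimal Python 3 sketch of the one property the examples above rely on: a generator's body runs only when a value is requested, and pauses at each yield. The function name numbered and the log list are hypothetical, introduced only to make the order of events visible.

```python
log = []  # records each time the generator body actually runs a step


def numbered(items):
    """Hypothetical generator that notes every step it takes in log."""
    for index, item in enumerate(items):
        log.append("producing %d" % index)
        yield index, item


gen = numbered(["x", "y", "z"])
# Calling the generator function has not run any of its body yet.
steps_before = list(log)

first = next(gen)  # runs the body up to the first yield, then pauses
steps_after_one = list(log)

rest = list(gen)  # resumes after the yield and runs to the end
```

This pausing is why a loop can stop asking for lines after the fifth one without the rest of the input ever being touched.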
2 comments:
Nice! You could also do this with a function:
def read_n_lines(inbuffer, num_lines):
    for i, line in enumerate(inbuffer):
        if i >= num_lines:
            return
        yield line
And call it like this:
read_n_lines( open("/python25/readme.txt"), 5 )
And make a read_5 function like this:
read_5 = lambda inbuffer : read_n_lines(inbuffer, 5)
The cool thing about the function version is that it abstracts away the file part -- you could pass any iterable object (such as cStringIO for testing). The class version is simpler, though, because it wraps the file-open mechanics.
thanks for the inputs, thats slick indeed! :)
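The point in the first comment about passing any iterable can be tried out with a quick sketch like the one below. It assumes Python 3, where io.StringIO plays the role that cStringIO played in Python 2; the sample text and the variable names are made up for illustration.

```python
import io


def read_n_lines(inbuffer, num_lines):
    """Yield at most num_lines lines from any iterable of lines."""
    for i, line in enumerate(inbuffer):
        if i >= num_lines:
            return
        yield line


# An in-memory buffer stands in for a real file, which is handy in tests.
fake_file = io.StringIO("one,1\ntwo,2\nthree,3\n")
from_buffer = list(read_n_lines(fake_file, 2))

# A plain list works too, since the function only needs to iterate.
from_list = list(read_n_lines(["a\n", "b\n", "c\n"], 5))
```

Asking for more lines than the input holds simply yields everything available, so no special end-of-file handling is needed.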