Thursday, May 24, 2018

Encounter with Python Generators

Recently I ran into a problem while analyzing a log file:
"A 10GB log file that consists of a single line needs to be read, and all the IP addresses in it must be printed."

Issue: The file cannot be read line by line, because the only line is the whole 10GB file and reading it would exhaust memory. It has to be read character by character.

Solution:

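#!/usr/bin/python
import re

def getIP():
        # Matches one character that can be part of an IPv4 address.
        ip = re.compile(r'[\d.]')
        out = []
        with open("./ipaddr", "r") as f:
                while True:
                        c = f.read(1)  # one character at a time keeps memory use constant
                        if c and ip.match(c):
                                out.append(c)
                                continue
                        # Any other character (or end of file) closes the current run.
                        # Runs are not validated, so "1.2" or "12345" is yielded too.
                        if out:
                                yield "".join(out)
                                out = []
                        if not c:
                                break

for ipad in getIP():
        print(ipad)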
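
A possible variation on the same generator idea, as a rough sketch rather than a tested replacement: read the file in fixed-size chunks instead of single characters, and let a regular expression pick out the addresses. The names iter_ips, IP_PATTERN and chunk_size are made up for this illustration, the code assumes Python 3, and the pattern only checks the dotted-quad shape (it does not check that each octet is at most 255). A trailing run of digits and dots is held back until the next chunk, so an address split across two chunks is not cut in half.

```python
import io
import re

# Four groups of 1-3 digits separated by dots. Octets are NOT range-checked,
# so something like 999.1.1.1 would also match (assumption for this sketch).
IP_PATTERN = re.compile(r'(?<!\d)(?:\d{1,3}\.){3}\d{1,3}(?!\d)')

def iter_ips(stream, chunk_size=64 * 1024):
    """Yield IPv4-looking strings from a text stream, chunk_size characters at a time."""
    carry = ""
    while True:
        chunk = stream.read(chunk_size)
        buf = carry + chunk
        carry = ""
        if chunk:
            # Hold back a trailing run of digits/dots: it may be an address
            # that continues in the next chunk.
            stripped = buf.rstrip("0123456789.")
            carry = buf[len(stripped):]
            buf = stripped
        for match in IP_PATTERN.finditer(buf):
            yield match.group()
        if not chunk:
            # End of input: the last buffer (including any carry) has been scanned.
            break

# Small in-memory demo; a tiny chunk size forces addresses to straddle chunks.
demo = io.StringIO("GET / from 10.0.0.1 then 192.168.1.20 ok")
print(list(iter_ips(demo, chunk_size=8)))
```

For the real log, the stream would be the opened file, for example iter_ips(open("./ipaddr")). Memory use stays bounded by the chunk size as long as the file does not contain a very long unbroken run of digits and dots.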


Any ideas to simplify this?

