Mapping Files into Memory in Python with the mmap Module: Usage Explained


The examples use the following text file, lorem.txt:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec
egestas, enim et consectetuer ullamcorper, lectus ligula rutrum leo,
a elementum elit tortor eu quam. Duis tincidunt nisi ut ante. Nulla
facilisi. Sed tristique eros eu libero. Pellentesque vel
arcu. Vivamus purus orci, iaculis ac, suscipit sit amet, pulvinar eu,
lacus. Praesent placerat tortor sed nisl. Nunc blandit diam egestas
dui. Pellentesque habitant morbi tristique senectus et netus et
malesuada fames ac turpis egestas. Aliquam viverra fringilla
leo. Nulla feugiat augue eleifend nulla. Vivamus mauris. Vivamus sed
mauris in nibh placerat egestas. Suspendisse potenti. Mauris
massa. Ut eget velit auctor tortor blandit sollicitudin. Suspendisse
imperdiet justo.

Reading data:

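Use the mmap() function to create a memory-mapped file. The first argument is a file descriptor, obtained either from the fileno() method of a file object or from os.open(). The caller is responsible for opening the file before calling mmap() and for closing it once the mapping is no longer needed. The second argument is the size, in bytes, of the portion of the file to map. If the value is 0, the entire file is mapped. If it is larger than the current file size, the file is extended (this is the documented behavior on Windows; on Unix the call generally fails with an error instead). The optional access argument selects how the mapping may be used: ACCESS_READ for read-only access, ACCESS_WRITE for write-through access, where changes go to both memory and the file, and ACCESS_COPY for copy-on-write access, where changes affect memory only.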

import mmap
import contextlib

with open('lorem.txt', 'r') as f:
    with contextlib.closing(mmap.mmap(f.fileno(), 0,
                                      access=mmap.ACCESS_READ)
                            ) as m:
        print 'First 10 bytes via read :', m.read(10)
        print 'First 10 bytes via slice:', m[:10]
        print '2nd   10 bytes via read :', m.read(10)

Output:

$ python mmap_read.py
First 10 bytes via read : Lorem ipsu
First 10 bytes via slice: Lorem ipsu
2nd   10 bytes via read : m dolor si

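The example above is written for Python 2. As a minimal sketch of the same read pattern under Python 3, the version below assumes the file is opened in binary mode and that the mapping yields bytes rather than str. The helper name read_prefixes and the temporary file are illustrative choices, not part of the original example.

```python
import mmap
import os
import tempfile

# Illustrative sample: the first line of lorem.txt.
SAMPLE = b"Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec\n"


def read_prefixes(path):
    """Return (first 10 bytes via read, first 10 via slice, next 10 via read)."""
    # mmap needs a real file descriptor; binary mode avoids text decoding.
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            first = m.read(10)    # read() advances the internal position
            sliced = m[:10]       # slicing leaves the position untouched
            second = m.read(10)   # continues where the first read() stopped
            return first, sliced, second


# Write the sample to a temporary file so the sketch is self-contained.
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(SAMPLE)

try:
    first, sliced, second = read_prefixes(tmp_path)
    print("First 10 bytes via read :", first)
    print("First 10 bytes via slice:", sliced)
    print("2nd   10 bytes via read :", second)
finally:
    os.remove(tmp_path)
```

In Python 3 the mmap object is itself a context manager, so contextlib.closing() is not needed in this sketch.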
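Writing data:

To modify a file through the mapping, open it in 'r+' mode before mapping it, then assign to a slice of the map. The replacement must be the same length as the slice it overwrites.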

import mmap
import shutil
import contextlib

# Copy the example file
shutil.copyfile('lorem.txt', 'lorem_copy.txt')

word = 'consectetuer'
reversed = word[::-1]
print 'Looking for    :', word
print 'Replacing with :', reversed

with open('lorem_copy.txt', 'r+') as f:
    with contextlib.closing(mmap.mmap(f.fileno(), 0)) as m:
        print 'Before:'
        print m.readline().rstrip()

        m.seek(0) # rewind
        loc = m.find(word)
        m[loc:loc+len(word)] = reversed
        m.flush()

        m.seek(0) # rewind
        print 'After :'
        print m.readline().rstrip()

        f.seek(0) # rewind
        print 'File  :'
        print f.readline().rstrip()

Output:

$ python mmap_write_slice.py
Looking for    : consectetuer
Replacing with : reutetcesnoc
Before:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec
After :
Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec
File  :
Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec

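A hedged Python 3 sketch of the same in-place replacement follows. It assumes binary mode and bytes values, and it passes access=mmap.ACCESS_WRITE explicitly instead of relying on the default. The function name replace_in_place and the temporary file are hypothetical, introduced only for illustration.

```python
import mmap
import os
import tempfile


def replace_in_place(path, old, new):
    """Overwrite the first occurrence of old with new; return its offset or -1."""
    if len(old) != len(new):
        # Slice assignment cannot change the size of the mapping.
        raise ValueError("replacement must have the same length as the original")
    with open(path, "r+b") as f:
        # ACCESS_WRITE: changes go to memory and to the underlying file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as m:
            loc = m.find(old)
            if loc == -1:
                return -1
            m[loc:loc + len(old)] = new
            m.flush()  # push the modified pages out to the file
            return loc


# Self-contained demo on a temporary copy of the first line of lorem.txt.
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec\n")

try:
    word = b"consectetuer"
    offset = replace_in_place(tmp_path, word, word[::-1])
    with open(tmp_path, "rb") as f:
        after = f.readline().rstrip()
    print("Offset:", offset)
    print("File  :", after)
finally:
    os.remove(tmp_path)
```

The explicit length check is a design choice of this sketch: it reports a mismatch up front rather than letting the slice assignment fail.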
With ACCESS_COPY, changes are made only in memory and are not written to the file on disk:

import mmap
import shutil
import contextlib

# Copy the example file
shutil.copyfile('lorem.txt', 'lorem_copy.txt')

word = 'consectetuer'
reversed = word[::-1]

with open('lorem_copy.txt', 'r+') as f:
    with contextlib.closing(mmap.mmap(f.fileno(), 0,
                                      access=mmap.ACCESS_COPY)
                            ) as m:
        print 'Memory Before:'
        print m.readline().rstrip()
        print 'File Before  :'
        print f.readline().rstrip()
        print

        m.seek(0) # rewind
        loc = m.find(word)
        m[loc:loc+len(word)] = reversed

        m.seek(0) # rewind
        print 'Memory After :'
        print m.readline().rstrip()

        f.seek(0)
        print 'File After   :'
        print f.readline().rstrip()

Output:

$ python mmap_write_copy.py
Memory Before:
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec
File Before  :
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec

Memory After :
Lorem ipsum dolor sit amet, reutetcesnoc adipiscing elit. Donec
File After   :
Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec

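The copy-on-write behavior can also be checked programmatically. The sketch below assumes Python 3 with binary mode and bytes values. The helper modify_copy and the temporary file are hypothetical names used only for this illustration.

```python
import mmap
import os
import tempfile


def modify_copy(path, old, new):
    """Replace old with new in a copy-on-write mapping.

    Returns (first line seen through the mapping, first line read from the file).
    """
    with open(path, "r+b") as f:
        # ACCESS_COPY: writes change the in-memory pages only, never the file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as m:
            loc = m.find(old)
            if loc == -1:
                raise ValueError("pattern not found")
            m[loc:loc + len(old)] = new
            m.seek(0)  # rewind the mapping
            in_memory = m.readline().rstrip()
        f.seek(0)  # rewind the file object
        on_disk = f.readline().rstrip()
    return in_memory, on_disk


# Self-contained demo on a temporary copy of the first line of lorem.txt.
fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Donec\n")

try:
    word = b"consectetuer"
    in_memory, on_disk = modify_copy(tmp_path, word, word[::-1])
    print("Memory After :", in_memory)
    print("File After   :", on_disk)
finally:
    os.remove(tmp_path)
```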
Because a memory-mapped file can be treated like a string, it can also be used with regular expressions:

import mmap
import re
import contextlib

pattern = re.compile(r'(\.\W+)?([^.]?nulla[^.]*?\.)',
                     re.DOTALL | re.IGNORECASE | re.MULTILINE)

with open('lorem.txt', 'r') as f:
    with contextlib.closing(mmap.mmap(f.fileno(), 0,
                                      access=mmap.ACCESS_READ)
                            ) as m:
        for match in pattern.findall(m):
            print match[1].replace('\n', ' ')

Output:

$ python mmap_regex.py
Nulla facilisi.
Nulla feugiat augue eleifend nulla.

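Under Python 3 the mapping exposes bytes, so the pattern has to be a bytes pattern as well. The sketch below assumes that setup and reuses the same expression. The excerpt in TEXT, the helper find_sentences, and the temporary file are illustrative and not part of the original example.

```python
import mmap
import os
import re
import tempfile

# Illustrative excerpt containing the two sentences that mention "nulla".
TEXT = (
    b"ante. Nulla\n"
    b"facilisi. Sed tristique eros eu libero. Pellentesque vel\n"
    b"leo. Nulla feugiat augue eleifend nulla. Vivamus mauris.\n"
)

# Same expression as above, written as a bytes pattern (rb'...').
PATTERN = re.compile(rb"(\.\W+)?([^.]?nulla[^.]*?\.)",
                     re.DOTALL | re.IGNORECASE | re.MULTILINE)


def find_sentences(path):
    """Return every sentence matched by PATTERN, with newlines folded to spaces."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # findall() returns one tuple of groups per match; group 2 is the sentence.
            return [groups[1].replace(b"\n", b" ").decode("ascii")
                    for groups in PATTERN.findall(m)]


fd, tmp_path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(TEXT)

try:
    sentences = find_sentences(tmp_path)
    for sentence in sentences:
        print(sentence)
finally:
    os.remove(tmp_path)
```

The results are decoded to str inside the mapping's with block, so nothing refers to the mapped memory after it is closed.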
References: mmap (http://docs.python.org/lib/module-mmap.html)
