python源码分析----内存分配(1)


上面的一篇粗略的介绍了一下python的对象结构,这篇来分析一个非常重要的部分,内存分配。。。

好像自己看的源代码,只要是跟C语言相关的,都在内存处理方面做了相当多的工作。。。。例如nginx,它也有实现自己的pool,python当然也不例外。。。。

 

python在内存分配上面分成了4个层次吧。。。

_____ ______ ______ ________
[ int ] [ dict ] [ list ] ... [ string ] Python core |
+3 | <----- Object-specific memory -----> | <-- Non-object memory --> |
_______________________________ | |
[ Python's object allocator ] | |
+2 | ####### Object memory ####### | <------ Internal buffers ------> |
______________________________________________________________ |
[ Python's raw memory allocator (PyMem_ API) ] |
+1 | <----- Python memory (under PyMem manager's control) ------> | |
__________________________________________________________________
[ Underlying general-purpose allocator (ex: C library malloc) ]
0 | <------ Virtual memory allocated for the python process -------> |

 

上面是直接从源代码里面copy出来的注释。。。。

(0)这个是最底层的C层面上的内存操作,也就是malloc和free了。。

(1)这个事python在C层面上的操作做了一层简单的封装。。例如PyMem_MALLOC,PyMem_FREE,它们是在malloc和free上面做了很简单的包装:

//这里对malloc,realloc与free做了简单的宏封装,对于malloc,如果为0,然么分配1
#define PyMem_MALLOC(n)		((size_t)(n) > (size_t)PY_SSIZE_T_MAX ? NULL \
				: malloc((n) ? (n) : 1))
#define PyMem_REALLOC(p, n)	((size_t)(n) > (size_t)PY_SSIZE_T_MAX  ? NULL \
				: realloc((p), (n) ? (n) : 1))
#define PyMem_FREE		free

#endif	/* PYMALLOC_DEBUG */

/*
 * Type-oriented memory interface
 * ==============================
 *
 * Allocate memory for n objects of the given type.  Returns a new pointer
 * or NULL if the request was too large or memory allocation failed.  Use
 * these macros rather than doing the multiplication yourself so that proper
 * overflow checking is always done.
 */

//通过感知type的大小来分内存
#define PyMem_New(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL :	\
	( (type *) PyMem_Malloc((n) * sizeof(type)) ) )
#define PyMem_NEW(type, n) \
  ( ((size_t)(n) > PY_SSIZE_T_MAX / sizeof(type)) ? NULL :	\
	( (type *) PyMem_MALLOC((n) * sizeof(type)) ) )

 

上面还有New啥的,也就是扩展了类型大小的感知部分。。。。

(3)这一层就是最重点的部分了,PyObject_Malloc,PyObject_Free就属于这一层。。。。python就是在这一层实现了内存池,用于高效的进行内存分配。。。。(源代码集中在obmalloc.c里面)

 

几点python内存分配方面的常识:

(1)python的内存分配大体分为了两个部分,首先是小内存分配,这里主要是512字节以内,以及大于512字节的分配两类

(2)在内存分配方面按照8字节对齐的方式,例如要分配12字节的内存,其实最终将会占用16字节的大小

* Request in bytes Size of allocated block Size class idx
* ----------------------------------------------------------------
* 1-8 8 0
* 9-16 16 1
* 17-24 24 2
* 25-32 32 3
* 33-40 40 4
* 41-48 48 5
* 49-56 56 6
* 57-64 64 7
* 65-72 72 8
* ... ... ...
* 497-504 504 62
* 505-512 512 63

上面也是直接从源代码里面copy出来的注释,很鲜明的表现出了python内存分配方面的对齐策略。。。

 

接下来来看几个非常重要的宏定义:

 

#define ALIGNMENT               8               /* must be 2^N */
#define ALIGNMENT_SHIFT         3
#define ALIGNMENT_MASK          (ALIGNMENT - 1)

/* Return the number of bytes in size class I, as a uint. */
#define INDEX2SIZE(I) (((uint)(I) + 1) << ALIGNMENT_SHIFT)


#define SMALL_REQUEST_THRESHOLD 512 
#define NB_SMALL_SIZE_CLASSES   (SMALL_REQUEST_THRESHOLD / ALIGNMENT)   //==64


 //页大小4kb
#define SYSTEM_PAGE_SIZE        (4 * 1024)
#define SYSTEM_PAGE_SIZE_MASK   (SYSTEM_PAGE_SIZE - 1)



#define ARENA_SIZE              (256 << 10)     /* 256KB */

#ifdef WITH_MEMORY_LIMITS
#define MAX_ARENAS              (SMALL_MEMORY_LIMIT / ARENA_SIZE)
#endif

 //这里定义的pool的大小为4k
#define POOL_SIZE               SYSTEM_PAGE_SIZE        /* must be 2^N   4kb*/
#define POOL_SIZE_MASK          SYSTEM_PAGE_SIZE_MASK  //4*1024-1

上面的宏这里就不详细具体的说明了,,,它主要是确定了如下的信息

 

(1)一个Arena的大小为256KB(它用来管理pool)

(2)一个pool的大小为4kb

 

好了,接下来来看比较重要的pool的头定义:

//内存池头部
//通过szidx编号可以知道当前这个pool是用来分配多大大小的内存的pool
struct pool_header {
    union { block *_padding;
            uint count; } ref;          /* number of allocated blocks    */  //当前pool上面分配的block的数量
    block *freeblock;                   /* pool's free list head         */  //指向下一个可用的block,这里构成了一个链表, 它是一个离散的链表,很有意思
    struct pool_header *nextpool;       /* next pool of this size class  */   //通过这两个指针形成pool的双链表
    struct pool_header *prevpool;       /* previous pool       ""        */
    uint arenaindex;                    /* index into arenas of base adr */  //在arena里面的索引
    uint szidx;                         /* block size class index        */   //分配内存的类别,8字节,16或者。。。
    uint nextoffset;                    /* bytes to virgin block         */   //下一个可用的block的内存偏移量
    uint maxnextoffset;                 /* largest valid nextoffset      */  //最后一个block距离开始位置的距离
};

 

上面的注释非常详细的说明了各个字段的用处。。。另外这里可以看到有一个szidx字段,它与上面内存分配时候的内存对齐表上的szidx相对应,其实每一个pool都是用来分配固定大小的内存的,例如szidx为0,那么这个pool就是用来分配8字节的,szidx为1就是用来分配16个字节的。。。。这个以后看代码就能明白。。。

这个样子每个pool都只分配一种大小的内存块就方便的多了。。特别是对于内存偏移的计算都相当的方便

python在分配小内存的时候,是按照block的单位来进行分配的,例如szidx为0的pool,它的一个block大小就是8字节。。通过freeblock指针来形成一个block的离散的单链表(嗯,这个实现也是非常的trick,看了好久才看明白)。。。

(嗯,其实内存池这部分的实现还有很多的trick,尼玛。。。看这些trick的实现真心消耗脑细胞啊。。。擦。。只能怪自己这方面确实才疏学浅,,要看这么久才能理解。。。)

 

 

//这个可以理解为用来管理pool
struct arena_object {
    /* The address of the arena, as returned by malloc.  Note that 0
     * will never be returned by a successful malloc, and is used
     * here to mark an arena_object that doesn't correspond to an
     * allocated arena.
     */
    uptr address;     //指向分配的256kb的首地址,这里通过0来表明当前没有进行分配

    /* Pool-aligned pointer to the next pool to be carved off. */
    block* pool_address;

    /* The number of available pools in the arena:  free pools + never-
     * allocated pools.
     */
    uint nfreepools;            //可用的pool

    /* The total number of pools in the arena, whether or not available. */
    uint ntotalpools;       //在当前arena的pool的总数

    /* Singly-linked list of available pools. */
    struct pool_header* freepools;        //pool链表的头部

    /* Whenever this arena_object is not associated with an allocated
     * arena, the nextarena member is used to link all unassociated
     * arena_objects in the singly-linked `unused_arena_objects` list.
     * The prevarena member is unused in this case.
     *
     * When this arena_object is associated with an allocated arena
     * with at least one available pool, both members are used in the
     * doubly-linked `usable_arenas` list, which is maintained in
     * increasing order of `nfreepools` values.
     *
     * Else this arena_object is associated with an allocated arena
     * all of whose pools are in use.  `nextarena` and `prevarena`
     * are both meaningless in this case.
     */
    struct arena_object* nextarena;
    struct arena_object* prevarena;
};

上面这个是另外一个非常重要的结构,可以理解为它是用来管理pool的,它的address指针将会指向一个分配的256kb内存,pool将会在这个上面产生。。。。。

 

 

接下来先来看看Arena结构的创建过程吧:

 //分配一个arena_object,其实这个也是做了缓存的
static struct arena_object*
new_arena(void)
{
    struct arena_object* arenaobj;    //这里先创建一个arena的指针
    uint excess;        /* number of bytes above pool alignment */
    void *address;               //如果新创建的话,这个用来指向申请的256KB内存        
    int err;

#ifdef PYMALLOC_DEBUG
    if (Py_GETENV("PYTHONMALLOCSTATS"))
        _PyObject_DebugMallocStats();
#endif
    if (unused_arena_objects == NULL) {   //当前没有可用的arena,那么这里需要创建
        uint i;
        uint numarenas;
        size_t nbytes;

        /* Double the number of arena objects on each allocation.
         * Note that it's possible for `numarenas` to overflow.
         */
         //最开始maxarenas为0,也就是说第一次创建Arena结构的时候,就将会一次性创建16个,以后直接翻倍
        numarenas = maxarenas ? maxarenas << 1 : INITIAL_ARENA_OBJECTS;  //INITIAL_ARENA_OBJECTS=16
        if (numarenas <= maxarenas)   //出现这种情况,只能说尼玛,这都能整形溢出 啊
            return NULL;                /* overflow */
#if SIZEOF_SIZE_T <= SIZEOF_INT
        if (numarenas > PY_SIZE_MAX / sizeof(*arenas))
            return NULL;                /* overflow */
#endif
        nbytes = numarenas * sizeof(*arenas);    //接下来分配arenas结构体所需要的内存
        arenaobj = (struct arena_object *)realloc(arenas, nbytes);   //分配内存地址
        if (arenaobj == NULL)
            return NULL;
        arenas = arenaobj;

        /* We might need to fix pointers that were copied.  However,
         * new_arena only gets called when all the pages in the
         * previous arenas are full.  Thus, there are *no* pointers
         * into the old array. Thus, we don't have to worry about
         * invalid pointers.  Just to be sure, some asserts:
         */
        assert(usable_arenas == NULL);
        assert(unused_arena_objects == NULL);

        /* Put the new arenas on the unused_arena_objects list. */
        //这里相当于是初始化刚刚创建的arena结构体
        for (i = maxarenas; i < numarenas; ++i) {
            arenas[i].address = 0;       //通过将这个地址赋值为0,表示当前arena没有分配可用的内存
            arenas[i].nextarena = i < numarenas - 1 ?
                                   &arenas[i+1] : NULL;
        }

        unused_arena_objects = &arenas[maxarenas];  //这里将unused_arena_objects指向当前可用的第一个
        maxarenas = numarenas;
    }

    /* Take the next available arena object off the head of the list. */
    assert(unused_arena_objects != NULL);
    arenaobj = unused_arena_objects;
    unused_arena_objects = arenaobj->nextarena;  //将unused_arena_objects指针指向下一个arena结构
    assert(arenaobj->address == 0);
    //接下来分配数据内存
#ifdef ARENAS_USE_MMAP
    address = mmap(NULL, ARENA_SIZE, PROT_READ|PROT_WRITE,
                   MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
    err = (address == MAP_FAILED);
#else
    address = malloc(ARENA_SIZE);   //这里分配256字节的内存
    err = (address == 0);
#endif    
    if (err) {
        /* The allocation failed: return NULL after putting the
         * arenaobj back.
         */
        arenaobj->nextarena = unused_arena_objects;
        unused_arena_objects = arenaobj;
        return NULL;
    }
    //将Arena结构的address指向分配的地址
    arenaobj->address = (uptr)address;

    //更新计数器
    ++narenas_currently_allocated;
#ifdef PYMALLOC_DEBUG
    ++ntimes_arena_allocated;
    if (narenas_currently_allocated > narenas_highwater)
        narenas_highwater = narenas_currently_allocated;
#endif
    arenaobj->freepools = NULL;   //这里pool头部指针设置为null
    /* pool_address <- first pool-aligned address in the arena
       nfreepools <- number of whole pools that fit after alignment */
    arenaobj->pool_address = (block*)arenaobj->address;
    arenaobj->nfreepools = ARENA_SIZE / POOL_SIZE;    //其实这里是64个可用的pool,正好对象64中类型的小内存分配
    assert(POOL_SIZE * arenaobj->nfreepools == ARENA_SIZE);
    //下面是做一次内存对齐,最终保证pool_address的地址是4kb的整数倍,这个主要是方便以后内存计算
    excess = (uint)(arenaobj->address & POOL_SIZE_MASK);
    if (excess != 0) {   //这个意思是当前的内存地址不是4kb的整数倍,那么需要进行一次pool地址的对齐
        --arenaobj->nfreepools;   //可用数-1
        arenaobj->pool_address += POOL_SIZE - excess;
    }
    arenaobj->ntotalpools = arenaobj->nfreepools;   //总共可用的pool的数量

    return arenaobj;
}

 

嗯,代码上面的注释应该说的很清楚了吧。。。。创建结构体对象,然后分配内存,用adress指针来指向。。

初始化能分配的pool的数量,以及起始的地址。。。

 

好了。。。今天就先写到这吧。。。好晚了。。。感觉上班还真心有点累啊。。。。。还是在学校轻松。。。

明天在来分析最为重要的PyObject_Malloc和PyObject_Free两个函数吧。。。

评论关闭