Coder Social home page Coder Social logo

stl's People

Contributors

dutor avatar

stl's Issues

STL_allocator note

STL_allocator note (not finished yet)


Fundamental

Allocation is implemented by alloc:allocate, memory free is implemented by alloc:deallocate(); constructing objects is implemented by ::construct(), and destructing objects is implemented by ::destory().
The whole process is divided into 4 parts:
memory allocation ---> construct objects --> destruct objects --> free memory
In header file <memory>, there's such declaration.

#include <stl_alloc.h>      // allocate & free
#include <stl_construct.h>  // construction & destroy

1
(Pictures below were all from a chinese book《STL源码剖析》by Houjie)


allocate() & deallocate()

两级配置。第一级配置直接通过malloc()实现,而第二级配置是通过 内存池 实现。
内存池

construct() & destroy()

destroy有多个版本。
第一个版本接收指针,并析构掉指针所指范围的对象;
第二个版本接受两个迭代器我们并不知道这个范围有多大,万一范围很大而每个对象的析构函数都是默认析构,那么一次次调用这些没啥用的析构函数对凶啊率有明显下降。所以这里先利用value_type获得迭代器所指的对象类型,再用__type_traits<T>判断这个类型对应的析构函数是否是默认析构。如果是true_type就什么都不做;如果是false_type,就循环访问整个范围,并在循环的过程中调用第一个版本的析构。

template <class _ForwardIterator>
void __destroy_aux(_ForwardIterator __first, _ForwardIterator __last, __false_type) {
  for ( ; __first != __last; ++__first)
    destroy(&*__first);
}

2

String Vector

Introduction

A text query program, using a string vector, a simplification of the library vector class. So it's just for practicing, with a little practical significance.
This program has two parts: strVector and TextQuery.
You can edit file.txt and query some word , just have a try!

strVector

Class design

Class strVector preallocates enough storage named capacity() to hold more elements than needed. And those storgae will be constructed until we need them. Constructed storage is named size().
Each action to add element checks if there is enough space available in capacity(). If not, we need to reallocate more storgae(2 times than before) and move elements to new strogae.
To obtain acceptable performance, we need an allocator to get raw memeory, construct/destory them, and free them at last.

[ 0 ][ 1 ][ 2 ][ 3 ][  unconstructed   ]
  ^                   ^                ^
elements           first_free         cap

Each StrVec will have three pointers into the space it uses for its elements:

  • elements, which points to the first element in the allocated memory
  • first_free, which points just after the last actual element
  • cap, which points just past the end of the allocated memory

Memory manage

constructor and destructor

When a strVector is constructed first time, elements, first_free and cap are all set to nullptr. And when destructed, we call member free().
Member free() need to destory element and deallocate memory.
We should check if element is nullptr, which means the vector is empty.
For convenience, destory begins at first_free. And because block that first_free points to is unconstructed, we need to move it forward before destorying.

[ 1 ][ 2 ][  ][  ]
      ^    ^
     --p   p
void strVector::free() {
  if (elements != nullptr) {
    for (auto p = first_free; p != elements; /* empty */)
      alloc.destroy(--p);
    alloc.deallocate(elements, capacity());
  }
}

element access

Recall vector in standard library, we can use an iterator to access element from both sides. begin() points to the first element, end() points to next block of the last element.
Add and remove element are restricted from back. Take push_back() as an example, we need to check if there is enough memeory. If so, we construct first_free and move it to next block.

void strVector::push_back(const string& s) {
  check_alloc();
  alloc.construct(first_free++, s);
}

element move

In member reallocate(), we need move element from old strVector to new strVector. But copying the data in these strings is unnecessary. Our StrVec’s performance will be much better if we can avoid the overhead of allocating and deallocating the strings themselves each time we reallocate.
To avoid copying, we use std::move, which is defined in <utility>

Text Query

Class design

Targets

  1. When it reads the input, it should remember every line that every word appears.
  2. When it generates output,
  3. The program must be able to fetch the line numbers associated with a given
    word
  4. The line numbers must appear in ascending order with no duplicates
  5. The program must be able to print the text appearing in the input file at a given line number.

Data structure

  • We use strVector to store every line of the text file and std::isstringstream to desperate word.
  • We use std::set to store the line number, with which we don't need to sort the line number.
  • We use std::map to associate a word with its line number(std::set).

Class declaration

To query word, we define a class TextQuery. It obtains a std::map and a std::vector. So searching finishes after the class being constructed.
And we define another class QueryResult to print search answer. It overrides the operator <<.

Details of definition

share_ptr

QueryResult need to express the searching answer, which is saved in TextQuery. If we copy a duplicate, it wastes time and space, which may cause a copy of the whole file. And it's not safe to return a pointer/iterator to another class, so share_ptr couldn't be a better choice. Auto pointers in both class pointed to the same storage, and it'll auomatically free memory.

class TextQuery {
/* other members */
  map<string, shared_ptr<set<line_no>>> file_map;
  shared_ptr<strVector> fptr_vec;
};

class QueryResult {
/* other members */
  shared_ptr<strVector> fptr_vec;
  shared_ptr<set<line_no>> lines;

Reference

Knowledge about allocator

<cstddef>

ptrdiff_t 用于被用于表示两个指针之差。
size_t 被用于表示任意一个对象(或者数组)的最大大小。
函数 size_t offsetof(type, number) 表示number在数据结构type中距离开始处的位置。
This macro with functional form returns the offset value in bytes of member member in the data structure or union type type. The value returned is an unsigned integral value of type size_t with the number of bytes between the specified member and the beginning of its structure.

std::cerr

std::cerr对应标准错误流,用于显示错误消息。默认情况下被关联到标准输出流,但它不被缓冲,也就说错误消息可以直接发送到显示器,而无需等到缓冲区或者新的换行符时,才被显示。一般情况下不被重定向。

void*

void指针可以指向任意类型的数据,就是说可以用任意类型的指针对void指针对void指针赋值。如果要将void指针p赋给其他类型的指针,则需要强制类型转换,就本例而言:a=(int *)p

在内存的分配中我们可以见到void指针使用:内存分配函数malloc()函数返回的指针就是void*型,用户在使用这个指针的时候,要进行强制类型转换,也就是显式说明该指针指向的内存中是存放的什么类型的数据(int*)malloc(1024)表示强制规定malloc返回的void*指针指向的内存中存放的是一个个的int型数据。

new_handler

new_handler set_new_handler (new_handler new_p) throw();
The new-handler function is a function which is called by the default allocation functions (operator new and operator new[]) when they fail to allocate storage.

Return value
The value of the current new-handler function if this has already been set by this function previously, or a null-pointer if this is the first call to set_new_handler (or if it was reset by a previous call).

C++中的new的主要工作如下:

  1. 调用malloc函数分配内存空间
  2. 如果是new一个对象的话,会调用改对象的构造函数进行初始化操作
  3. 返回对象的指针

iterator

Input Iterator:只能单步向前迭代元素,不允许修改由该类迭代器引用的元素。
Output Iterator:该类迭代器和Input Iterator极其相似,也只能单步向前迭代元素,不同的是该类迭代器对元素只有写的权力。
Forward Iterator:该类迭代器可以在一个正确的区间中进行读写操作,它拥有Input Iterator的所有特性,和Output Iterator的部分特性,以及单步向前迭代元素的能力。
Bidirectional Iterator:该类迭代器是在Forward Iterator的基础上提供了单步向后迭代元素的能力。
Random Access Iterator:该类迭代器能完成上面所有迭代器的工作,它自己独有的特性就是可以像指针那样进行算术计算,而不是仅仅只有单步向前或向后迭代。
这五类迭代器的从属关系,如下图所示,其中箭头A→B表示,A是B的强化类型,这也说明了如果一个算法要求B,那么A也可以应用于其中。

  input output  
    \ /  
  forward  
     |  
bidirectional  
     |  
random access  

new (p2) MyClass
does not allocate memory -- calls: operator new (sizeof(MyClass),p2)
but constructs an object at p2

Iterator traits

Traits class defining properties of iterators.
member description
difference_type Type to express the result of subtracting one iterator from another.
value_type The type of the element the iterator can point to.
pointer The type of a pointer to an element the iterator can point to.
reference The type of a reference to an element the iterator can point to.
iterator_category The iterator category. It can be one of these:
input_iterator_tag
output_iterator_tag
forward_iterator_tag
bidirectional_iterator_tag
random_access_iterator_tag

</tr>
<tr>
    <td>value_type</td>
    <td>The type of the element the iterator can point to.</td>
</tr>
<tr>
    <td>pointer</td>
    <td>The type of a pointer to an element the iterator can point to.</td>
</tr>
<tr>
    <td>reference</td>
    <td>The type of a reference to an element the iterator can point to.</td>

<tr>
    <td>iterator_category</td>
    <td>The iterator category. It can be one of these:  
    - input_iterator_tag  
    - output_iterator_tag  
    - forward_iterator_tag  
    - bidirectional_iterator_tag  
    - random_access_iterator_tag</td>  
</tr>
**member** **description**
difference_type Type to express the result of subtracting one iterator from another.

type traits

在STL中为了提供通用的操作而又不损失效率,我们用到了一种特殊的技巧,叫traits编程技巧。具体的来说,traits就是通过定义一些结构体或类,并利用模板类特化和偏特化的能力,给类型赋予一些特性,这些特性根据类型的不同而异。在程序设计中可以使用这些traits来判断一个类型的一些特性,引发C++的函数重载机制,实现同一种操作因类型不同而异的效果。traits的编程技巧极度弥补了C++语言的不足 。
(还是没有搞太懂...reference from cplusplus

Memory pool

Memory pool

Erc23


In STL, memory allocation is divided to 2 level. It means there are two versions of allocator. And we call it sub-allocation.

Level 1:

In level 1, allocate() uses malloc() directly, and deallocate() uses free() directly.

However, a new operation is expensive for it cost a lot of time to traversal the free blocks of the system heap and choose the best blocks for allocation. And sometimes, we allocate memory and free it, then we allocate some memory with nearly the same size again. Then we're wondering if we can use a pool to store those memory so that we don't need allocate from system every time.

Level 2:

In level 2, we build a memory pool to store those idle memory. This pool is not so big and if the allocation is huge, we have to allocate from system. When some memory need retrieving, we put it to the memory pool instead of giving it back to system.
Memory pool is just like your purse. You put some money to your purse so that you don't need to go to bank every time you need money. But if you need much, you have to go to bank.


So the allocation is divided into 3 steps:
Step 1: check if there's enough (128 bytes in standard) memory in memory pool (we get memory in memory pool by malloc()).
Step 2: if so, allocate memory from memory pool and finish. if not, go to Step 3.
Step 3: use malloc() to allocate memory from system.

And the deallocation is divided into 2 steps:
Step 1: put it to the memory pool
Step 2: after the whole program done, we free() all the memory in memory pool.


So how to build a memory pool get the memory in memory pool? I can only tell you how to build and get in STL but it's still difficult for me to build one. (I'm glad if someone could share your idea about it). I'll show a simple version of memory pool at the end of this article.

How to manage a memory pool (not finished yet)

In STL, a 16-nodes linked list is used to realize the memory pool. Nodes of the list is created through union. It means that if some nodes are used, they will be removed out of the linked list. That's brilliant!

union obj {
  union obj* free_list_link;
  char client_data[1];  // the client sees this
};

1

A simple version of memory pool

This is a very simple version of memory pool. It can only allocate/deallocate one block once and those blocks must be the same size. You can view the code in StackAlloc.hpp if you want to know more about the limit of this memory manager.
The core tasks of our memory pool are:

  1. Use memory pool to allocate
// decide if the memory pool is empty
if (_M_pool == NULL) {
  Node* temp;
  // allocate one block of memory using operator new
  void* p = operator new(sizeof(_Tp));
  // assert this node to memory pool's linked list
  temp = reinterpret_cast<Node* >(p);
  temp->next = _M_pool;
  _M_pool = temp;
}
// if the memory pool is not empty
// convert type Node* to pointer
pointer result = reinterpret_cast<pointer>(_M_pool);
_M_pool = _M_pool->next;
return result;
  1. Gather memory when deallocate (which will be free later)
// gather it directly
if (__p != NULL) {
  // type conversion
  reinterpret_cast<Node *>(__p)->next = _M_pool;
  _M_pool = reinterpret_cast<Node *>(__p);
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.