This CPA project provides an integration guide and tests for the Compound Page Allocator (CPA). CPA is an ION heap which allocates fixed-size large pages (> 4kB) from the system, which might not otherwise be guaranteed. Certain use-cases require larger contiguous regions of physical memory when mapped into a device IOMMU. For example, high resolution buffers require a greater number of page tables which can impact display hardware performance, especially when rotated. Also, memory might be protected at a fixed (and often large) physical address granule. Large pages can be allocated from a carveout but this region is exclusively reserved at boot-time and unavailable to the system. The Contiguous Memory Allocator (CMA) is a slight improvement but might incur delay while migrating pages back from the system.
CPA holds a configurable, pre-allocated pool of large pages to reduce the overhead of memory allocations. There are three water-marks (low, high and fill) and one asynchronous kernel thread to fill/drain the page pool. When the pool hits its low-mark, the asynchronous thread is triggered to start filling the page pool until it reaches the fill-mark level. When the page pool hits its high-mark (due to allocations being freed), freed pages are returned to the system instead of CPA's page pool. CPA also implements a memory shrink interface. This interface is used when the system is running low on memory, at which time pages in the pool are released back to the system, unlike the carveout heap.
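The water-mark behaviour described above can be sketched as a small user-space model. This is an illustration only and not code from ion_compound_page.c: the struct cpa_pool_model and the cpa_model_* functions are hypothetical names, the refill runs synchronously here (CPA uses an asynchronous kernel thread), and triggering the refill at or below the low-mark is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model of the CPA pool water-marks. */
struct cpa_pool_model {
	int lowmark;   /* refill is triggered at or below this level (assumption) */
	int fillmark;  /* level the refill aims for */
	int highmark;  /* maximum number of pages kept in the pool */
	int pooled;    /* pages currently held in the pool */
	int returned;  /* pages handed back to the system */
};

/* Stand-in for the asynchronous fill thread: top up to the fill-mark. */
void cpa_model_refill(struct cpa_pool_model *p)
{
	if (p->pooled < p->fillmark)
		p->pooled = p->fillmark;
}

/*
 * Take one page from the pool. Returns false if the pool was empty,
 * in which case a real allocator would have to go to the system.
 */
bool cpa_model_alloc(struct cpa_pool_model *p)
{
	bool from_pool = p->pooled > 0;

	if (from_pool)
		p->pooled--;
	if (p->pooled <= p->lowmark)
		cpa_model_refill(p);
	return from_pool;
}

/* Free one page: keep it in the pool unless the high-mark is reached. */
void cpa_model_free(struct cpa_pool_model *p)
{
	if (p->pooled >= p->highmark)
		p->returned++;  /* pool is full, page goes back to the system */
	else
		p->pooled++;
}

/* Stand-in for the shrink interface: release up to nr pooled pages. */
int cpa_model_shrink(struct cpa_pool_model *p, int nr)
{
	assert(nr >= 0);
	int n = nr < p->pooled ? nr : p->pooled;

	p->pooled -= n;
	p->returned += n;
	return n;
}
```

With all three marks set to 0 the model never refills and never retains freed pages, which corresponds to running with the pool disabled.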
The latest version of CPA (and its associated tests) should be integrated into a 3.18 Linux Kernel which contains ION. Porting might be required for other Linux Kernel versions.
CPA must be integrated in both the user-space and kernel-space. In order to allocate memory through CPA (as an ION heap), some platform specific integration is required within the ION driver. To enable use of CPA from an Android Gralloc HAL, the correct heap mask must be defined in user-space and match the kernel value. More information will be provided in the steps below.
The generic integration steps are as follows:
Obtain 3.18 kernel with ION and latest version of ion_compound_page.c (based on 3.18 kernel):
# Checkout 3.18 kernel from Linus' repo
git clone https://github.com/torvalds/linux.git
cd linux
git checkout v3.18
# Copy ion_compound_page.c into ION
cp ion_compound_page.c drivers/staging/android/ion/
Patch ION to fix issue with Juno DMA heap (commit):
patch --ignore-whitespace -p1 < ion.patch
--- a/drivers/staging/android/ion/ion.c
+++ b/drivers/staging/android/ion/ion.c
@@ -252,8 +252,19 @@ static struct ion_buffer *ion_buffer_create(struct ion_heap *heap,
 	   allocation via dma_map_sg. The implicit contract here is that
 	   memory comming from the heaps is ready for dma, ie if it has a
 	   cached mapping that mapping has been invalidated */
-	for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i)
-		sg_dma_address(sg) = sg_phys(sg);
+	for_each_sg(buffer->sg_table->sgl, sg, buffer->sg_table->nents, i) {
+		if(buffer && buffer->heap && buffer->heap->ops && buffer->heap->ops->phys) {
+			ion_phys_addr_t addr;
+			size_t len;
+			buffer->heap->ops->phys(buffer->heap, buffer, &addr, &len);
+			sg_dma_address(sg) = addr;
+			sg_dma_len(sg) = len;
+		} else {
+			sg_dma_address(sg) = sg_phys(sg);
+			sg_dma_len(sg) = sg->length;
+		}
+	}
+
 	mutex_lock(&dev->buffer_lock);
 	ion_buffer_add(dev, buffer);
 	mutex_unlock(&dev->buffer_lock);
NOTE: This is the minimum patch set required to run the tests, however other ION fixes from Linaro Juno 3.18 kernel (ION) might be desirable.
Integrate CPA in ION:
Add compound page configuration options to ion/Kconfig (see ION_COMPOUND_PAGE and ION_COMPOUND_PAGE_STATS):
config ION_COMPOUND_PAGE
	bool "Ion compound page system pool"
	depends on ION
	help
	  Enable use of compound pages (default to 2MB) system memory pool.
	  Backs a buffer with large and aligned pages where possible,
	  to ease TLB pressure and lessen memory fragmentation.

config ION_COMPOUND_PAGE_STATS
	bool "Collect statistics for the compound page pool"
	depends on ION_COMPOUND_PAGE && DEBUG_FS
	help
	  Collect extra usage statistics for the compound page pool.
	  Available via ion's debugfs entry for the pool.
Add compound page allocator to ion/Makefile:
obj-$(CONFIG_ION_COMPOUND_PAGE) += ion_compound_page.o
Add ION_HEAP_TYPE_COMPOUND_PAGE to the ion_heap_type enum (after ION_HEAP_TYPE_DMA) in ION uapi drivers/staging/android/uapi/ion.h
Expose compound page heap through ION core as per example:
Add the following code to ion_heap_create and ion_heap_destroy, respectively, in ion/ion_heap.c:
case ION_HEAP_TYPE_COMPOUND_PAGE:
	heap = ion_compound_page_pool_create(heap_data);
	break;

case ION_HEAP_TYPE_COMPOUND_PAGE:
	ion_compound_page_pool_destroy(heap);
	break;
Add following code to ion/ion_priv.h:
struct ion_heap *ion_compound_page_pool_create(struct ion_platform_heap *);
void ion_compound_page_pool_destroy(struct ion_heap *);
Declare compound page platform data ion_cpa_platform_data in ion/ion.h:
/**
 * struct ion_cpa_platform_data - settings for a cpa heap instance
 * @lowmark:     Lowest number of items on free list before refill is
 *               triggered
 * @highmark:    Maximum number of item on free list
 * @fillmark:    Number of items to target during a refill
 * @align_order: Order to round-up allocation sizes to
 * @order:       Order of the compound pages to break allocations into
 *
 * Provided as the priv data for a cpa heap
 */
struct ion_cpa_platform_data {
	int lowmark;
	int highmark;
	int fillmark;
	int align_order;
	int order;
};
Add compound page heap and platform data to ION device. There is no need to create a new device if dummy is already being used. See Juno example here: ion/juno/juno_ion_dev.c:
Add ION platform if one does not exist. This comprises a directory containing the following files:
<platform>/
├── <platform>_ion_dev.c
├── <platform>_ion_driver.c
├── Makefile
Juno example can be found here
Define the compound page ion_platform_heap as follows:
{
	.id = ION_HEAP_TYPE_COMPOUND_PAGE,
	.type = ION_HEAP_TYPE_COMPOUND_PAGE,
	.name = "compound_page",
	.priv = &cpa_config,
}
NOTE: ensure that the number of heaps nr in ion_platform_data is also updated.
Define compound page platform data ion_cpa_platform_data as follows:
static struct ion_cpa_platform_data cpa_config = {
	.lowmark = 8,
	.highmark = 128,
	.fillmark = 64,
	.align_order = 0,
	.order = 9,
};
NOTE: These values should be tuned for the platform. If the pool is set too large, the system might run into a low memory condition. The low/fill/high marks and the allocation/alignment page orders can be modified as per the ion_cpa_platform_data declaration here
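When tuning these values it can help to work out how much memory a configuration is able to hold in the pool. The sketch below is a user-space illustration under stated assumptions: a 4kB base page size, a hypothetical struct cpa_cfg_model that mirrors only the fields used here, and an ordering rule (lowmark <= fillmark <= highmark) that is assumed by this sketch rather than taken from the CPA sources.

```c
#include <assert.h>
#include <stdint.h>

#define BASE_PAGE_SIZE 4096u  /* assumption: 4kB base pages */

/* Hypothetical mirror of the ion_cpa_platform_data fields used here. */
struct cpa_cfg_model {
	int lowmark;
	int highmark;
	int fillmark;
	int order;
};

/* Size in bytes of one compound page of the given order. */
uint64_t cpa_page_bytes(int order)
{
	assert(order >= 0 && order < 32);
	return (uint64_t)BASE_PAGE_SIZE << order;
}

/* Upper bound on the memory held by the pool: highmark compound pages. */
uint64_t cpa_pool_max_bytes(const struct cpa_cfg_model *cfg)
{
	return (uint64_t)cfg->highmark * cpa_page_bytes(cfg->order);
}

/* Ordering check assumed by this sketch: 0 <= low <= fill <= high. */
int cpa_cfg_is_sane(const struct cpa_cfg_model *cfg)
{
	return cfg->lowmark >= 0 &&
	       cfg->lowmark <= cfg->fillmark &&
	       cfg->fillmark <= cfg->highmark;
}
```

For the example values above (order 9, highmark 128), each compound page is 2MB and the pool can hold up to 256MB. A refill to the fill-mark of 64 targets 128MB.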
Enable ION and CPA in the kernel configuration:
Add following to .conf (e.g. *android.conf):
CONFIG_ION=y
CONFIG_ION_<PLATFORM>=y
CONFIG_ION_COMPOUND_PAGE=y
CONFIG_ION_COMPOUND_PAGE_STATS=y
where <PLATFORM> is the name of the platform (e.g. CONFIG_ION_JUNO).
NOTE: CONFIG_ION_COMPOUND_PAGE_STATS is only required for statistics collection (useful when looking at CPA behaviour)
Add configuration option to enable ION platform in ion/Kconfig:
config ION_<PLATFORM>
	bool "Ion for <PLATFORM>"
	depends on ION
	help
	  ION support for <PLATFORM>.
where, for example, <PLATFORM> is JUNO
Add platform directory/files to ion/Makefile:
obj-$(CONFIG_ION_<PLATFORM>) += <platform>/
where, for example, <PLATFORM> is JUNO and <platform> is juno
Update external/kernel-headers/original/uapi/linux/ion.h in Android tree with compound page heap info:
Add ION_HEAP_TYPE_COMPOUND_PAGE to ion_heap_type:
NOTE: the order of heaps in ion_heap_type must match the kernel version of header drivers/staging/android/uapi/ion.h
Add compound page mask:
#define ION_HEAP_TYPE_COMPOUND_PAGE_MASK (1 << ION_HEAP_TYPE_COMPOUND_PAGE)
Run update_all.py to re-generate the bionic uapi bionic/libc/kernel/uapi/linux/ion.h
NOTE: if libclang-3.5.so can't be found by the script, try linking to the default (un-versioned):
ln -s <path-to-android-tree>/prebuilts/sdk/tools/linux/lib64/libclang.so <path-to-android-tree>/prebuilts/sdk/tools/linux/lib64/libclang-3.5.so
NOTE: check that the changes made in ion.h header are present in the newly generated header: bionic/libc/kernel/uapi/linux/ion.h
Copy the newly generated bionic ION uapi header to libion:
cp <path-to-android-tree>/bionic/libc/kernel/uapi/linux/ion.h <path-to-android-tree>/system/core/libion/kernel-headers/linux/ion.h
Use the ION compound page heap mask for all CPA allocations:
////// file: test.c
#include <linux/ion.h>

#if defined(ION_HEAP_TYPE_COMPOUND_PAGE_MASK)
ret = ion_alloc(ion_client, size, 0, ION_HEAP_TYPE_COMPOUND_PAGE_MASK, 0, &ion_hnd);
#else
#error "Compound page heap mask not defined"
#endif

#### file: Android.mk
LOCAL_SHARED_LIBRARIES += libion
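The reason the user-space and kernel enum orders must agree is that the mask is derived from the enum position, and the example heap definition uses that same value as the heap id. The sketch below illustrates this with stand-in names: the MODEL_ prefixed enum is an illustrative copy that assumes the stock 3.18 ordering with the compound page heap inserted after the DMA heap, and model_mask_selects_heap is a hypothetical helper. Real code should use the names from linux/ion.h.

```c
#include <assert.h>

/*
 * Illustrative copy of the heap type enum (assumed stock 3.18 order)
 * with the compound page heap inserted after the DMA heap.
 */
enum model_ion_heap_type {
	MODEL_ION_HEAP_TYPE_SYSTEM,
	MODEL_ION_HEAP_TYPE_SYSTEM_CONTIG,
	MODEL_ION_HEAP_TYPE_CARVEOUT,
	MODEL_ION_HEAP_TYPE_CHUNK,
	MODEL_ION_HEAP_TYPE_DMA,
	MODEL_ION_HEAP_TYPE_COMPOUND_PAGE,
	MODEL_ION_HEAP_TYPE_CUSTOM,
};

/* Same construction as ION_HEAP_TYPE_COMPOUND_PAGE_MASK. */
#define MODEL_ION_HEAP_TYPE_COMPOUND_PAGE_MASK \
	(1u << MODEL_ION_HEAP_TYPE_COMPOUND_PAGE)

/* Returns non-zero if heap_id_mask selects the heap with the given id. */
int model_mask_selects_heap(unsigned int heap_id_mask, unsigned int heap_id)
{
	assert(heap_id < 32);
	return (heap_id_mask & (1u << heap_id)) != 0;
}
```

If the enum entry were placed at a different position in only one of the two headers, the mask built in user space would select a different heap id from the one registered in the kernel.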
At present, CPA has only been integrated into a downstream version of the 3.18 Linux Kernel (Linaro Juno 3.18 kernel: https://git.linaro.org/landing-teams/working/arm/kernel.git/log/?h=arm-juno-mali-fpga). The generic instructions, above, are exemplified by the following commits for Juno:
- Add compound page heap to ION
- Expose CPA to ION core
- Add heap to Juno ION device
- Update CPA to v1
- Re-configure CPA for Juno ION device (to reflect changes in CPA)
WARNING: all steps are required since CPA was updated after first revision
(e.g. use of ion_platform_heap
private data for CPA heap changed between
3rd and 4th commit)
CPA can also show its current working state through the ION heap debug_show interface. This can be accessed by mounting debugfs (at /sys/kernel/debug, for example) and reading the heap's file under /sys/kernel/debug/ion/heaps/, which takes the heap name (compound_page for the heap defined above). This file contains performance data and module working state.
The CPA logging system can be enabled by setting CONFIG_ION_COMPOUND_PAGE_STATS in the kernel configuration.
Performance data and state from the heap's debugfs file are shown in the following format:
root@juno:/ # cat /sys/kernel/debug/ion/heaps/compound_page
client pid size
----------------------------------------------------
----------------------------------------------------
orphaned allocations (info is from last known client):
----------------------------------------------------
total orphaned 0
total 0
----------------------------------------------------
Free pool:
0 times depleted
0 page(s) in pool - 0 B (0)
0 partial(s) in use
Unused in partials - 0 B (0)
Partial bitmaps:
Shrink info:
Shrunk performed 0 time(s)
0 page(s) shrunk in total
Usage stats:
Max time spent to perform an allocation: 42452220 ns
Max time spent to allocate a single page from kernel: 32746920 ns
Soft alloc failures: 0
Hard alloc failures: 194
Allocations:
Total number of allocs seen: 1429
Live allocations: 0
Accumulated bytes requested: 2.84 GiB (3058683904)
Accumulated bytes committed: 2.84 GiB (3058683904)
Live bytes requested: 0 B (0)
Live bytes committed: 0 B (0)
Distribution:
0 page(s):
Total number of allocs seen: 3
Live allocations: 0
Accumulated bytes requested: 2.98 MiB (3133440)
Accumulated bytes committed: 2.98 MiB (3133440)
Live bytes requested: 0 B (0)
Live bytes committed: 0 B (0)
1 page(s):
Total number of allocs seen: 1425
Live allocations: 0
Accumulated bytes requested: 2.78 GiB (2988441600)
Accumulated bytes committed: 2.78 GiB (2988441600)
Live bytes requested: 0 B (0)
Live bytes committed: 0 B (0)
15 page(s):
Total number of allocs seen: 1
Live allocations: 0
Accumulated bytes requested: 64.0 MiB (67108864)
Accumulated bytes committed: 64.0 MiB (67108864)
Live bytes requested: 0 B (0)
Live bytes committed: 0 B (0)
NOTE: The compound page stats from above were captured immediately after running the CPA tests (as described below) on Juno platform.
There are two primary tests provided for CPA:
- Basic test
- Fragmentation problem test
Basic test:
This is a standalone user-space test which allocates compound pages through ION with mask ION_HEAP_TYPE_COMPOUND_PAGE_MASK. CPA allocations vary in size and it's important to confirm that each allocation is fulfilled by the ION CPA heap and that each 2MB granule is physically contiguous. All allocations are passed to the kernel module to verify that the large page size is 2MB.
Furthermore, it's necessary to ensure that CPA fails gracefully when system memory is exhausted. Errors such as a process hang or kernel panic can be checked for in this situation. The Android low memory killer should be disabled during this test to ensure that the system doesn't try to free memory while the test is running.
NOTE: Page order in ion_cpa_platform_data must be set to 9 for this test (which expects 2MB pages, 4kB * 2^9) or the kernel module updated to match expected page order.
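The kind of check described for the basic test can be sketched as follows. This is not the test module's actual code: granules_are_contiguous is a hypothetical helper that takes the page frame numbers of the base pages backing an allocation and checks that they form whole granules of 2^order pages, each physically contiguous. The natural alignment check is an additional assumption of this sketch, and allocations smaller than one granule are not covered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Check that an array of base-page frame numbers is made up of whole
 * granules of 2^order pages, each contiguous and naturally aligned.
 */
bool granules_are_contiguous(const uint64_t *pfns, size_t count,
			     unsigned int order)
{
	assert(order < 32);
	const uint64_t per_granule = (uint64_t)1 << order;

	if (count == 0 || count % per_granule != 0)
		return false;

	for (size_t i = 0; i < count; i++) {
		bool granule_start = (i % per_granule) == 0;

		/* The first page of each granule must be aligned (assumption). */
		if (granule_start && (pfns[i] & (per_granule - 1)) != 0)
			return false;
		/* Every other page must directly follow its predecessor. */
		if (!granule_start && pfns[i] != pfns[i - 1] + 1)
			return false;
	}
	return true;
}
```

For the 2MB pages expected by the test, order would be 9, giving 512 base pages of 4kB per granule.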
Fragmentation problem test:
This test validates that CPA behaves correctly when the system memory is heavily fragmented. CPA might fail to allocate large pages even if there is enough system memory free.
Test steps:
- The test kernel module implements a function which forces the memory system into a fragmented state by allocating and freeing pages.
- User space then makes 200 attempts to allocate 2MB pages through CPA. The test records how many allocations failed. This shall be referred to as t1.
- The test then attempts to free 200 page units held in the test kernel module.
- Step 2 is then repeated: the test makes another 200 attempts to allocate 2MB pages and records how many allocations failed. This shall be referred to as t2.
- The test compares the failure counts t1 and t2. If t2 is less than t1, CPA was affected by the fragmentation problem in step 2.
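The comparison in the final step can be sketched as below. Both functions are hypothetical helpers written for illustration and are not taken from the test sources: count_failures tallies the failed attempts in one batch, and cpa_affected_by_fragmentation applies the t2 < t1 rule.

```c
#include <assert.h>
#include <stdbool.h>

/* Count failed attempts in an array of results (true = success). */
int count_failures(const bool *results, int attempts)
{
	assert(attempts >= 0);
	int failures = 0;

	for (int i = 0; i < attempts; i++)
		if (!results[i])
			failures++;
	return failures;
}

/*
 * t1: failures while memory is fragmented.
 * t2: failures after the test module has freed some of its pages.
 * Fewer failures the second time means fragmentation caused the first ones.
 */
bool cpa_affected_by_fragmentation(int t1, int t2)
{
	return t2 < t1;
}
```

With the Juno figures in the example console output (194 failures, then 0), the rule reports that CPA was affected by fragmentation.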
CPA test code contains two parts:
- User-space native application
- Kernel-space module
Compile Android native user-space application as follows:
# Setup Android build environment
cd <path-to-android-tree>
source build/envsetup.sh
lunch <lunch-combo>
# Link to CPA Tests user code
mkdir -p <path-to-android-tree>/vendor/<vendor>/
ln -sf <path-to-cpa-tests>/test/user <path-to-android-tree>/vendor/<vendor>/cpa-test-user
# Build user-space application
cd <path-to-android-tree>/vendor/<vendor>/cpa-test-user
mm
test_cpa_user will be generated in <path-to-android-tree>/out/target/product/<device-name>/system/bin/
Compile Kernel space module with:
export CROSS_COMPILE=<path-to-compiler>
export ARCH=<arch>
export KDIR=<path-to-kernel>
cd <path-to-cpa-tests>/test/kernel
make
where, for example:
<arch>: arm or arm64
<path-to-compiler>: <path-to-aarch64-gcc>/bin/aarch64-linux-gnu- for 64-bit Arm
test_cpa_kernel.ko will be generated in the current directory.
In order to achieve consistent results the following should be taken into account:
NOTE: Low Memory Killer should be disabled for all the tests. This can be done by disabling the kernel configuration option CONFIG_ANDROID_LOW_MEMORY_KILLER.
NOTE: In order to achieve consistent results the CPA page pool mechanism should be disabled by setting lowmark, highmark and fillmark in struct ion_cpa_platform_data to 0. The configuration should look as follows:
struct ion_cpa_platform_data cpa_config = {
	.lowmark = 0,
	.highmark = 0,
	.fillmark = 0,
	.align_order = 0,
	.order = 9
};
NOTE: Before running test_cpa_user, it's recommended to stop all Android user-space services by executing the stop command from the Android shell. This eliminates any dynamic effect of the Android system.
Copy user-space application and kernel module to Android system:
adb push <path-to-android-tree>/out/target/product/<device-name>/system/bin/test_cpa_user /system/bin/
adb push <path-to-cpa-tests>/test/kernel/test_cpa_kernel.ko <path-to-module>/
Insert kernel module:
insmod <path-to-module>/test_cpa_kernel.ko
Run application:
test_cpa_user
Example console output:
root@juno:/ # test_cpa_user
CPA test start!!!
===================Test 1 START===================.
1. Verify CPA, Alloc 1KB from CPA success.Verify CPA memory success.
2. Verify CPA, Alloc 1024KB from CPA success.Verify CPA memory success.
3. Verify CPA, Alloc 2048KB from CPA success.Verify CPA memory success.
4. Verify CPA, Alloc 2031KB from CPA success.Verify CPA memory success.
5. Verify CPA, Alloc 65536KB from CPA success.Verify CPA memory success.
6. Try to allocate from CPA until system mem exhausts.
>>> Successfully exhausted system memory and no error happened.
===================Test 1 END===================.
===================Test 2 START===================.
Start to simulate memory fragment in test-cpa module.
>>> After simulate memory fragment, system free memory: 5979 MB, test allocated memory: 589 MB.
>>> System is in 2MB memory fragment situation.
>>> Try to allocate 2MB physical contiguous memory from CPA for 200 times.
>>> Failed 194 times. 12MB memory allocated.
>>> Try to free 3200 KB pages in test-cpa module.
>>> After free some tests allocated page, system free memory: 6027 MB, test allocated memory: 577 MB.
>>> Try to allocate 2MB physical contiguous memory from CPA for 200 times.
>>> Failed 0 times.400 MB memory allocated.
>>> Expected result: CPA is affected by memory fragment problem.
Stop to simulate memory fragment in test-cpa module.
===================Test 2 END===================.
CPA test end!!!
This project is licensed under GPL-2.0.
Contributions are accepted under GPL-2.0. Only submit contributions where you have authored all of the code. Ensure that your employer, where applicable, consents.