
1a1a11a / libCacheSim

155 stars · 7 watchers · 34 forks · 79.35 MB

a high performance library for building cache simulators

License: GNU General Public License v3.0

CMake 2.15% C++ 33.14% C 55.39% Shell 0.14% Python 9.18%
lru cache simulation performance cache-simulation fifo benchmark cache-simulator cachelib memcached

libCacheSim's People

Contributors

1a1a11a, bryanyuan1, byronhe, hachidd12, haochengxia, maoziming, midsterx, nwf-msr, pbhandar2, the-spellchecker, vishwanath1306, xiaguan, yazhuo, yifei-liu, ziyueqiu, zztaki


libCacheSim's Issues

How do you warm up?

[Juncheng Yang, Yao Yue, Rashmi Vinayak, A large scale analysis of hundreds of in-memory cache clusters at Twitter. 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20), 2020]

Section 6.2 (Simulation Setup) of this paper says the caches were warmed up for 5 days, but I don't see how to warm up a trace beyond calling open_trace when using the simulator. How can I perform warmup while following the Quickstart? I have downloaded the trace used in the experiment.

a bug in LFU-DA

example command

./bin/cachesim ../_build/data.csv csv lfuda 0.8,0.9 -t "time-col=1,obj-id-col=3,obj-size-col=4,delimiter=,,obj-id-is-num=0"

example data

# timestamp (ms), op type [Put:0, Get:1, Delete:2, Head: 3, Copy: 4], object id, object size (byte)
0,1,A,8371
103,1,B,31
159,1,C,3668
315,1,D,15572
418,1,E,62099
529,1,F,7621
881,1,G,1316252
956,1,H,104330
1026,1,I,4778

Using the "Setup a binary reader" section of the libCacheSim Quickstart document

Thank you for your answer to the last issue. I want to read a binary trace the same way, but I ran into a problem.

reader_init_param_t init_params_bin = {};
reader = setup_reader(filename, BIN_TRACE, OBJ_ID_NUM, &init_params_bin);

quickstart.md gives this example:

reader_init_param_t init_params_bin = {.binary_fmt="<3I2H2Q", .obj_size_field=2, .obj_id_field=6, };

but I don't understand exactly what to put in binary_fmt and the field indices. Can you help me?

Questions about running test.c

Hello

After installing it by following README.md, I run the resulting test.out file with
cache_t *cache = create_cache(name, common_cache_params, cache_specific_params);
LRU, FIFO, and LFU work well here, but if I pass "Random" or "LHD" as the name I get the error "cannot load internal cache". How should I write the name to use Random and LHD?

Set but unused variable

ccache_params_copy in Random.c is set but not used thereafter:

common_cache_params_t ccache_params_copy = ccache_params;
ccache_params_copy.hashpower = MAX(12, ccache_params_copy.hashpower - 8);
cache_t *cache = cache_struct_init("Random", ccache_params, cache_specific_params);

I suspect the ccache_params on line 51 wants to be ccache_params_copy instead?

ETA: Also

common_cache_params_t ccache_params_copy = ccache_params;
ccache_params_copy.hashpower = MAX(12, ccache_params_copy.hashpower - 8);
cache_t *cache =
cache_struct_init("RandomTwo", ccache_params, cache_specific_params);

Understanding the MRC curves

Hi,

I am currently using this library to understand the eviction algorithms that I should be exploring. However, both the MRCs over size and MRCs over time seem to indicate that Belady is the worst performer. Am I missing something here?
MRC over sizes: (image attached)

MRC over time: (image attached)

The commands that I have used to obtain the plots are as follows (similar to those found in https://github.com/1a1a11a/libCacheSim/tree/develop/scripts#plot-miss-ratio-curves, with belady added as an additional algorithm):

# plot miss ratio over sizes 
python3 plot_mrc_size.py \
--tracepath ../data/twitter_cluster52.csv --trace-format csv \
--trace-format-params="time-col=1,obj-id-col=2,obj-size-col=3,delimiter=,,obj-id-is-num=1" \
--algos=fifo,lru,lecar,s3fifo,belady

# plot miss ratio over time
python3 plot_mrc_time.py \
--tracepath ../data/twitter_cluster52.csv --trace-format csv \
--trace-format-params="time-col=1,obj-id-col=2,obj-size-col=3,delimiter=,,obj-id-is-num=1" \
--algos=fifo,lru,lecar,s3fifo,belady \
--report-interval 120

Documentation bug

The documentation in develop/doc/advanced_lib_extend.md says:

Add command line option in bin/cachesim/cli.c so that you can use cachesim binary.

There is no such file. I think one instead needs to add the new algorithm to the doc array in cli_parser.c and add a string-comparison clause with the new initialization function in cache_init.h.

Unable to run csv traces that have alphabetical object IDs

Hi,

I have traces of the following format (with alphabetical object IDs):

# timestamp (ms), op type [Put:0, Get:1, Delete:2, Head: 3, Copy: 4], object id, object size (byte)
0,1,A,8371
103,1,B,31
159,1,C,3668
315,1,D,15572
418,1,E,62099
529,1,F,7621
881,1,G,1316252
956,1,H,104330
1026,1,I,4778

And I tried to run this command:

python3 plot_mrc_size.py --tracepath ../../../shared/IBMObjectStorageTrace/IBMObjectStoreTrace018Part0-sliced-compressed --trace-format csv --trace-format-params="time-col=1,obj-id-col=3,obj-size-col=4,delimiter=,,obj-id-is-num=0" --algos=fifo,lru,clock,slru,lfu,lfuda,arc,twoq,gdsf,hyperbolic,lecar,cacheus,lhd,qdlp,s3fifo,sieve --sizes=0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9

However, I am faced with the following error:

Traceback (most recent call last):
  File "plot_mrc_size.py", line 229, in <module>
    plot_mrc_size(mrc_dict,
  File "plot_mrc_size.py", line 116, in plot_mrc_size
    first_size = int(list(mrc_dict.values())[0][0][0])
IndexError: list index out of range

Could you give me some pointers as to how I can fix this issue and run my trace?

Question on Op in vscsi trace

I have one question: when I look at the vscsi data structure and print the trace (length and offset),
typedef struct {
uint32_t sn;
uint32_t len;
uint32_t nSG;
uint16_t cmd;
uint16_t ver;
uint64_t lbn;
uint64_t ts;
} trace_v1_record_t;
which field indicates read or write?
I found that the simulator does not use the op (read or write). Thanks for the explanation!
Thank you so much

Trace analyzer space usage

Hi folks, I am using the trace analyzer for reuse distance on a 371 MB trace. The analyzer keeps running but never finishes, because it runs out of disk space. For example, last time I tried, my .reuseWindow_w300_rt file grew to 91 GB, filling up the remaining space in my file system. At that point the trace analyzer hung, neither quitting nor producing any info messages.

Is it normal to generate 91GB+ of .reuseWindow_w300_rt for a 371MB trace? Is there a way to run the trace analyzer differently so that it actually completes?

Cannot compile test.c by the gcc command in README

Got multiple errors and warnings while compiling test.c in the root directory of this repo by gcc $(pkg-config --cflags --libs libCacheSim glib-2.0) -lm -ldl test.c -o test. Here are the error messages:

gcc $(pkg-config --cflags --libs libCacheSim glib-2.0) -lm -ldl test.c -o test
test.c:6:20: error: initializer element is not constant
    6 | reader_t *reader = open_trace("data/trace.vscsi", VSCSI_TRACE, OBJ_ID_NUM, NULL);
      |                    ^~~~~~~~~~
test.c:9:18: error: initializer element is not constant
    9 | request_t *req = new_request();
      |                  ^~~~~~~~~~~
test.c:13:18: error: initializer element is not constant
   13 | cache_t *cache = create_cache("LRU", cc_params, NULL);
      |                  ^~~~~~~~~~~~
test.c:19:1: error: expected identifier or ‘(’ before ‘while’
   19 | while (read_one_req(reader, req) == 0) {
      | ^~~~~
test.c:27:1: warning: data definition has no type or storage class
   27 | close_trace(reader);
      | ^~~~~~~~~~~
test.c:27:1: warning: type defaults to ‘int’ in declaration of ‘close_trace’ [-Wimplicit-int]
test.c:27:1: warning: parameter names (without types) in function declaration
test.c:28:1: warning: data definition has no type or storage class
   28 | free_request(req);
      | ^~~~~~~~~~~~
test.c:28:1: warning: type defaults to ‘int’ in declaration of ‘free_request’ [-Wimplicit-int]
test.c:28:1: warning: parameter names (without types) in function declaration
test.c:28:1: error: conflicting types for ‘free_request’
In file included from /usr/local/include/libCacheSim/cacheObj.h:9,
                 from /usr/local/include/libCacheSim/cache.h:17,
                 from /usr/local/include/libCacheSim.h:13,
                 from test.c:3:
/usr/local/include/libCacheSim/request.h:71:20: note: previous definition of ‘free_request’ was here
   71 | static inline void free_request(request_t *req) {
      |                    ^~~~~~~~~~~~
test.c:29:6: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘->’ token
   29 | cache->cache_free(cache);
      |      ^~

This can be fixed by wrapping the statements in test.c in a main function. Also, it is better practice to give the build output (gcc -o option) a name other than test, since a directory named test already exists, which can trigger a `cannot open output file test: Is a directory` error.

Glib-ERROR

The following error occurred while the program was opening a trace.

What kind of problem is this?

(process:8155): GLib-ERROR **: 10:10:40.250: ../../../../glib/gmem.c:105: failed to allocate 49463296 bytes
Trace/breakpoint trap (core dumped)

Question: Understanding Units in Trace Data

Hi,
I want to ask two questions.

1. In trace.csv, what are the units of the columns: version, time (ms or s?), op, size (bytes), lbn (512-byte sectors?)?

2. In the struct below:

struct {
uint32_t timestamp;
uint64_t obj_id;
uint32_t obj_size;
int64_t next_access_vtime; // -1 if no next access
}

what are the units of timestamp and next_access_vtime? And if we already have timestamp, why do we need next_access_vtime?

thanks

S3Fifo article feedback

I attempted to reproduce the hit rate results in your blog article, where you observed that TinyLFU had lower performance than S3Fifo. I ran your workloads against the reference implementation, Caffeine.

Background

Caffeine uses the adaptive windowing technique, so this does not fully discount your observations. The paper recommended a 1% window as a starting point, since that works well in many critical workloads like database, search, and analytics. It concludes by showing workloads where a larger window is required, noting that adjusting it dynamically was left for future work. The subsequent work evaluating adaptive approaches appears to correct this deficiency and is what is most often used in practice (available in Java, Rust, Go, Python, TypeScript).

Caffeine starts with a 1% window, monitors the hit rate over the TinyLFU sample period, and uses hill climbing to adjust the window size. The number of frequency counters is not known upfront, so like a hash table the capacity grows based on the entry count. This resizing loses the popularity scores and the policy may have started at a poor window size. Thus, like a JIT compiler, during this warmup period the hit rate may be lower but it quickly reaches peak performance so that this becomes noise except in very short runs. The admission decisions vary slightly between runs due to jitter introduced to mitigate hash flooding attacks.

Analysis

I ran the two simulators side-by-side by comparing the LRU hit rates to ensure that the traces are being evaluated correctly. Caffeine's simulator does not have rich variable size support (entry's weight) as this is less common in Java due to the object's heap usage being mostly hidden from the developer. There is a fork that extends support in order to evaluate size-aware TinyLFU strategies (research paper) where they proposed an adaption to further improve the hit rate. I hope to someday evaluate that proposal and incorporate the improvements.

I used libcachesim at 0f4d135 (current master) and Caffeine at cb5f75d (current master) with this patch to include the new trace formats.

In some traces the simulators report different results for FIFO and LRU by up to 0.25%. I determined that this was because libcachesim does not update the entry's size on a cache hit but does record the byte hit with the new weight.

trace.csv

./cachesim ../../data/trace.csv csv fifo,lru,qdlp,s3fifo,sieve 1gb -t "time-col=2, obj-id-col=5, obj-size-col=4"
              Hit Rate               Byte Hit Rate
FIFO          36.64% (63.36% MR)     26.77% (73.23% MR)
LRU           36.96% (63.04% MR)     27.14% (72.86% MR)
QDLP          43.43% (56.57% MR)     35.99% (64.01% MR)
S3Fifo        38.22% (61.78% MR)     29.10% (70.90% MR)
Sieve         43.47% (56.53% MR)     36.03% (63.97% MR)
Caffeine      47.07% (52.93% MR)     38.22% (61.78% MR)

Caffeine leads Sieve (the best algorithm) by 3.60% HR / 2.19% BHR, and S3Fifo by 8.85% HR / 9.12% BHR.

twitter_cluster52.csv

./cachesim ../../data/twitter_cluster52.csv csv fifo,lru,qdlp,s3fifo,sieve 1mb -t "time-col=1, obj-id-col=2, obj-size-col=3"
              Hit Rate               Byte Hit Rate
FIFO          70.65% (29.35% MR)     68.89% (31.11% MR)
LRU           73.13% (26.87% MR)     71.53% (28.47% MR)
QDLP          75.63% (24.37% MR)     74.82% (25.18% MR)
S3Fifo        76.66% (23.34% MR)     75.99% (24.01% MR)
Sieve         74.86% (25.14% MR)     73.88% (26.12% MR)
Caffeine      75.91% (24.09% MR)     75.09% (24.91% MR)

Caffeine is beaten by S3Fifo (the best algorithm) by -0.75% HR / -0.90% BHR. That is very different from the -10% difference that was observed in the blog article.

hm_0.csv.gz

./cachesim ../../data/hm_0.csv csv fifo,lru,qdlp,s3fifo,sieve 1gb -t "time-col=2, obj-id-col=5, obj-size-col=6"
              Hit Rate               Byte Hit Rate
FIFO          72.86% (27.14% MR)     59.12% (40.88% MR)
LRU           73.76% (26.24% MR)     60.67% (39.33% MR)
QDLP          75.27% (24.73% MR)     63.82% (36.18% MR)
S3Fifo        75.22% (24.78% MR)     63.52% (36.48% MR)
Sieve         73.93% (26.07% MR)     62.34% (37.66% MR)
Caffeine      75.65% (24.35% MR)     64.27% (35.73% MR)

Caffeine leads QDLP (the best algorithm) by 0.38% HR / 0.45% BHR, and S3Fifo by 0.43% HR / 0.75% BHR.

Multithreading

The article states that LRU is a scalability bottleneck due to reordering under a lock. This assertion might be misleading because many caches decouple their eviction policy from the way that they manage concurrency, rendering LRU's characteristics here irrelevant. This allows the cache designer to consider a broader set of algorithms and data structures that may allow for better time and space efficiency, or that enable additional features. For example, Caffeine scaled linearly to 380M reads/s on 16 cores (2015 hardware). In comparison, Segcache's FIFO policy was reported at 70M reads/s on 24 cores (2020 hardware). The benchmarks may differ, but it shows that the policy's concurrency does not need to be a bottleneck and designers can focus on other areas to optimize.

Conclusion

In the workloads that were considered exemplary by this project and used for its policy designs, we can observe that adaptive W-TinyLFU (Caffeine) demonstrates competitive performance. It is important to compare reference implementations to avoid accidental differences and I have not attempted to debug your implementation to explain the degradation. I believe any static configuration will underperform in some scenario and that designers should continue to explore ways to make more effective and adaptive choices.

P.S. This is an issue since you disabled the discussion tab. You are welcome to close this after reading (self destructs in 10, 9, ...)

"obj-id-is-num" parameter doesn't support hex values

Inside tracePrint and traceConv.

Example input:
time,object,size,next_access_vtime
604038471,82131353f9ddc8c6,87,-1
604038471,a54f88ab02ece24d,0,-1
604038471,a54f88ab02ece24d,0,-1
604038471,a54f88ab02ece24d,0,-1
604038471,a54f88ab02ece24d,0,-1

tracePrint says "object id is not numeric a54f88ab02ece24d"

Support Mithril

I will add the Mithril algorithm that was mentioned in the previous issue.

Since Mithril is intended to enhance a cache for block storage or on-chip storage, the original algorithm has a confusing assumption (a fixed block_size, with all objects of equal size) that conflicts with most algorithms in this project, which support variable object sizes.

So I have done these changes.

  1. The block_size is 1 in the default setting, otherwise, set block_size in eviction_params such as -e "block-size=65536".
  2. For variable size, I set a hash_table cache_size_map that maps all requested objects to their size in order to set req->obj_size in prefetching. There seems to be a better approach here, like keeping track of object sizes in recording_table, mining_table and prefetch_table.
  3. Mithril supports the second-chance mechanism, so the base cache should provide a {base_cache}_to_evict interface.

Based on the above, I will submit a pull request. 😀

Support Belady on traces without future information

The current implementation of Belady's algorithm requires an oracleGeneral trace format, which has future access information. This design was chosen to reduce the memory requirement because we need to allocate an array of size N (number of requests) to store future access time. However, requiring the oracleGeneral trace makes ad-hoc experiments hard. It would be good to support Belady's MIN algorithm for non oracleGeneral traces, e.g., txt trace.

CloudPhysics block trace original format

Hi, can you provide the CloudPhysics block trace in its original format? I want to analyze it, but I cannot find a way to download a CloudPhysics block trace with offset and length.
Thank you so much!
