diff options
Diffstat (limited to 'vendor/github.com/klauspost/compress/zstd/README.md')
-rw-r--r-- | vendor/github.com/klauspost/compress/zstd/README.md | 206 |
1 files changed, 113 insertions, 93 deletions
diff --git a/vendor/github.com/klauspost/compress/zstd/README.md b/vendor/github.com/klauspost/compress/zstd/README.md index f2a80b5d0..ac3640dc9 100644 --- a/vendor/github.com/klauspost/compress/zstd/README.md +++ b/vendor/github.com/klauspost/compress/zstd/README.md @@ -5,11 +5,9 @@ It offers a very wide range of compression / speed trade-off, while being backed A high performance compression algorithm is implemented. For now focused on speed. This package provides [compression](#Compressor) to and [decompression](#Decompressor) of Zstandard content. -Note that custom dictionaries are not supported yet, so if your code relies on that, -you cannot use the package as-is. +Note that custom dictionaries are only supported for decompression. This package is pure Go and without use of "unsafe". -If a significant speedup can be achieved using "unsafe", it may be added as an option later. The `zstd` package is provided as open source software using a Go standard license. @@ -142,80 +140,96 @@ Using the Encoder for both a stream and individual blocks concurrently is safe. I have collected some speed examples to compare speed and compression against other compressors. * `file` is the input file. -* `out` is the compressor used. `zskp` is this package. `gzstd` is gzip standard library. `zstd` is the Datadog cgo library. +* `out` is the compressor used. `zskp` is this package. `zstd` is the Datadog cgo library. `gzstd/gzkp` is gzip standard and this library. * `level` is the compression level used. For `zskp` level 1 is "fastest", level 2 is "default". * `insize`/`outsize` is the input/output size. * `millis` is the number of milliseconds used for compression. * `mb/s` is megabytes (2^20 bytes) per second. ``` -The test data for the Large Text Compression Benchmark is the first -10^9 bytes of the English Wikipedia dump on Mar. 3, 2006. -http://mattmahoney.net/dc/textdata.html - -file out level insize outsize millis mb/s -enwik9 zskp 1 1000000000 343833033 5840 163.30 -enwik9 zskp 2 1000000000 317822183 8449 112.87 -enwik9 gzstd 1 1000000000 382578136 13627 69.98 -enwik9 gzstd 3 1000000000 349139651 22344 42.68 -enwik9 zstd 1 1000000000 357416379 4838 197.12 -enwik9 zstd 3 1000000000 313734522 7556 126.21 +Silesia Corpus: +http://sun.aei.polsl.pl/~sdeor/corpus/silesia.zip -GOB stream of binary data. Highly compressible. -https://files.klauspost.com/compress/gob-stream.7z +This package: +file out level insize outsize millis mb/s +silesia.tar zskp 1 211947520 73101992 643 313.87 +silesia.tar zskp 2 211947520 67504318 969 208.38 +silesia.tar zskp 3 211947520 65177448 1899 106.44 -file out level insize outsize millis mb/s -gob-stream zskp 1 1911399616 234981983 5100 357.42 -gob-stream zskp 2 1911399616 208674003 6698 272.15 -gob-stream gzstd 1 1911399616 357382641 14727 123.78 -gob-stream gzstd 3 1911399616 327835097 17005 107.19 -gob-stream zstd 1 1911399616 250787165 4075 447.22 -gob-stream zstd 3 1911399616 208191888 5511 330.77 - -Highly compressible JSON file. Similar to logs in a lot of ways. -https://files.klauspost.com/compress/adresser.001.gz - -file out level insize outsize millis mb/s -adresser.001 zskp 1 1073741824 18510122 1477 692.83 -adresser.001 zskp 2 1073741824 19831697 1705 600.59 -adresser.001 gzstd 1 1073741824 47755503 3079 332.47 -adresser.001 gzstd 3 1073741824 40052381 3051 335.63 -adresser.001 zstd 1 1073741824 16135896 994 1030.18 -adresser.001 zstd 3 1073741824 17794465 905 1131.49 +cgo zstd: +silesia.tar zstd 1 211947520 73605392 543 371.56 +silesia.tar zstd 3 211947520 66793289 864 233.68 +silesia.tar zstd 6 211947520 62916450 1913 105.66 -VM Image, Linux mint with a few installed applications: -https://files.klauspost.com/compress/rawstudio-mint14.7z +gzip, stdlib/this package: +silesia.tar gzstd 1 211947520 80007735 1654 122.21 +silesia.tar gzkp 1 211947520 80369488 1168 173.06 -file out level insize outsize millis mb/s -rawstudio-mint14.tar zskp 1 8558382592 3648168838 33398 244.38 -rawstudio-mint14.tar zskp 2 8558382592 3376721436 50962 160.16 -rawstudio-mint14.tar gzstd 1 8558382592 3926257486 84712 96.35 -rawstudio-mint14.tar gzstd 3 8558382592 3740711978 176344 46.28 -rawstudio-mint14.tar zstd 1 8558382592 3607859742 27903 292.51 -rawstudio-mint14.tar zstd 3 8558382592 3341710879 46700 174.77 +GOB stream of binary data. Highly compressible. +https://files.klauspost.com/compress/gob-stream.7z +file out level insize outsize millis mb/s +gob-stream zskp 1 1911399616 235022249 3088 590.30 +gob-stream zskp 2 1911399616 205669791 3786 481.34 +gob-stream zskp 3 1911399616 185792019 9324 195.48 +gob-stream zstd 1 1911399616 249810424 2637 691.26 +gob-stream zstd 3 1911399616 208192146 3490 522.31 +gob-stream zstd 6 1911399616 193632038 6687 272.56 +gob-stream gzstd 1 1911399616 357382641 10251 177.82 +gob-stream gzkp 1 1911399616 362156523 5695 320.08 -The test data is designed to test archivers in realistic backup scenarios. -http://mattmahoney.net/dc/10gb.html +The test data for the Large Text Compression Benchmark is the first +10^9 bytes of the English Wikipedia dump on Mar. 3, 2006. +http://mattmahoney.net/dc/textdata.html -file out level insize outsize millis mb/s -10gb.tar zskp 1 10065157632 4883149814 45715 209.97 -10gb.tar zskp 2 10065157632 4638110010 60970 157.44 -10gb.tar gzstd 1 10065157632 5198296126 97769 98.18 -10gb.tar gzstd 3 10065157632 4932665487 313427 30.63 -10gb.tar zstd 1 10065157632 4940796535 40391 237.65 -10gb.tar zstd 3 10065157632 4638618579 52911 181.42 +file out level insize outsize millis mb/s +enwik9 zskp 1 1000000000 343848582 3609 264.18 +enwik9 zskp 2 1000000000 317276632 5746 165.97 +enwik9 zskp 3 1000000000 294540704 11725 81.34 +enwik9 zstd 1 1000000000 358072021 3110 306.65 +enwik9 zstd 3 1000000000 313734672 4784 199.35 +enwik9 zstd 6 1000000000 295138875 10290 92.68 +enwik9 gzstd 1 1000000000 382578136 9604 99.30 +enwik9 gzkp 1 1000000000 383825945 6544 145.73 + +Highly compressible JSON file. +https://files.klauspost.com/compress/github-june-2days-2019.json.zst + +file out level insize outsize millis mb/s +github-june-2days-2019.json zskp 1 6273951764 699045015 10620 563.40 +github-june-2days-2019.json zskp 2 6273951764 617881763 11687 511.96 +github-june-2days-2019.json zskp 3 6273951764 537511906 29252 204.54 +github-june-2days-2019.json zstd 1 6273951764 766284037 8450 708.00 +github-june-2days-2019.json zstd 3 6273951764 661889476 10927 547.57 +github-june-2days-2019.json zstd 6 6273951764 642756859 22996 260.18 +github-june-2days-2019.json gzstd 1 6273951764 1164400847 29948 199.79 +github-june-2days-2019.json gzkp 1 6273951764 1128755542 19236 311.03 -Silesia Corpus: -http://sun.aei.polsl.pl/~sdeor/corpus/silesia.zip +VM Image, Linux mint with a few installed applications: +https://files.klauspost.com/compress/rawstudio-mint14.7z -file out level insize outsize millis mb/s -silesia.tar zskp 1 211947520 73025800 1108 182.26 -silesia.tar zskp 2 211947520 67674684 1599 126.41 -silesia.tar gzstd 1 211947520 80007735 2515 80.37 -silesia.tar gzstd 3 211947520 73133380 4259 47.45 -silesia.tar zstd 1 211947520 73513991 933 216.64 -silesia.tar zstd 3 211947520 66793301 1377 146.79 +file out level insize outsize millis mb/s +rawstudio-mint14.tar zskp 1 8558382592 3667489370 20210 403.84 +rawstudio-mint14.tar zskp 2 8558382592 3364592300 31873 256.07 +rawstudio-mint14.tar zskp 3 8558382592 3224594213 71751 113.75 +rawstudio-mint14.tar zstd 1 8558382592 3609250104 17136 476.27 +rawstudio-mint14.tar zstd 3 8558382592 3341679997 29262 278.92 +rawstudio-mint14.tar zstd 6 8558382592 3235846406 77904 104.77 +rawstudio-mint14.tar gzstd 1 8558382592 3926257486 57722 141.40 +rawstudio-mint14.tar gzkp 1 8558382592 3970463184 41749 195.49 + +CSV data: +https://files.klauspost.com/compress/nyc-taxi-data-10M.csv.zst + +file out level insize outsize millis mb/s +nyc-taxi-data-10M.csv zskp 1 3325605752 641339945 8925 355.35 +nyc-taxi-data-10M.csv zskp 2 3325605752 591748091 11268 281.44 +nyc-taxi-data-10M.csv zskp 3 3325605752 538490114 19880 159.53 +nyc-taxi-data-10M.csv zstd 1 3325605752 687399637 8233 385.18 +nyc-taxi-data-10M.csv zstd 3 3325605752 598514411 10065 315.07 +nyc-taxi-data-10M.csv zstd 6 3325605752 570522953 20038 158.27 +nyc-taxi-data-10M.csv gzstd 1 3325605752 928656485 23876 132.83 +nyc-taxi-data-10M.csv gzkp 1 3325605752 924718719 16388 193.53 ``` ### Converters @@ -315,13 +329,13 @@ Data compressed with [dictionaries](https://github.com/facebook/zstd#the-case-fo Dictionaries are added individually to Decoders. Dictionaries are generated by the `zstd --train` command and contains an initial state for the decoder. -To add a dictionary use the `RegisterDict(data)` with the dictionary data before starting any decompression. +To add a dictionary use the `WithDecoderDicts(dicts ...[]byte)` option with the dictionary data. +Several dictionaries can be added at once. The dictionary will be used automatically for the data that specifies them. - A re-used Decoder will still contain the dictionaries registered. -When registering a dictionary with the same ID it will override the existing. +When registering multiple dictionaries with the same ID, the last one will be used. ### Allocation-less operation @@ -364,36 +378,42 @@ These are some examples of performance compared to [datadog cgo library](https:/ The first two are streaming decodes and the last are smaller inputs. ``` -BenchmarkDecoderSilesia-8 20 642550210 ns/op 329.85 MB/s 3101 B/op 8 allocs/op -BenchmarkDecoderSilesiaCgo-8 100 384930000 ns/op 550.61 MB/s 451878 B/op 9713 allocs/op - -BenchmarkDecoderEnwik9-2 10 3146000080 ns/op 317.86 MB/s 2649 B/op 9 allocs/op -BenchmarkDecoderEnwik9Cgo-2 20 1905900000 ns/op 524.69 MB/s 1125120 B/op 45785 allocs/op - -BenchmarkDecoder_DecodeAll/z000000.zst-8 200 7049994 ns/op 138.26 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000001.zst-8 100000 19560 ns/op 97.49 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000002.zst-8 5000 297599 ns/op 236.99 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000003.zst-8 2000 725502 ns/op 141.17 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000004.zst-8 200000 9314 ns/op 54.54 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000005.zst-8 10000 137500 ns/op 104.72 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000006.zst-8 500 2316009 ns/op 206.06 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000007.zst-8 20000 64499 ns/op 344.90 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000008.zst-8 50000 24900 ns/op 219.56 MB/s 40 B/op 2 allocs/op -BenchmarkDecoder_DecodeAll/z000009.zst-8 1000 2348999 ns/op 154.01 MB/s 40 B/op 2 allocs/op - -BenchmarkDecoder_DecodeAllCgo/z000000.zst-8 500 4268005 ns/op 228.38 MB/s 1228849 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000001.zst-8 100000 15250 ns/op 125.05 MB/s 2096 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000002.zst-8 10000 147399 ns/op 478.49 MB/s 73776 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000003.zst-8 5000 320798 ns/op 319.27 MB/s 139312 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000004.zst-8 200000 10004 ns/op 50.77 MB/s 560 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000005.zst-8 20000 73599 ns/op 195.64 MB/s 19120 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000006.zst-8 1000 1119003 ns/op 426.48 MB/s 557104 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000007.zst-8 20000 103450 ns/op 215.04 MB/s 71296 B/op 9 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000008.zst-8 100000 20130 ns/op 271.58 MB/s 6192 B/op 3 allocs/op -BenchmarkDecoder_DecodeAllCgo/z000009.zst-8 2000 1123500 ns/op 322.00 MB/s 368688 B/op 3 allocs/op +BenchmarkDecoderSilesia-8 3 385000067 ns/op 550.51 MB/s 5498 B/op 8 allocs/op +BenchmarkDecoderSilesiaCgo-8 6 197666567 ns/op 1072.25 MB/s 270672 B/op 8 allocs/op + +BenchmarkDecoderEnwik9-8 1 2027001600 ns/op 493.34 MB/s 10496 B/op 18 allocs/op +BenchmarkDecoderEnwik9Cgo-8 2 979499200 ns/op 1020.93 MB/s 270672 B/op 8 allocs/op + +Concurrent performance: + +BenchmarkDecoder_DecodeAllParallel/kppkn.gtb.zst-16 28915 42469 ns/op 4340.07 MB/s 114 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/geo.protodata.zst-16 116505 9965 ns/op 11900.16 MB/s 16 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/plrabn12.txt.zst-16 8952 134272 ns/op 3588.70 MB/s 915 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/lcet10.txt.zst-16 11820 102538 ns/op 4161.90 MB/s 594 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/asyoulik.txt.zst-16 34782 34184 ns/op 3661.88 MB/s 60 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/alice29.txt.zst-16 27712 43447 ns/op 3500.58 MB/s 99 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/html_x_4.zst-16 62826 18750 ns/op 21845.10 MB/s 104 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/paper-100k.pdf.zst-16 631545 1794 ns/op 57078.74 MB/s 2 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/fireworks.jpeg.zst-16 1690140 712 ns/op 172938.13 MB/s 1 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/urls.10K.zst-16 10432 113593 ns/op 6180.73 MB/s 1143 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/html.zst-16 113206 10671 ns/op 9596.27 MB/s 15 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallel/comp-data.bin.zst-16 1530615 779 ns/op 5229.49 MB/s 0 B/op 0 allocs/op + +BenchmarkDecoder_DecodeAllParallelCgo/kppkn.gtb.zst-16 65217 16192 ns/op 11383.34 MB/s 46 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/geo.protodata.zst-16 292671 4039 ns/op 29363.19 MB/s 6 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/plrabn12.txt.zst-16 26314 46021 ns/op 10470.43 MB/s 293 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/lcet10.txt.zst-16 33897 34900 ns/op 12227.96 MB/s 205 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/asyoulik.txt.zst-16 104348 11433 ns/op 10949.01 MB/s 20 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/alice29.txt.zst-16 75949 15510 ns/op 9805.60 MB/s 32 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/html_x_4.zst-16 173910 6756 ns/op 60624.29 MB/s 37 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/paper-100k.pdf.zst-16 923076 1339 ns/op 76474.87 MB/s 1 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/fireworks.jpeg.zst-16 922920 1351 ns/op 91102.57 MB/s 2 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/urls.10K.zst-16 27649 43618 ns/op 16096.19 MB/s 407 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/html.zst-16 279073 4160 ns/op 24614.18 MB/s 6 B/op 0 allocs/op +BenchmarkDecoder_DecodeAllParallelCgo/comp-data.bin.zst-16 749938 1579 ns/op 2581.71 MB/s 0 B/op 0 allocs/op ``` -This reflects the performance around May 2019, but this may be out of date. +This reflects the performance around May 2020, but this may be out of date. # Contributions |