
Mercurial > core / annotate lisp/ffi/zstd/dict.lisp

changeset 697: 08621be7e780
parent: 38e9c3be2392
author: Richard Westhaver <ellis@rwest.io>
date: Fri, 04 Oct 2024 21:45:59 -0400
permissions: -rw-r--r--
description: alien C updates
;;; dict.lisp --- Zstd Dictionary API

;;

;;; Commentary:

;; The CDict can be created once and shared across multiple threads since it's
;; read-only.

;; Unclear if DDict is also read-only.

;; From zdict.h:
#|
 * Zstd dictionary builder
 *
 * FAQ
 * ===
 * Why should I use a dictionary?
 * ------------------------------
 *
 * Zstd can use dictionaries to improve compression ratio of small data.
 * Traditionally small files don't compress well because there is very little
 * repetition in a single sample, since it is small. But, if you are compressing
 * many similar files, like a bunch of JSON records that share the same
 * structure, you can train a dictionary ahead of time on some samples of
 * these files. Then, zstd can use the dictionary to find repetitions that are
 * present across samples. This can vastly improve compression ratio.
 *
 * When is a dictionary useful?
 * ----------------------------
 *
 * Dictionaries are useful when compressing many small files that are similar.
 * The larger a file is, the less benefit a dictionary will have. Generally,
 * we don't expect dictionary compression to be effective past 100KB. And the
 * smaller a file is, the more we would expect the dictionary to help.
 *
 * How do I use a dictionary?
 * --------------------------
 *
 * Simply pass the dictionary to the zstd compressor with
 * `ZSTD_CCtx_loadDictionary()`. The same dictionary must then be passed to
 * the decompressor, using `ZSTD_DCtx_loadDictionary()`. There are other
 * more advanced functions that allow selecting some options, see zstd.h for
 * complete documentation.
 *
 * What is a zstd dictionary?
 * --------------------------
 *
 * A zstd dictionary has two pieces: Its header, and its content. The header
 * contains a magic number, the dictionary ID, and entropy tables. These
 * entropy tables allow zstd to save on header costs in the compressed file,
 * which really matters for small data. The content is just bytes, which are
 * repeated content that is common across many samples.
 *
 * What is a raw content dictionary?
 * ---------------------------------
 *
 * A raw content dictionary is just bytes. It doesn't have a zstd dictionary
 * header, a dictionary ID, or entropy tables. Any buffer is a valid raw
 * content dictionary.
 *
 * How do I train a dictionary?
 * ----------------------------
 *
 * Gather samples from your use case. These samples should be similar to each
 * other. If you have several use cases, you could try to train one dictionary
 * per use case.
 *
 * Pass those samples to `ZDICT_trainFromBuffer()` and that will train your
 * dictionary. There are a few advanced versions of this function, but this
 * is a great starting point. If you want to further tune your dictionary
 * you could try `ZDICT_optimizeTrainFromBuffer_cover()`. If that is too slow
 * you can try `ZDICT_optimizeTrainFromBuffer_fastCover()`.
 *
 * If the dictionary training function fails, that is likely because you
 * either passed too few samples, or a dictionary would not be effective
 * for your data. Look at the messages that the dictionary trainer printed;
 * if it doesn't say too few samples, then a dictionary would not be effective.
 *
 * How large should my dictionary be?
 * ----------------------------------
 *
 * A reasonable dictionary size, the `dictBufferCapacity`, is about 100KB.
 * The zstd CLI defaults to a 110KB dictionary. You likely don't need a
 * dictionary larger than that. But, most use cases can get away with a
 * smaller dictionary. The advanced dictionary builders can automatically
 * shrink the dictionary for you, and select the smallest size that doesn't
 * hurt compression ratio too much. See the `shrinkDict` parameter.
 * A smaller dictionary can save memory, and potentially speed up
 * compression.
 *
 * How many samples should I provide to the dictionary builder?
 * ------------------------------------------------------------
 *
 * We generally recommend passing ~100x the size of the dictionary
 * in samples. A few thousand should suffice. Having too few samples
 * can hurt the dictionary's effectiveness. Having more samples will
 * only improve the dictionary's effectiveness. But having too many
 * samples can slow down the dictionary builder.
 *
 * How do I determine if a dictionary will be effective?
 * -----------------------------------------------------
 *
 * Simply train a dictionary and try it out. You can use zstd's built-in
 * benchmarking tool to test the dictionary effectiveness.
 *
 *   # Benchmark levels 1-3 without a dictionary
 *   zstd -b1e3 -r /path/to/my/files
 *   # Benchmark levels 1-3 with a dictionary
 *   zstd -b1e3 -r /path/to/my/files -D /path/to/my/dictionary
 *
 * When should I retrain a dictionary?
 * -----------------------------------
 *
 * You should retrain a dictionary when its effectiveness drops. Dictionary
 * effectiveness drops as the data you are compressing changes. Generally, we do
 * expect dictionaries to "decay" over time, as your data changes, but the rate
 * at which they decay depends on your use case. Internally, we regularly
 * retrain dictionaries, and if the new dictionary performs significantly
 * better than the old dictionary, we will ship the new dictionary.
 *
 * I have a raw content dictionary, how do I turn it into a zstd dictionary?
 * -------------------------------------------------------------------------
 *
 * If you have a raw content dictionary, e.g. by manually constructing it, or
 * using a third-party dictionary builder, you can turn it into a zstd
 * dictionary by using `ZDICT_finalizeDictionary()`. You'll also have to
 * provide some samples of the data. It will add the zstd header to the
 * raw content, which contains a dictionary ID and entropy tables, which
 * will improve compression ratio, and allow zstd to write the dictionary ID
 * into the frame, if you so choose.
 *
 * Do I have to use zstd's dictionary builder?
 * -------------------------------------------
 *
 * No! You can construct dictionary content however you please; it is just
 * bytes. It will always be valid as a raw content dictionary. If you want
 * a zstd dictionary, which can improve compression ratio, use
 * `ZDICT_finalizeDictionary()`.
 *
 * What is the attack surface of a zstd dictionary?
 * ------------------------------------------------
 *
 * Zstd is heavily fuzz tested, including loading fuzzed dictionaries, so
 * zstd should never crash, or access out-of-bounds memory no matter what
 * the dictionary is. However, if an attacker can control the dictionary
 * during decompression, they can cause zstd to generate arbitrary bytes,
 * just like if they controlled the compressed data.
 *
 ******************************************************************************/

 
/*! ZDICT_trainFromBuffer():
 *  Train a dictionary from an array of samples.
 *  Redirects to ZDICT_optimizeTrainFromBuffer_fastCover() single-threaded, with d=8, steps=4,
 *  f=20, and accel=1.
 *  Samples must be stored concatenated in a single flat buffer `samplesBuffer`,
 *  supplied with an array of sizes `samplesSizes`, providing the size of each sample, in order.
 *  The resulting dictionary will be saved into `dictBuffer`.
 * @return: size of dictionary stored into `dictBuffer` (<= `dictBufferCapacity`)
 *          or an error code, which can be tested with ZDICT_isError().
 *  Note:  Dictionary training will fail if there are not enough samples to construct a
 *         dictionary, or if most of the samples are too small (< 8 bytes being the lower limit).
 *         If dictionary training fails, you should use zstd without a dictionary, as the dictionary
 *         would've been ineffective anyway. If you believe your samples would benefit from a dictionary,
 *         please open an issue with details, and we can look into it.
 *  Note: ZDICT_trainFromBuffer()'s memory usage is about 6 MB.
 *  Tips: In general, a reasonable dictionary has a size of ~ 100 KB.
 *        It's possible to select a smaller or larger size, just by specifying `dictBufferCapacity`.
 *        In general, it's recommended to provide a few thousand samples, though this can vary a lot.
 *        It's recommended that the total size of all samples be about 100 times the target size of the dictionary.
 */
|#
;;; Code:
 (in-package :zstd)
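
;; The zdict.h notes quoted in the Commentary say that training samples must
;; be concatenated into a single flat buffer, accompanied by an array giving
;; the size of each sample in order. What follows is a minimal sketch of that
;; packing step in portable Common Lisp, for illustration only.
;; EXAMPLE-PACK-SAMPLES is a hypothetical name: it is not provided by zstd or
;; by this library, and it does not itself call ZDICT_trainFromBuffer().
(defun example-pack-samples (samples)
  "Concatenate SAMPLES, a list of octet vectors, into one flat octet vector.
Return two values: the flat buffer and a vector of per-sample sizes, in order."
  (let* ((sizes (map 'vector #'length samples))      ; one size per sample
         (buffer (make-array (reduce #'+ sizes)      ; total byte count
                             :element-type '(unsigned-byte 8)))
         (offset 0))
    (dolist (sample samples)
      ;; Copy each sample to the current write position, then advance.
      (replace buffer sample :start1 offset)
      (incf offset (length sample)))
    (values buffer sizes)))
;; Usage sketch: (example-pack-samples (list #(1 2 3) #(4 5)))
;; returns the buffer #(1 2 3 4 5) and the sizes #(3 2).
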
(deferror zstd-ddict-error (zstd-alien-error) ())
(deferror zstd-cdict-error (zstd-alien-error)
    ()
    (:report (lambda (c s)
               (format s "ZSTD CDict signalled error: ~A" (zstd-errorcode* (zstd-error-code c))))))
 
469
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
182
 (define-alien-enum (zstd-dict-content-type int)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
183
                    :auto 0
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
184
                    :raw-content 1
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
185
                    :full-dict 2)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
186
 
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
187
 (define-alien-enum (zstd-dict-load-method int)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
188
                    :by-copy 0
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
189
                    :by-ref 1)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
190
 
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
191
 (define-alien-enum (zstd-force-ignore-checksum int)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
192
                    :validate-checksum 0
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
193
                    :ignore-checksum 1)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
194
 
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
195
 (define-alien-enum (zstd-ref-multiple-ddicts int)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
196
                    :ref-single-ddict 0
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
197
                    :ref-multiple-ddicts 1)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
198
 
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
199
 (define-alien-enum (zstd-dict-attach-pref int)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
200
                    :default-attach 0
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
201
                    :force-attach 1
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
202
                    :force-copy 2
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
203
                    :force-load 3)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
204
 
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
205
 (define-alien-enum (zstd-literal-compression-mode int)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
206
                    :auto 0
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
207
                    :huffman 1
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
208
                    :uncompressed 2)
7354623e5b54 define-alien-enum, zstd, skel, and pod work
Richard Westhaver <ellis@rwest.io>
parents: 438
diff changeset
209
 
(define-alien-enum (zstd-param-switch int)
                   :auto 0
                   :enable 1
                   :disable 2)
 
(define-alien-enum (zstd-frame-type int)
                   :frame 0
                   :skippable-frame 1)
 
(define-alien-enum (zstd-sequence-format int)
                   :no-block-delimiters 0
                   :explicit-block-delimiters 1)
 
;;; Simple Dictionary API
(define-alien-routine "ZSTD_compress_usingDict" size-t
  (cctx (* zstd-cctx))
  (dst (* t))
  (dst-capacity size-t)
  (src (* t))
  (src-size size-t)
  (dict (* t))
  (dict-size size-t)
  (compression-level int))
 
(define-alien-routine "ZSTD_decompress_usingDict" size-t
  (dctx (* zstd-dctx))
  (dst (* t))
  (dst-capacity size-t)
  (src (* t))
  (src-size size-t)
  (dict (* t))
  (dict-size size-t))
 
;;; Bulk-processing Dictionary API
(define-alien-type zstd-cdict (struct zstd-cdict-s))
 
(define-alien-routine "ZSTD_createCDict" (* zstd-cdict)
  (dict-buffer (* t))
  (dict-size size-t)
  (compression-level int))
 
(define-alien-routine "ZSTD_freeCDict" size-t (cdict (* zstd-cdict)))
 
(define-alien-routine "ZSTD_compress_usingCDict" size-t
  (cctx (* zstd-cctx))
  (dst (* t))
  (dst-capacity size-t)
  (src (* t))
  (src-size size-t)
  (cdict (* zstd-cdict)))
 
(define-alien-type zstd-ddict (struct zstd-ddict-s))
 
(define-alien-routine "ZSTD_createDDict" (* zstd-ddict)
  (dict-buffer (* t))
  (dict-size size-t))
 
(define-alien-routine "ZSTD_freeDDict" size-t (ddict (* zstd-ddict)))
 
(define-alien-routine "ZSTD_decompress_usingDDict" size-t
  (dctx (* zstd-dctx))
  (dst (* t))
  (dst-capacity size-t)
  (src (* t))
  (src-size size-t)
  (ddict (* zstd-ddict)))
 
;; dictionary utils
(define-alien-routine "ZSTD_getDictID_fromDict" unsigned
  (dict (* t))
  (dict-size size-t))
 
(define-alien-routine "ZSTD_getDictID_fromCDict" unsigned
  (cdict (* zstd-cdict)))
 
(define-alien-routine "ZSTD_getDictID_fromDDict" unsigned
  (ddict (* zstd-ddict)))
 
(define-alien-routine "ZSTD_getDictID_fromFrame" unsigned
  (src (* t))
  (src-size size-t))
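
;; Usage sketch (not part of the original bindings): checks whether a
;; compressed frame names the same dictionary as DDICT. The helper name
;; FRAME-USES-DDICT-P is hypothetical; SRC is assumed to be an alien pointer
;; to SRC-SIZE bytes of frame data. A dictionary ID of 0 means the ID could
;; not be determined, so it is treated here as "no match".
#|
(defun frame-uses-ddict-p (src src-size ddict)
  (let ((frame-id (zstd-getdictid-fromframe src src-size)))
    (and (plusp frame-id)
         (= frame-id (zstd-getdictid-fromddict ddict)))))
|#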
 
(define-alien-routine "ZSTD_estimateDDictSize" size-t
  (dict-size size-t)
  (dict-load-method zstd-dict-load-method))
 
(defmacro with-zstd-cdict ((cv &key buffer size (level (zstd-defaultclevel))) &body body)
  `(with-alien ((,cv (* zstd-cdict) (zstd-createcdict (cast (octets-to-alien ,buffer) (* t))
                                                      (or ,size (length ,buffer))
                                                      ,level)))
     (unwind-protect (progn ,@body)
       (zstd-freecdict ,cv))))
 
(defmacro with-zstd-ddict ((dv &key buffer size) &body body)
  `(with-alien ((,dv (* zstd-ddict)
                     (zstd-createddict (cast (octets-to-alien ,buffer) (* t)) (or ,size (length ,buffer)))))
     (unwind-protect (progn ,@body)
       (zstd-freeddict ,dv))))
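
;; Usage sketch (not part of the original source): compresses one buffer
;; with a CDict built from DICT-OCTETS. All free variables are hypothetical:
;; DICT-OCTETS is an octet vector holding the dictionary, CCTX is a
;; (* zstd-cctx) created elsewhere, and DST and SRC are alien pointers with
;; the given sizes. The form returns whatever ZSTD_compress_usingCDict
;; returns (a compressed size or an error code), and the CDict is freed
;; when the body exits.
#|
(with-zstd-cdict (cd :buffer dict-octets)
  (zstd-compress-usingcdict cctx dst dst-capacity src src-size cd))
|#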
 
;;; zdict.h
;; NOTE: the ZPARAMS slot uses ZDICT-PARAMS, which is defined further down
;; in this file.
(define-alien-type zstd-cover-params
    (struct zdict-cover-params
            (k unsigned)
            (d unsigned)
            (steps unsigned)
            (nb-threads unsigned)
            (split-point double)
            (shrink-dict unsigned)
            (shrink-dict-max-regression unsigned)
            (zparams zdict-params)))
 
(define-alien-routine ("ZDICT_trainFromBuffer" zdict-train-from-buffer) size-t
  (dict-buffer (* t))
  (dict-buffer-capacity size-t)
  (samples-buffer (* t))
  (samples-sizes (* size-t))
  (nb-samples unsigned))
 
(define-alien-type zdict-params
  (struct zdict-params-t
          (compression-level int)
          (notification-level unsigned)
          (dict-id unsigned)))
 
;; NOTE: Requires passing a struct by value
 
;; This is the ONLY function which uses libzstd-alien.so right now.
(define-alien-routine ("ZDICT_finalizeDictionaryWithParams" zdict-finalize-dictionary) size-t
  (dst-dict-buffer (* t))
  (max-dict-size size-t)
  (dict-content (* t))
  (dict-content-size size-t)
  (samples-buffer (* t))
  (samples-sizes (* size-t))
  (nb-samples unsigned)
  (parameters (* zdict-params)))
 
(define-alien-routine ("ZDICT_getDictID" zdict-get-dict-id) unsigned
  (dict-buffer (* t))
  (dict-size size-t))
 
(define-alien-routine ("ZDICT_getDictHeaderSize" zdict-get-dict-header-size) size-t
  (dict-buffer (* t))
  (dict-size size-t))
 
(define-alien-routine ("ZDICT_isError" zdict-is-error) unsigned
  (error-code size-t))
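
;; Usage sketch (not part of the original bindings): a hypothetical helper,
;; ZDICT-CHECK, that turns a ZDICT return code into a Lisp error.
;; ZDICT_isError returns nonzero when its argument is an error code, so any
;; other value is passed through unchanged.
#|
(defun zdict-check (code)
  (if (zerop (zdict-is-error code))
      code
      (error "ZDICT call failed with code ~D" code)))
|#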