changeset 698: | 96958d3eb5b0 |
parent: | 642b3b82b20d |
author: | Richard Westhaver <ellis@rwest.io> |
date: | Fri, 04 Oct 2024 22:04:59 -0400 |
permissions: | -rw-r--r-- |
description: | fixes |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
;;; /home/ellis/comp/core/lisp/lib/dat/parquet/thrift.lisp --- Parquet Thrift Definitions -*- buffer-read-only:t -*-

;; input = /home/ellis/comp/core/.stash/parquet.json

;; This file was generated automatically by
;; DAT/PARQUET/GEN:PARSE-PARQUET-THRIFT-DEFINITIONS

;; Do not modify.

;;; Code:
(in-package :dat/parquet)

(defvar *parquet-json-types*
  '(:boolean :int32 :int64 :int96 :float :double :byte-array
    :fixed-len-byte-array))
(defvar *parquet-json-converted-types*
  '(:utf8 :map :map-key-value :list :enum :decimal :date :time-millis
    :time-micros :timestamp-millis :timestamp-micros :uint-8 :uint-16 :uint-32
    :uint-64 :int-8 :int-16 :int-32 :int-64 :json :bson :interval))
(defvar *parquet-json-field-repetition-types* '(:required :optional :repeated))
(defvar *parquet-json-encodings*
  '(:plain :plain-dictionary :rle :bit-packed :delta-binary-packed
    :delta-length-byte-array :delta-byte-array :rle-dictionary
    :byte-stream-split))
(defvar *parquet-json-compression-codecs*
  '(:uncompressed :snappy :gzip :lzo :brotli :lz4 :zstd :lz4-raw))
(defvar *parquet-json-page-types*
  '(:data-page :index-page :dictionary-page :data-page-v2))
(defvar *parquet-json-boundary-orders* '(:unordered :ascending :descending))
(deftype parquet-boolean () 'boolean)
(deftype parquet-int32 () '(signed-byte 32))
(deftype parquet-int64 () '(signed-byte 64))
(deftype parquet-int96 () '(signed-byte 96))
(deftype parquet-float () 'float)
(deftype parquet-double () 'double-float)
(deftype parquet-byte-array (&optional dat/parquet/gen::size)
  `(octet-vector ,dat/parquet/gen::size))
(deftype parquet-fixed-len-byte-array (dat/parquet/gen::size)
  `(octet-vector ,dat/parquet/gen::size))
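;; Illustrative sketch only, not emitted by the generator: the integer
;; deftypes above are plain range types, so TYPEP can bounds-check a value
;; before it is written. EXAMPLE-FITS-INT32-P is a hypothetical helper name
;; and is not part of DAT/PARQUET.
(defun example-fits-int32-p (value)
  "Return true when VALUE fits the PARQUET-INT32 range, i.e. (signed-byte 32)."
  (typep value '(signed-byte 32)))

;; Possible usage:
;; (example-fits-int32-p (1- (expt 2 31))) ; => T
;; (example-fits-int32-p (expt 2 31))      ; => NIL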
(defclass parquet-size-statistics (parquet-object)
  ((unencoded-byte-array-data-bytes :documentation
    "The number of physical bytes stored for BYTE_ARRAY data values assuming
no encoding. This is exclusive of the bytes needed to store the length of
each byte array. In other words, this field is equivalent to the `(size
of PLAIN-ENCODING the byte array values) - (4 bytes * number of values
written)`. To determine unencoded sizes of other types readers can use
schema information multiplied by the number of non-null and null values.
The number of null/non-null values can be inferred from the histograms
below.

For example, if a column chunk is dictionary-encoded with dictionary
[\"a\", \"bc\", \"cde\"], and a data page contains the indices [0, 0, 1, 2],
then this value for that data page should be 7 (1 + 1 + 2 + 3).

This field should only be set for types that use BYTE_ARRAY as their
physical type.
"
    :initarg :unencoded-byte-array-data-bytes :initform nil :type
    (or null (signed-byte 64)))
   (repetition-level-histogram :documentation
    "When present, there is expected to be one element corresponding to each
repetition (i.e. size=max repetition_level+1) where each element
represents the number of times the repetition level was observed in the
data.

This field may be omitted if max_repetition_level is 0 without loss
of information.

"
    :initarg :repetition-level-histogram :initform nil :type
    (or null (vector (signed-byte 64))))
   (definition-level-histogram :documentation
    "Same as repetition_level_histogram except for definition levels.

This field may be omitted if max_definition_level is 0 or 1 without
loss of information.

"
    :initarg :definition-level-histogram :initform nil :type
    (or null (vector (signed-byte 64)))))
  (:documentation
   "A structure for capturing metadata for estimating the unencoded,
uncompressed size of data written. This is useful for readers to estimate
how much memory is needed to reconstruct data in their memory model and for
fine grained filter pushdown on nested structures (the histograms contained
in this structure can help determine the number of nulls at a particular
nesting level and maximum length of lists).
"))
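;; Illustrative sketch only, not emitted by the generator: the worked example
;; from the UNENCODED-BYTE-ARRAY-DATA-BYTES docstring, expressed as code.
;; EXAMPLE-UNENCODED-BYTE-ARRAY-BYTES is a hypothetical helper, not part of
;; DAT/PARQUET. It assumes DICTIONARY is a vector of sequences and uses LENGTH,
;; which equals the byte count only for octet vectors or ASCII-only strings.
(defun example-unencoded-byte-array-bytes (dictionary indices)
  "Sum the lengths of the DICTIONARY entries referenced by the list INDICES."
  (reduce #'+ indices
          :key (lambda (index) (length (aref dictionary index)))))

;; Possible usage, mirroring the docstring (1 + 1 + 2 + 3):
;; (example-unencoded-byte-array-bytes #("a" "bc" "cde") '(0 0 1 2)) ; => 7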
(defclass parquet-statistics (parquet-object)
  ((max :documentation
    "DEPRECATED: min and max value of the column. Use min_value and max_value.

Values are encoded using PLAIN encoding, except that variable-length byte
arrays do not include a length prefix.

These fields encode min and max values determined by signed comparison
only. New files should use the correct order for a column's logical type
and store the values in the min_value and max_value fields.

To support older readers, these may be set when the column order is
signed.
"
    :initarg :max :initform nil :type (or null octet-vector))
   (min :initarg :min :initform nil :type (or null octet-vector))
   (null-count :documentation "count of null values in the column
"
    :initarg :null-count :initform nil :type
    (or null (signed-byte 64)))
   (distinct-count :documentation "count of distinct values occurring
"
    :initarg :distinct-count :initform nil :type
    (or null (signed-byte 64)))
   (max-value :documentation
    "Lower and upper bound values for the column, determined by its ColumnOrder.

These may be the actual minimum and maximum values found on a page or column
chunk, but can also be (more compact) values that do not exist on a page or
column chunk. For example, instead of storing \"Blart Versenwald III\", a writer
may set min_value=\"B\", max_value=\"C\". Such more compact values must still be
valid values within the column's logical type.

Values are encoded using PLAIN encoding, except that variable-length byte
arrays do not include a length prefix.
"
    :initarg :max-value :initform nil :type (or null octet-vector))
   (min-value :initarg :min-value :initform nil :type
    (or null octet-vector))
   (is-max-value-exact :documentation
    "If true, max_value is the actual maximum value for a column
"
    :initarg :is-max-value-exact :initform nil :type (or null boolean))
   (is-min-value-exact :documentation
    "If true, min_value is the actual minimum value for a column
"
    :initarg :is-min-value-exact :initform nil :type
    (or null boolean)))
  (:documentation "Statistics per row group and per page
All fields are optional.
"))
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
140 | (defclass parquet-string-type (parquet-object) nil |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
141 | (:documentation "Empty structs to use as logical type annotations |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
142 | ")) |
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
143 | (defclass parquet-uuid-type (parquet-object) nil) |
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
144 | (defclass parquet-map-type (parquet-object) nil) |
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
145 | (defclass parquet-list-type (parquet-object) nil) |
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
146 | (defclass parquet-enum-type (parquet-object) nil) |
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
147 | (defclass parquet-date-type (parquet-object) nil) |
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
148 | (defclass parquet-float16-type (parquet-object) nil) |
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
149 | (defclass parquet-null-type (parquet-object) nil |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
150 | (:documentation |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
151 | "Logical type to annotate a column that is always null. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
152 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
153 | Sometimes when discovering the schema of existing data, values are always |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
154 | null and the physical type can't be determined. This annotation signals |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
155 | the case where the physical type was guessed from all null values. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
156 | ")) |
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
157 | (defclass parquet-decimal-type (parquet-object) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
  ((scale :initarg :scale :type (signed-byte 32))
   (precision :initarg :precision :type (signed-byte 32)))
  (:documentation "Decimal logical type annotation

Scale must be zero or a positive integer less than or equal to the precision.
Precision must be a non-zero positive integer.

To maintain forward-compatibility in v1, implementations using this logical
type must also set scale and precision on the annotated SchemaElement.

Allowed for physical types: INT32, INT64, FIXED_LEN_BYTE_ARRAY, and BYTE_ARRAY.
"))
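;; The scale/precision constraints stated in the docstring above can be
;; checked with a small predicate. This is a minimal sketch only:
;; VALID-DECIMAL-P is a hypothetical helper, not something emitted by the
;; generator, and it takes plain integers rather than a PARQUET-DECIMAL-TYPE
;; instance.

```lisp
(defun valid-decimal-p (scale precision)
  "Return T when PRECISION is a positive integer and 0 <= SCALE <= PRECISION."
  (and (integerp scale)
       (integerp precision)
       (plusp precision)           ; precision must be non-zero and positive
       (<= 0 scale precision)))    ; scale is zero or positive, at most precision
```

;; For instance, (valid-decimal-p 2 10) holds, while (valid-decimal-p 11 10)
;; and (valid-decimal-p 0 0) do not.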
(defclass parquet-milli-seconds (parquet-object) nil
  (:documentation "Time units for logical types
"))
(defclass parquet-micro-seconds (parquet-object) nil)
(defclass parquet-nano-seconds (parquet-object) nil)
(defclass parquet-time-unit (parquet-object)
  ((millis :initarg :millis :initform nil :type
           (or null parquet-milli-seconds))
   (micros :initarg :micros :initform nil :type
           (or null parquet-micro-seconds))
   (nanos :initarg :nanos :initform nil :type
          (or null parquet-nano-seconds))))
(defclass parquet-timestamp-type (parquet-object)
  ((isadjustedtoutc :initarg :isadjustedtoutc :type boolean)
   (unit :initarg :unit :type parquet-time-unit))
  (:documentation "Timestamp logical type annotation

Allowed for physical types: INT64
"))
(defclass parquet-time-type (parquet-object)
  ((isadjustedtoutc :initarg :isadjustedtoutc :type boolean)
   (unit :initarg :unit :type parquet-time-unit))
  (:documentation "Time logical type annotation

Allowed for physical types: INT32 (millis), INT64 (micros, nanos)
"))
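;; The unit-to-physical-type pairing in the docstring above can be written as
;; a lookup. A minimal sketch, assuming units are represented by the keywords
;; :MILLIS, :MICROS and :NANOS (matching the PARQUET-TIME-UNIT initargs) and
;; physical types by :INT32 and :INT64. TIME-PHYSICAL-TYPE is a hypothetical
;; helper and is not produced by the generator.

```lisp
(defun time-physical-type (unit)
  "Return the physical type keyword allowed for a TIME annotation with UNIT."
  (ecase unit
    (:millis :int32)              ; millisecond times fit in INT32
    ((:micros :nanos) :int64)))   ; finer units require INT64
```

;; ECASE signals an error for any other unit, so an unsupported value is not
;; silently accepted.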
(defclass parquet-int-type (parquet-object)
  ((bitwidth :initarg :bitwidth)
   (issigned :initarg :issigned :type boolean))
  (:documentation "Integer logical type annotation

bitWidth must be 8, 16, 32, or 64.

Allowed for physical types: INT32, INT64
"))
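;; The BITWIDTH slot above carries no :type, so the restriction in the
;; docstring is not enforced by the class itself. A minimal sketch of an
;; explicit check follows; VALID-BIT-WIDTH-P is a hypothetical helper, not
;; part of the generated definitions.

```lisp
(defun valid-bit-width-p (bit-width)
  "Return T when BIT-WIDTH is one of the widths permitted for the integer annotation."
  (and (member bit-width '(8 16 32 64)) t))
```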
(defclass parquet-json-type (parquet-object) nil
  (:documentation "Embedded JSON logical type annotation

Allowed for physical types: BYTE_ARRAY
"))
(defclass parquet-bson-type (parquet-object) nil
  (:documentation "Embedded BSON logical type annotation

Allowed for physical types: BYTE_ARRAY
"))
(defclass parquet-logical-type (parquet-object)
  ((string :initarg :string :initform nil :type
           (or null parquet-string-type))
   (map :initarg :map :initform nil :type (or null parquet-map-type))
   (list :initarg :list :initform nil :type
         (or null parquet-list-type))
   (enum :initarg :enum :initform nil :type
         (or null parquet-enum-type))
   (decimal :initarg :decimal :initform nil :type
            (or null parquet-decimal-type))
   (date :initarg :date :initform nil :type
         (or null parquet-date-type))
   (time :initarg :time :initform nil :type
         (or null parquet-time-type))
   (timestamp :initarg :timestamp :initform nil :type
              (or null parquet-timestamp-type))
   (integer :initarg :integer :initform nil :type
            (or null parquet-int-type))
   (unknown :initarg :unknown :initform nil :type
            (or null parquet-null-type))
   (json :initarg :json :initform nil :type
         (or null parquet-json-type))
   (bson :initarg :bson :initform nil :type
         (or null parquet-bson-type))
   (uuid :initarg :uuid :initform nil :type
         (or null parquet-uuid-type))
   (float16 :initarg :float16 :initform nil :type
            (or null parquet-float16-type)))
  (:documentation "LogicalType annotations to replace ConvertedType.

To maintain compatibility, implementations using LogicalType for a
SchemaElement must also set the corresponding ConvertedType (if any)
from the following table.
"))
(defclass parquet-schema-element (parquet-object)
  ((type :documentation
         "Data type for this field. Not set if the current element is a non-leaf node.
"
         :initarg :type :initform nil :type (or null parquet-type))
   (type-length :documentation
                "If type is FIXED_LEN_BYTE_ARRAY, this is the byte length of the values.
Otherwise, if specified, this is the maximum bit length to store any of the values
(e.g. a low cardinality INT col could have this set to 3). Note that this is
in the schema, and therefore fixed for the entire file.
"
                :initarg :type-length :initform nil :type
                (or null (signed-byte 32)))
   (repetition-type :documentation
                    "Repetition of the field. The root of the schema does not have a repetition_type.
All other nodes must have one.
"
                    :initarg :repetition-type :initform nil :type
                    (or null parquet-field-repetition-type))
   (name :documentation "Name of the field in the schema.
"
         :initarg :name :type string)
   (num-children :documentation
                 "Nested fields. Since thrift does not support nested fields,
the nesting is flattened to a single list by a depth-first traversal.
The children count is used to construct the nested relationship.
This field is not set when the element is a primitive type.
"
                 :initarg :num-children :initform nil :type
                 (or null (signed-byte 32)))
   (converted-type :documentation
                   "DEPRECATED: When the schema is the result of a conversion from another model.
Used to record the original type to help with cross conversion.

This is superseded by logicalType.
"
                   :initarg :converted-type :initform nil :type
                   (or null parquet-converted-type))
   (scale :documentation
          "DEPRECATED: Used when this column contains decimal data.
See the DECIMAL converted type for more details.

This is superseded by using the DecimalType annotation in logicalType.
"
          :initarg :scale :initform nil :type (or null (signed-byte 32)))
   (precision :initarg :precision :initform nil :type
              (or null (signed-byte 32)))
   (field-id :documentation
             "When the original schema supports field ids, this will save the
original field id in the parquet schema.
"
             :initarg :field-id :initform nil :type (or null (signed-byte 32)))
   (logicaltype :documentation "The logical type of this SchemaElement.

LogicalType replaces ConvertedType, but ConvertedType is still required
for some logical types to ensure forward-compatibility in format v1.
"
                :initarg :logicaltype :initform nil :type
                (or null parquet-logical-type)))
  (:documentation "Represents an element inside a schema definition.
 - if it is a group (inner node) then type is undefined and num_children is defined
 - if it is a primitive type (leaf) then type is defined and num_children is undefined
The nodes are listed in depth-first traversal order.
"))
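;; The NUM-CHILDREN docstring above says the nested schema is flattened
;; depth-first and that the child counts are used to rebuild the nesting.
;; A minimal sketch of that reconstruction, under a simplifying assumption:
;; each flat element is a plain (NAME NUM-CHILDREN) list rather than a
;; PARQUET-SCHEMA-ELEMENT instance, with NUM-CHILDREN being NIL for a leaf.
;; BUILD-SCHEMA-TREE is a hypothetical helper, not part of the generated code.

```lisp
(defun build-schema-tree (elements)
  "Rebuild a nested tree from ELEMENTS, a depth-first flattened schema list.
Each node of the result has the form (NAME CHILD*)."
  (let ((remaining elements))
    (labels ((next-node ()
               ;; Consume one element, then recursively consume exactly
               ;; NUM-CHILDREN subtrees that follow it in the flat list.
               (destructuring-bind (name num-children) (pop remaining)
                 (cons name
                       (loop repeat (or num-children 0)
                             collect (next-node))))))
      (next-node))))
```

;; A group "root" with a leaf "a" and a group "b" holding a leaf "c" would be
;; passed as '(("root" 2) ("a" nil) ("b" 1) ("c" nil)). A list that ends
;; before every child count is satisfied signals an error from
;; DESTRUCTURING-BIND rather than returning a truncated tree.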
(defclass parquet-data-page-header (parquet-object)
  ((num-values :documentation
               "Number of values, including NULLs, in this data page.

If an OffsetIndex is present, a page must begin at a row
boundary (repetition_level = 0). Otherwise, pages may begin
within a row (repetition_level > 0).
"
               :initarg :num-values :type (signed-byte 32))
   (encoding :documentation "Encoding used for this data page
"
             :initarg :encoding :type parquet-encoding)
   (definition-level-encoding :documentation
                              "Encoding used for definition levels
"
                              :initarg :definition-level-encoding :type parquet-encoding)
   (repetition-level-encoding :documentation
                              "Encoding used for repetition levels
"
                              :initarg :repetition-level-encoding :type parquet-encoding)
   (statistics :documentation
               "Optional statistics for the data in this page
"
               :initarg :statistics :initform nil :type
               (or null parquet-statistics)))
  (:documentation "Data page header
"))
(defclass parquet-index-page-header (parquet-object) nil)
(defclass parquet-dictionary-page-header (parquet-object)
  ((num-values :documentation "Number of values in the dictionary
"
               :initarg :num-values :type (signed-byte 32))
   (encoding :documentation "Encoding using this dictionary page
"
             :initarg :encoding :type parquet-encoding)
   (is-sorted :documentation
              "If true, the entries in the dictionary are sorted in ascending order
"
              :initarg :is-sorted :initform nil :type (or null boolean)))
parents:
diff
changeset
|
358 | (:documentation |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
359 | "The dictionary page must be placed at the first position of the column chunk |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
360 | if it is partly or completely dictionary encoded. At most one dictionary page |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
361 | can be placed in a column chunk. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
362 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
363 | ")) |
(defclass parquet-data-page-header-v2 (parquet-object)
  ((num-values :documentation
    "Number of values, including NULLs, in this data page.
"
    :initarg :num-values :type (signed-byte 32))
   (num-nulls :documentation "Number of NULL values in this data page.
Number of non-null values = num_values - num_nulls, which is also the number of values in the data section
"
    :initarg :num-nulls :type (signed-byte 32))
   (num-rows :documentation
    "Number of rows in this data page. Every page must begin at a
row boundary (repetition_level = 0): rows must **not** be
split across page boundaries when using V2 data pages.

"
    :initarg :num-rows :type (signed-byte 32))
   (encoding :documentation "Encoding used for data in this page
"
    :initarg :encoding :type parquet-encoding)
   (definition-levels-byte-length :documentation
    "Length of the definition levels
"
    :initarg :definition-levels-byte-length :type (signed-byte 32))
   (repetition-levels-byte-length :documentation
    "Length of the repetition levels
"
    :initarg :repetition-levels-byte-length :type (signed-byte 32))
   (is-compressed :documentation "Whether the values are compressed.
This means the section of the page between
definition_levels_byte_length + repetition_levels_byte_length + 1 and compressed_page_size (included)
is compressed with the compression_codec.
If missing, it is considered compressed
"
    :initarg :is-compressed :initform nil :type (or null boolean))
   (statistics :documentation
    "Optional statistics for the data in this page
"
    :initarg :statistics :initform nil :type
    (or null parquet-statistics)))
  (:documentation
   "New page format allowing reading levels without decompressing the data
Repetition and definition levels are uncompressed
The remaining section containing the data is compressed if is_compressed is true

"))
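;; Illustrative sketch, not part of the generated definitions: how a reader
;; might apply the V2 header fields documented above. It assumes the page
;; body (the bytes following the page header) is laid out as repetition
;; levels, then definition levels, then the data section. The function names
;; and arguments are hypothetical.
(defun v2-non-null-count (num-values num-nulls)
  "Number of values actually stored in the data section."
  (- num-values num-nulls))

(defun split-v2-page-body (body rep-len def-len)
  "Return three values: repetition levels, definition levels, data section.
BODY is a vector of octets; REP-LEN and DEF-LEN are the two byte lengths
from the header. Only the data section is subject to is_compressed."
  (let ((levels-end (+ rep-len def-len)))
    (values (subseq body 0 rep-len)
            (subseq body rep-len levels-end)
            (subseq body levels-end))))
;; Usage: (split-v2-page-body #(1 2 3 4 5 6) 2 1) yields the three vectors
;; #(1 2), #(3) and #(4 5 6); the levels can be read without touching the
;; (possibly compressed) data section.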
(defclass parquet-split-block-algorithm (parquet-object) nil
  (:documentation "Block-based algorithm type annotation.
"))
(defclass parquet-bloom-filter-algorithm (parquet-object)
  ((block :documentation
     "Block-based Bloom filter.
"
     :initarg :block :initform nil :type
     (or null parquet-split-block-algorithm)))
  (:documentation "The algorithm used in Bloom filter.
"))
(defclass parquet-xx-hash (parquet-object) nil
  (:documentation
   "Hash strategy type annotation. xxHash is an extremely fast non-cryptographic hash
algorithm. It uses the 64-bit version of xxHash.

"))
(defclass parquet-bloom-filter-hash (parquet-object)
  ((xxhash :documentation "xxHash Strategy.
"
    :initarg :xxhash :initform nil :type (or null parquet-xx-hash)))
  (:documentation
   "The hash function used in Bloom filter. This function takes the hash of a column value
using plain encoding.

"))
(defclass parquet-uncompressed (parquet-object) nil
  (:documentation "The compression used in the Bloom filter.

"))
(defclass parquet-bloom-filter-compression (parquet-object)
  ((uncompressed :initarg :uncompressed :initform nil :type
    (or null parquet-uncompressed))))
(defclass parquet-bloom-filter-header (parquet-object)
  ((numbytes :documentation "The size of bitset in bytes
"
    :initarg :numbytes :type (signed-byte 32))
   (algorithm :documentation "The algorithm for setting bits.
"
    :initarg :algorithm :type parquet-bloom-filter-algorithm)
   (hash :documentation "The hash function used for Bloom filter.
"
    :initarg :hash :type parquet-bloom-filter-hash)
   (compression :documentation
    "The compression used in the Bloom filter
"
    :initarg :compression :type parquet-bloom-filter-compression))
  (:documentation
   "The Bloom filter header is stored at the beginning of the Bloom filter data of each column
and is followed by its bitset.

"))
(defclass parquet-page-header (parquet-object)
  ((type :documentation
    "the type of the page: indicates which of the *_header fields is set
"
    :initarg :type :type parquet-page-type)
   (uncompressed-page-size :documentation
    "Uncompressed page size in bytes (not including this header)
"
    :initarg :uncompressed-page-size :type (signed-byte 32))
   (compressed-page-size :documentation
    "Compressed (and potentially encrypted) page size in bytes, not including this header
"
    :initarg :compressed-page-size :type (signed-byte 32))
   (crc :documentation
    "The 32-bit CRC checksum for the page, to be calculated as follows:

- The standard CRC32 algorithm is used (with polynomial 0x04C11DB7,
  the same as in e.g. GZip).
- All page types can have a CRC (v1 and v2 data pages, dictionary pages,
  etc.).
- The CRC is computed on the serialized binary representation of the page
  (as written to disk), excluding the page header. For example, for v1
  data pages, the CRC is computed on the concatenation of repetition levels,
  definition levels and column values (optionally compressed, optionally
  encrypted).
- The CRC computation therefore takes place after any compression
  and encryption steps, if any.

If enabled, this allows for disabling checksumming in HDFS if only a few
pages need to be read.
"
    :initarg :crc :initform nil :type (or null (signed-byte 32)))
   (data-page-header :initarg :data-page-header :initform nil :type
    (or null parquet-data-page-header))
   (index-page-header :initarg :index-page-header :initform nil :type
    (or null parquet-index-page-header))
   (dictionary-page-header :initarg :dictionary-page-header :initform
    nil :type (or null parquet-dictionary-page-header))
   (data-page-header-v2 :initarg :data-page-header-v2 :initform nil
    :type (or null parquet-data-page-header-v2))))
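;; Illustrative sketch, not part of the generated definitions: a bitwise
;; version of the gzip-style CRC-32 that the crc slot documentation describes.
;; It processes bits least-significant first, so it uses #xEDB88320, the
;; bit-reversed form of polynomial #x04C11DB7. Both function names are
;; hypothetical, and a table-driven implementation would be faster.
(defun crc32-sketch (octets)
  "Return the CRC-32 of OCTETS, a sequence of integers in 0..255, as an
unsigned 32-bit integer."
  (let ((crc #xFFFFFFFF))
    (map nil
         (lambda (octet)
           (setf crc (logxor crc octet))
           (loop repeat 8
                 do (setf crc (if (logbitp 0 crc)
                                  (logxor (ash crc -1) #xEDB88320)
                                  (ash crc -1)))))
         octets)
    (logxor crc #xFFFFFFFF)))

(defun crc32-as-signed (crc)
  "Reinterpret an unsigned 32-bit CRC as (signed-byte 32), the slot's type."
  (if (logbitp 31 crc)
      (- crc (ash 1 32))
      crc))
;; Usage: (crc32-sketch (map 'vector #'char-code "123456789")) returns
;; #xCBF43926, the standard CRC-32 check value. Per the docstring above, the
;; input would be the page bytes as written to disk, excluding the header.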
(defclass parquet-key-value (parquet-object)
  ((key :initarg :key :type string)
   (value :initarg :value :initform nil :type (or null string)))
  (:documentation "Wrapper struct to store key values
"))
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
(defclass parquet-sorting-column (parquet-object)
  ((column-idx :documentation
    "The ordinal position of the column (in this row group)
"
    :initarg :column-idx :type (signed-byte 32))
   (descending :documentation
    "If true, indicates this column is sorted in descending order.
"
    :initarg :descending :type boolean)
   (nulls-first :documentation
    "If true, nulls will come before non-null values, otherwise,
nulls go at the end.
"
    :initarg :nulls-first :type boolean))
  (:documentation "Sort order within a RowGroup of a leaf column
"))
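;; Illustrative sketch, not part of the generated output: building a
;; PARQUET-SORTING-COLUMN by hand. The slots above declare no accessors,
;; so SLOT-VALUE is used to read them back. The function name
;; EXAMPLE-SORTING-COLUMN is hypothetical, and this assumes PARQUET-OBJECT
;; (defined elsewhere) imposes no extra required initargs.
(defun example-sorting-column ()
  "Describe an ascending sort on the first column, with nulls placed last."
  (let ((col (make-instance 'parquet-sorting-column
                            :column-idx 0
                            :descending nil
                            :nulls-first nil)))
    ;; Return the object and, for inspection, its ordinal position.
    (values col (slot-value col 'column-idx))))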
(defclass parquet-page-encoding-stats (parquet-object)
  ((page-type :documentation "the page type (data/dic/...)
"
    :initarg :page-type :type parquet-page-type)
   (encoding :documentation "encoding of the page
"
    :initarg :encoding :type parquet-encoding)
   (count :documentation
    "number of pages of this type with this encoding
"
    :initarg :count :type (signed-byte 32)))
  (:documentation "statistics of a given page type and encoding
"))
(defclass parquet-column-meta-data (parquet-object)
  ((type :documentation "Type of this column
"
    :initarg :type :type parquet-type)
   (encodings :documentation
    "Set of all encodings used for this column. The purpose is to validate
whether we can decode those pages.
"
    :initarg :encodings :type (vector parquet-encoding))
   (path-in-schema :documentation "Path in schema
"
    :initarg :path-in-schema :type (vector string))
   (codec :documentation "Compression codec
"
    :initarg :codec :type parquet-compression-codec)
   (num-values :documentation "Number of values in this column
"
    :initarg :num-values :type (signed-byte 64))
   (total-uncompressed-size :documentation
    "total byte size of all uncompressed pages in this column chunk (including the headers)
"
    :initarg :total-uncompressed-size :type (signed-byte 64))
   (total-compressed-size :documentation
    "total byte size of all compressed, and potentially encrypted, pages
in this column chunk (including the headers)
"
    :initarg :total-compressed-size :type (signed-byte 64))
   (key-value-metadata :documentation "Optional key/value metadata
"
    :initarg :key-value-metadata :initform nil :type
    (or null (vector parquet-key-value)))
   (data-page-offset :documentation
    "Byte offset from beginning of file to first data page
"
    :initarg :data-page-offset :type (signed-byte 64))
   (index-page-offset :documentation
    "Byte offset from beginning of file to root index page
"
    :initarg :index-page-offset :initform nil :type
    (or null (signed-byte 64)))
   (dictionary-page-offset :documentation
    "Byte offset from the beginning of file to first (only) dictionary page
"
    :initarg :dictionary-page-offset :initform nil :type
    (or null (signed-byte 64)))
   (statistics :documentation "optional statistics for this column chunk
"
    :initarg :statistics :initform nil :type
    (or null parquet-statistics))
   (encoding-stats :documentation
    "Set of all encodings used for pages in this column chunk.
This information can be used to determine if all data pages are
dictionary encoded for example
"
    :initarg :encoding-stats :initform nil :type
    (or null (vector parquet-page-encoding-stats)))
   (bloom-filter-offset :documentation
    "Byte offset from beginning of file to Bloom filter data.
"
    :initarg :bloom-filter-offset :initform nil :type
    (or null (signed-byte 64)))
   (bloom-filter-length :documentation
    "Size of Bloom filter data including the serialized header, in bytes.
Added in 2.10 so readers may not read this field from old files and
it can be obtained after the BloomFilterHeader has been deserialized.
Writers should write this field so readers can read the bloom filter
in a single I/O.
"
    :initarg :bloom-filter-length :initform nil :type
    (or null (signed-byte 32)))
   (size-statistics :documentation
    "Optional statistics to help estimate total memory when converted to in-memory
representations. The histograms contained in these statistics can
also be useful in some cases for more fine-grained nullability/list length
filter pushdown.
"
    :initarg :size-statistics :initform nil :type
    (or null parquet-size-statistics)))
  (:documentation "Description for column metadata
"))
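;; Illustrative sketch, not part of the generated output: the two size
;; slots documented above can be combined into a compression ratio.
;; COLUMN-COMPRESSION-RATIO is a hypothetical helper name. Both slots
;; have no :initform, so this assumes they were supplied when the
;; object was created; reading an unbound slot signals an error.
(defun column-compression-ratio (meta)
  "Return uncompressed/compressed size for META as a rational, or NIL
when the compressed size is not positive."
  (let ((uncompressed (slot-value meta 'total-uncompressed-size))
        (compressed (slot-value meta 'total-compressed-size)))
    (when (plusp compressed)
      (/ uncompressed compressed))))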
(defclass parquet-encryption-with-footer-key (parquet-object) nil)
(defclass parquet-encryption-with-column-key (parquet-object)
  ((path-in-schema :documentation "Column path in schema
"
    :initarg :path-in-schema :type (vector string))
   (key-metadata :documentation
    "Retrieval metadata of column encryption key
"
    :initarg :key-metadata :initform nil :type (or null octet-vector))))
(defclass parquet-column-crypto-meta-data (parquet-object)
  ((encryption-with-footer-key :initarg :encryption-with-footer-key
    :initform nil :type (or null parquet-encryption-with-footer-key))
   (encryption-with-column-key :initarg :encryption-with-column-key
    :initform nil :type (or null parquet-encryption-with-column-key))))
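;; Illustrative sketch, not part of the generated output: both slots of
;; PARQUET-COLUMN-CRYPTO-META-DATA default to NIL, and in the Parquet
;; Thrift definition this struct is a union, so at most one of them is
;; expected to be set. COLUMN-ENCRYPTION-KIND is a hypothetical helper
;; that reports which one is present.
(defun column-encryption-kind (crypto)
  "Return :FOOTER-KEY, :COLUMN-KEY, or NIL depending on which slot of
CRYPTO is populated."
  (cond ((slot-value crypto 'encryption-with-footer-key) :footer-key)
        ((slot-value crypto 'encryption-with-column-key) :column-key)
        (t nil)))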
(defclass parquet-column-chunk (parquet-object)
  ((file-path :documentation
    "File where column data is stored. If not set, assumed to be same file as
metadata. This path is relative to the current file.

"
    :initarg :file-path :initform nil :type (or null string))
   (file-offset :documentation
    "Deprecated: Byte offset in file_path to the ColumnMetaData

Past use of this field has been inconsistent, with some implementations
using it to point to the ColumnMetaData and some using it to point to
the first page in the column chunk. In many cases, the ColumnMetaData at this
location is wrong. This field is now deprecated and should not be used.
Writers should set this field to 0 if no ColumnMetaData has been written outside
the footer.
"
    :initarg :file-offset :type (signed-byte 64))
   (meta-data :documentation
    "Column metadata for this chunk. Some writers may also replicate this at the
location pointed to by file_path/file_offset.
Note: while marked as optional, this field is in fact required by most major
Parquet implementations. As such, writers MUST populate this field.

"
    :initarg :meta-data :initform nil :type
    (or null parquet-column-meta-data))
   (offset-index-offset :documentation
    "File offset of ColumnChunk's OffsetIndex
"
    :initarg :offset-index-offset :initform nil :type
    (or null (signed-byte 64)))
   (offset-index-length :documentation
    "Size of ColumnChunk's OffsetIndex, in bytes
"
    :initarg :offset-index-length :initform nil :type
    (or null (signed-byte 32)))
   (column-index-offset :documentation
    "File offset of ColumnChunk's ColumnIndex
"
    :initarg :column-index-offset :initform nil :type
    (or null (signed-byte 64)))
   (column-index-length :documentation
    "Size of ColumnChunk's ColumnIndex, in bytes
"
    :initarg :column-index-length :initform nil :type
    (or null (signed-byte 32)))
   (crypto-metadata :documentation
    "Crypto metadata of encrypted columns
"
    :initarg :crypto-metadata :initform nil :type
    (or null parquet-column-crypto-meta-data))
   (encrypted-column-metadata :documentation
    "Encrypted column metadata for this chunk
"
    :initarg :encrypted-column-metadata :initform nil :type
    (or null octet-vector))))
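;; Illustrative sketch, not part of the generated output: the
;; offset/length slot pairs above describe byte ranges in the file.
;; OFFSET-INDEX-BYTE-RANGE is a hypothetical helper that turns the
;; OffsetIndex pair into a start and an exclusive end position. Both
;; slots default to NIL, so a chunk without an OffsetIndex yields NIL.
(defun offset-index-byte-range (chunk)
  "Return (VALUES START END) for the OffsetIndex of CHUNK, or NIL when
either the offset or the length is missing."
  (let ((offset (slot-value chunk 'offset-index-offset))
        (length (slot-value chunk 'offset-index-length)))
    (when (and offset length)
      (values offset (+ offset length)))))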
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
690 | (defclass parquet-row-group (parquet-object) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
691 | ((columns :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
692 | "Metadata for each column chunk in this row group. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
693 | This list must have the same order as the SchemaElement list in FileMetaData. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
694 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
695 | " |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
696 | :initarg :columns :type (vector parquet-column-chunk)) |
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
697 | (total-byte-size :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
698 | "Total byte size of all the uncompressed column data in this row group |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
699 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
700 | :initarg :total-byte-size :type (signed-byte 64)) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
701 | (num-rows :documentation "Number of rows in this row group |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
702 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
703 | :initarg :num-rows :type (signed-byte 64)) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
704 | (sorting-columns :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
705 | "If set, specifies a sort ordering of the rows in this RowGroup. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
706 | The sorting columns can be a subset of all the columns. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
707 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
708 | :initarg :sorting-columns :initform nil :type |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
709 | (or null (vector parquet-sorting-column))) |
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
710 | (file-offset :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
711 | "Byte offset from beginning of file to first page (data or dictionary) |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
712 | in this row group |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
713 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
714 | :initarg :file-offset :initform nil :type |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
715 | (or null (signed-byte 64))) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
716 | (total-compressed-size :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
717 | "Total byte size of all compressed (and potentially encrypted) column data |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
718 | in this row group |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
719 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
720 | :initarg :total-compressed-size :initform nil :type |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
721 | (or null (signed-byte 64))) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
722 | (ordinal :documentation "Row group ordinal in the file |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
723 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
724 | :initarg :ordinal :initform nil :type (or null (signed-byte 16))))) |
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
725 | (defclass parquet-type-defined-order (parquet-object) nil |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
726 | (:documentation |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
727 | "Empty struct to signal the order defined by the physical or logical type |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
728 | ")) |
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
729 | (defclass parquet-column-order (parquet-object) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
730 | ((type-order :documentation "The sort orders for logical types are: |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
731 | UTF8 - unsigned byte-wise comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
732 | INT8 - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
733 | INT16 - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
734 | INT32 - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
735 | INT64 - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
736 | UINT8 - unsigned comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
737 | UINT16 - unsigned comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
738 | UINT32 - unsigned comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
739 | UINT64 - unsigned comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
740 | DECIMAL - signed comparison of the represented value |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
741 | DATE - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
742 | TIME_MILLIS - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
743 | TIME_MICROS - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
744 | TIMESTAMP_MILLIS - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
745 | TIMESTAMP_MICROS - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
746 | INTERVAL - undefined |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
747 | JSON - unsigned byte-wise comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
748 | BSON - unsigned byte-wise comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
749 | ENUM - unsigned byte-wise comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
750 | LIST - undefined |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
751 | MAP - undefined |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
752 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
753 | In the absence of logical types, the sort order is determined by the physical type: |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
754 | BOOLEAN - false, true |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
755 | INT32 - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
756 | INT64 - signed comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
757 | INT96 (only used for legacy timestamps) - undefined |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
758 | FLOAT - signed comparison of the represented value (*) |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
759 | DOUBLE - signed comparison of the represented value (*) |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
760 | BYTE_ARRAY - unsigned byte-wise comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
761 | FIXED_LEN_BYTE_ARRAY - unsigned byte-wise comparison |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
762 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
763 | (*) Because the sorting order is not specified properly for floating |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
764 | point values (relations vs. total ordering) the following |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
765 | compatibility rules should be applied when reading statistics: |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
766 | - If the min is a NaN, it should be ignored. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
767 | - If the max is a NaN, it should be ignored. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
768 | - If the min is +0, the row group may contain -0 values as well. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
769 | - If the max is -0, the row group may contain +0 values as well. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
770 | - When looking for NaN values, min and max should be ignored. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
771 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
772 | When writing statistics the following rules should be followed: |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
773 | - NaNs should not be written to min or max statistics fields. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
774 | - If the computed max value is zero (whether negative or positive), |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
775 | `+0.0` should be written into the max statistics field. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
776 | - If the computed min value is zero (whether negative or positive), |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
777 | `-0.0` should be written into the min statistics field. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
778 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
779 | :initarg :type-order :initform nil :type |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
780 | (or null parquet-type-defined-order))) |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
781 | (:documentation |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
782 | "Union to specify the order used for the min_value and max_value fields for a |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
783 | column. This union takes the role of an enhanced enum that allows rich |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
784 | elements (which will be needed for a collation-based ordering in the future). |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
785 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
786 | Possible values are: |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
787 | * TypeDefinedOrder - the column uses the order defined by its logical or |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
788 | physical type (if there is no logical type). |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
789 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
790 | If the reader does not support the value of this union, min and max stats |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
791 | for this column should be ignored. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
792 | ")) |
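;; Sketch only, not part of the generated definitions: the write-side zero
;; rules from the type-order docstring above. It assumes MIN and MAX are
;; floats already known not to be NaN (NaN detection is implementation-
;; dependent and is left out). NORMALIZE-FLOAT-BOUNDS is a hypothetical
;; helper name, not something the DAT/PARQUET system is known to provide.

```lisp
(defun normalize-float-bounds (min max)
  "Return (values MIN MAX) adjusted for writing float statistics.
A zero minimum is written as negative zero and a zero maximum as
positive zero, so readers see the widest possible zero range."
  (values (if (zerop min)
              (- (float 0 min))   ; negative zero in the format of MIN
              min)
          (if (zerop max)
              (float 0 max)       ; positive zero in the format of MAX
              max)))
```

;; On implementations with signed zeros, calling this with a minimum of
;; 0.0d0 and a maximum of -0.0d0 should swap the signs of both zeros;
;; non-zero bounds pass through unchanged.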
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
793 | (defclass parquet-page-location (parquet-object) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
794 | ((offset :documentation "Offset of the page in the file |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
795 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
796 | :initarg :offset :type (signed-byte 64)) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
797 | (compressed-page-size :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
798 | "Size of the page, including header. Sum of compressed_page_size and header |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
799 | length |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
800 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
801 | :initarg :compressed-page-size :type (signed-byte 32)) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
802 | (first-row-index :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
803 | "Index within the RowGroup of the first row of the page. When an |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
804 | OffsetIndex is present, pages must begin on row boundaries |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
805 | (repetition_level = 0). |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
806 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
807 | :initarg :first-row-index :type (signed-byte 64)))) |
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
808 | (defclass parquet-offset-index (parquet-object) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
809 | ((page-locations :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
810 | "PageLocations, ordered by increasing PageLocation.offset. It is required |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
811 | that page_locations[i].first_row_index < page_locations[i+1].first_row_index. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
812 | " |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
813 | :initarg :page-locations :type (vector parquet-page-location)) |
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
814 | (unencoded-byte-array-data-bytes :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
815 | "Unencoded/uncompressed size for BYTE_ARRAY types. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
816 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
817 | See documentation for unencoded_byte_array_data_bytes in SizeStatistics for |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
818 | more details on this field. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
819 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
820 | :initarg :unencoded-byte-array-data-bytes :initform nil :type |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
821 | (or null (vector (signed-byte 64))))) |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
822 | (:documentation "Optional offsets for each data page in a ColumnChunk. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
823 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
824 | Forms part of the page index, along with ColumnIndex. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
825 | |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
826 | OffsetIndex may be present even if ColumnIndex is not. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
827 | ")) |
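;; Sketch only, not part of the generated definitions: a check of the two
;; ordering requirements stated in the page-locations docstring above. To
;; stay self-contained it models each page location as a hypothetical
;; (OFFSET . FIRST-ROW-INDEX) cons rather than a PARQUET-PAGE-LOCATION
;; instance, and it assumes "increasing" means strictly increasing.
;; PAGE-LOCATIONS-ORDERED-P is an illustrative name.

```lisp
(defun page-locations-ordered-p (locations)
  "True if LOCATIONS, a list or vector of (OFFSET . FIRST-ROW-INDEX)
conses, has strictly increasing offsets and first-row indexes."
  (every (lambda (prev cur)
           (and (< (car prev) (car cur))     ; ordered by offset
                (< (cdr prev) (cdr cur))))   ; first_row_index[i] < [i+1]
         locations
         ;; Pair each element with its successor; EVERY stops at the
         ;; shorter sequence, so empty and single-element inputs pass.
         (subseq locations (min 1 (length locations)))))
```

;; A reader could run a check like this before relying on binary search
;; over the page locations.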
640
642b3b82b20d
thrift fixes, org-get-with-inheritance init
Richard Westhaver <ellis@rwest.io>
parents:
637
diff
changeset
|
828 | (defclass parquet-column-index (parquet-object) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
829 | ((null-pages :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
830 | "A list of Boolean values to determine the validity of the corresponding |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
831 | min and max values. If true, a page contains only null values, and writers |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
832 | have to set the corresponding entries in min_values and max_values to |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
833 | byte[0], so that all lists have the same length. If false, the |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
834 | corresponding entries in min_values and max_values must be valid. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
835 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
836 | :initarg :null-pages :type (vector boolean)) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
837 | (min-values :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
838 | "Two lists containing lower and upper bounds for the values of each page |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
839 | determined by the ColumnOrder of the column. These may be the actual |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
840 | minimum and maximum values found on a page, but can also be (more compact) |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
841 | values that do not exist on a page. For example, instead of storing \"Blart |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
842 | Versenwald III\", a writer may set min_values[i]=\"B\", max_values[i]=\"C\". |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
843 | Such more compact values must still be valid values within the column's |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
844 | logical type. Readers must make sure that list entries are populated before |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
845 | using them by inspecting null_pages. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
846 | " |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
847 | :initarg :min-values :type (vector octet-vector)) |
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
848 | (max-values :initarg :max-values :type (vector octet-vector)) |
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
849 | (boundary-order :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
850 | "Stores whether both min_values and max_values are ordered and if so, in |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
851 | which direction. This allows readers to perform binary searches in both |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
852 | lists. Readers cannot assume that max_values[i] <= min_values[i+1], even |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
853 | if the lists are ordered. |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
854 | " |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
855 | :initarg :boundary-order :type parquet-boundary-order) |
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
856 | (null-counts :documentation |
635
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
857 | "A list containing the number of null values for each page |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
858 | " |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
859 | :initarg :null-counts :initform nil :type |
849f72b72b41
add back fuzz.lisp and proper codegen for parquet.json thrift definitions
Richard Westhaver <ellis@rwest.io>
parents:
diff
changeset
|
860 | (or null (vector (signed-byte 64)))) |
637
b88bf15f60d0
parquet tweaks, import ox-man
Richard Westhaver <ellis@rwest.io>
parents:
635
diff
changeset
|
861 | (repetition-level-histograms :documentation |
862 | "Contains repetition level histograms for each page |
863 | concatenated together. The repetition_level_histogram field on |
864 | SizeStatistics contains more details. |
865 | |
866 | When present the length should always be (number of pages * |
867 | (max_repetition_level + 1)) elements. |
868 | |
869 | Element 0 is the first element of the histogram for the first page. |
870 | Element (max_repetition_level + 1) is the first element of the histogram |
871 | for the second page. |
872 | |
873 | " |
874 | :initarg :repetition-level-histograms :initform nil :type |
875 | (or null (vector (signed-byte 64)))) |
876 | (definition-level-histograms :documentation |
877 | "Same as repetition_level_histograms except for definition levels. |
878 | |
879 | " |
880 | :initarg :definition-level-histograms :initform nil :type |
881 | (or null (vector (signed-byte 64))))) |
882 | (:documentation |
883 | "Optional statistics for each data page in a ColumnChunk. |
884 | |
885 | Forms part of the page index, along with OffsetIndex. |
886 | |
887 | If this structure is present, OffsetIndex must also be present. |
888 | |
889 | For each field in this structure, <field>[i] refers to the page at |
890 | OffsetIndex.page_locations[i] |
891 | ")) |
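;; The histogram slots above store one fixed-width histogram per page,
;; concatenated into a single vector. The sketch below shows how a reader
;; might slice out the histogram of one page. PAGE-LEVEL-HISTOGRAM and its
;; parameter names are hypothetical and are not part of the generated file;
;; MAX-LEVEL stands for max_repetition_level (or the definition-level
;; equivalent), which is assumed to be known from the schema.

```lisp
(defun page-level-histogram (histograms page max-level)
  "Return the level histogram of PAGE (0-based) from HISTOGRAMS.
HISTOGRAMS is the concatenated vector; each page owns (1+ MAX-LEVEL) entries."
  (let* ((width (1+ max-level))          ; entries per page
         (start (* page width))          ; first entry of this page
         (end (+ start width)))
    ;; The total length should be (number of pages) * (max-level + 1).
    (assert (zerop (mod (length histograms) width)))
    (assert (<= end (length histograms)))
    (subseq histograms start end)))

;; Two pages with max_repetition_level = 1, so two entries per page.
(page-level-histogram #(10 2 7 5) 1 1)
```

;; With this layout, element (max-level + 1) of the vector is the first
;; entry of the second page, matching the slot documentation.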
892 | (defclass parquet-aes-gcm-v1 (parquet-object) |
893 | ((aad-prefix :documentation "AAD prefix |
894 | " |
895 | :initarg :aad-prefix :initform nil :type (or null octet-vector)) |
896 | (aad-file-unique :documentation |
897 | "Unique file identifier part of AAD suffix |
898 | " |
899 | :initarg :aad-file-unique :initform nil :type |
900 | (or null octet-vector)) |
901 | (supply-aad-prefix :documentation |
902 | "In files encrypted with AAD prefix without storing it, |
903 | readers must supply the prefix |
904 | " |
905 | :initarg :supply-aad-prefix :initform nil :type (or null boolean)))) |
906 | (defclass parquet-aes-gcm-ctr-v1 (parquet-object) |
907 | ((aad-prefix :documentation "AAD prefix |
908 | " |
909 | :initarg :aad-prefix :initform nil :type (or null octet-vector)) |
910 | (aad-file-unique :documentation |
911 | "Unique file identifier part of AAD suffix |
912 | " |
913 | :initarg :aad-file-unique :initform nil :type |
914 | (or null octet-vector)) |
915 | (supply-aad-prefix :documentation |
916 | "In files encrypted with AAD prefix without storing it, |
917 | readers must supply the prefix |
918 | " |
919 | :initarg :supply-aad-prefix :initform nil :type (or null boolean)))) |
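;; The three slots shared by both AES classes describe where the AAD prefix
;; comes from: either it is stored in the file, or the reader has to supply
;; it. The sketch below shows one way a reader could resolve the prefix and
;; join it with the file-unique part. LEADING-AAD and its keyword names are
;; hypothetical, and the assumption that the prefix is simply followed by the
;; file-unique bytes should be checked against the Parquet encryption
;; specification before relying on it.

```lisp
(defun leading-aad (file-unique &key stored-prefix supply-aad-prefix caller-prefix)
  "Return the AAD prefix followed by FILE-UNIQUE as an octet vector.
When SUPPLY-AAD-PREFIX is true the prefix is not stored in the file,
so CALLER-PREFIX is required; otherwise STORED-PREFIX (possibly NIL) is used."
  (let ((prefix (if supply-aad-prefix
                    (or caller-prefix
                        (error "File requires a caller-supplied AAD prefix."))
                    stored-prefix)))
    ;; A missing prefix is treated as an empty octet vector.
    (concatenate '(vector (unsigned-byte 8)) (or prefix #()) file-unique)))

;; Prefix stored in the file.
(leading-aad #(9) :stored-prefix #(1 2))
;; Prefix supplied by the reader.
(leading-aad #(9) :supply-aad-prefix t :caller-prefix #(7))
```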
920 | (defclass parquet-encryption-algorithm (parquet-object) |
921 | ((aes-gcm-v1 :initarg :aes-gcm-v1 :initform nil :type |
922 | (or null parquet-aes-gcm-v1)) |
923 | (aes-gcm-ctr-v1 :initarg :aes-gcm-ctr-v1 :initform nil :type |
924 | (or null parquet-aes-gcm-ctr-v1)))) |
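;; Both slots of this class default to NIL. In the upstream Thrift
;; definition this type is, to my knowledge, a union, so a valid value would
;; have exactly one slot populated. The sketch below checks that on a
;; stand-in class. DEMO-ENCRYPTION-ALGORITHM and SELECTED-ALGORITHM are
;; hypothetical names used only for illustration.

```lisp
;; Stand-in with the same slot names as the generated class.
(defclass demo-encryption-algorithm ()
  ((aes-gcm-v1 :initarg :aes-gcm-v1 :initform nil)
   (aes-gcm-ctr-v1 :initarg :aes-gcm-ctr-v1 :initform nil)))

(defun selected-algorithm (alg)
  "Return the name and value of the single populated slot of ALG.
Signal an error unless exactly one slot is non-NIL."
  (let ((populated (loop for slot in '(aes-gcm-v1 aes-gcm-ctr-v1)
                         when (slot-value alg slot)
                           collect slot)))
    (unless (= (length populated) 1)
      (error "Expected exactly one algorithm, found ~a." populated))
    (values (first populated)
            (slot-value alg (first populated)))))

(selected-algorithm
 (make-instance 'demo-encryption-algorithm :aes-gcm-ctr-v1 :some-settings))
```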
925 | (defclass parquet-file-meta-data (parquet-object) |
926 | ((version :documentation "Version of this file |
927 | " |
928 | :initarg :version :type (signed-byte 32)) |
929 | (schema :documentation |
930 | "Parquet schema for this file. This schema contains metadata for all the columns. |
931 | The schema is represented as a tree with a single root. The nodes of the tree |
932 | are flattened to a list by doing a depth-first traversal. |
933 | The column metadata contains the path in the schema for that column which can be |
934 | used to map columns to nodes in the schema. |
935 | The first element is the root |
936 | " |
937 | :initarg :schema :type (vector parquet-schema-element)) |
938 | (num-rows :documentation "Number of rows in this file |
939 | " |
940 | :initarg :num-rows :type (signed-byte 64)) |
941 | (row-groups :documentation "Row groups in this file |
942 | " |
943 | :initarg :row-groups :type (vector parquet-row-group)) |
944 | (key-value-metadata :documentation "Optional key/value metadata |
945 | " |
946 | :initarg :key-value-metadata :initform nil :type |
947 | (or null (vector parquet-key-value))) |
948 | (created-by :documentation |
949 | "String for application that wrote this file. This should be in the format |
950 | <Application> version <App Version> (build <App Build Hash>). |
951 | e.g. impala version 1.0 (build 6cf94d29b2b7115df4de2c06e2ab4326d721eb55) |
952 | |
953 | " |
954 | :initarg :created-by :initform nil :type (or null string)) |
955 | (column-orders :documentation |
956 | "Sort order used for the min_value and max_value fields in the Statistics |
957 | objects and the min_values and max_values fields in the ColumnIndex |
958 | objects of each column in this file. Sort orders are listed in the order |
959 | matching the columns in the schema. The indexes are not necessarily the same |
960 | though, because only leaf nodes of the schema are represented in the list |
961 | of sort orders. |
962 | |
963 | Without column_orders, the meaning of the min_value and max_value fields |
964 | in the Statistics object and the ColumnIndex object is undefined. To ensure |
965 | well-defined behaviour, if these fields are written to a Parquet file, |
966 | column_orders must be written as well. |
967 | |
968 | The obsolete min and max fields in the Statistics object are always sorted |
969 | by signed comparison regardless of column_orders. |
970 | " |
971 | :initarg :column-orders :initform nil :type |
972 | (or null (vector parquet-column-order))) |
973 | (encryption-algorithm :documentation |
974 | "Encryption algorithm. This field is set only in encrypted files |
975 | with plaintext footer. Files with encrypted footer store algorithm id |
976 | in FileCryptoMetaData structure. |
977 | " |
978 | :initarg :encryption-algorithm :initform nil :type |
979 | (or null parquet-encryption-algorithm)) |
980 | (footer-signing-key-metadata :documentation |
981 | "Retrieval metadata of key used for signing the footer. |
982 | Used only in encrypted files with plaintext footer. |
983 | " |
984 | :initarg :footer-signing-key-metadata :initform nil :type |
985 | (or null octet-vector))) |
986 | (:documentation "Description for file metadata |
987 | ")) |
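;; The SCHEMA slot documents a tree that is flattened to a list by a
;; depth-first traversal, with the root first. The sketch below rebuilds the
;; tree from such a list. It assumes each element carries its child count
;; (a leaf is treated as having 0 children) and models elements as
;; (NAME . NUM-CHILDREN) pairs rather than PARQUET-SCHEMA-ELEMENT objects.
;; UNFLATTEN-SCHEMA and SCHEMA-LEAVES are hypothetical helpers.

```lisp
(defun unflatten-schema (elements)
  "Rebuild a nested (NAME child...) tree from ELEMENTS.
ELEMENTS is a list of (NAME . NUM-CHILDREN) pairs in depth-first order."
  (labels ((build ()
             ;; Consume one element, then recursively consume its children.
             (destructuring-bind (name . num-children) (pop elements)
               (cons name (loop repeat num-children collect (build))))))
    (build)))

(defun schema-leaves (tree)
  "Return the names of the leaf nodes of TREE in depth-first order."
  (if (null (rest tree))
      (list (first tree))
      (loop for child in (rest tree) append (schema-leaves child))))

(unflatten-schema '(("root" . 2) ("a" . 0) ("b" . 1) ("c" . 0)))
```

;; Only the leaves correspond to columns, which is why the indexes of
;; COLUMN-ORDERS do not line up with the indexes of the flattened schema.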
988 | (defclass parquet-file-crypto-meta-data (parquet-object) |
989 | ((encryption-algorithm :documentation |
990 | "Encryption algorithm. This field is only used for files |
991 | with encrypted footer. Files with plaintext footer store algorithm id |
992 | inside footer (FileMetaData structure). |
993 | " |
994 | :initarg :encryption-algorithm :type parquet-encryption-algorithm) |
995 | (key-metadata :documentation |
996 | "Retrieval metadata of key used for encryption of footer, |
997 | and (possibly) columns |
998 | " |
999 | :initarg :key-metadata :initform nil :type (or null octet-vector))) |
1000 | (:documentation "Crypto metadata for files with encrypted footer |
1001 | ")) |