changeset 8: 6ac37a61456a
parent 7: d543f73892d3
child 9: 4839b0675118
author: Richard Westhaver <ellis@rwest.io>
date: Sat, 27 Jul 2024 02:45:34 -0400
files: parquet-parsing.org q-notes.org
description: bump
     1.1--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2+++ b/parquet-parsing.org	Sat Jul 27 02:45:34 2024 -0400
     1.3@@ -0,0 +1,38 @@
     1.4+* DAT/PARQUET
     1.5+https://github.com/apache/parquet-format
     1.6+https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift
     1.7+https://github.com/apache/parquet-testing
     1.8+https://github.com/apache/parquet-java
     1.9+** glossary
    1.10+- block :: same as HDFS block
    1.11+- file :: must contain the file metadata; need not contain any data
    1.12+- row-group :: a logical horizontal partitioning of the data into
    1.13+  rows. no physical structure is guaranteed for a row-group
    1.14+- column-chunk :: a chunk of the data for a particular column
    1.15+- page :: column chunks are divided into pages. a page is conceptually
    1.16+  indivisible in terms of compression/encoding. multiple page types
    1.17+  can be interleaved in a column chunk.
    1.18+
    1.19+A file consists of 1+ row-groups. A row-group contains exactly one
    1.20+column chunk per column. A column chunk contains one or more pages.
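The containment rules above (file, row-group, column chunk, page) can be sketched as a small data model. This is only an illustration of the hierarchy, not how any Parquet library represents it: the class names, the `kind` label on pages, and the `check_layout` helper are all hypothetical, and column names are assumed to be unique.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    # Smallest unit of compression/encoding; `kind` is a free-form label here.
    kind: str


@dataclass
class ColumnChunk:
    column: str
    pages: List[Page] = field(default_factory=list)


@dataclass
class RowGroup:
    chunks: List[ColumnChunk] = field(default_factory=list)


@dataclass
class ParquetFile:
    columns: List[str]
    row_groups: List[RowGroup] = field(default_factory=list)


def check_layout(f: ParquetFile) -> List[str]:
    """Return the violated containment rules (empty list if none)."""
    problems = []
    if not f.row_groups:
        problems.append("file has no row groups")
    for i, rg in enumerate(f.row_groups):
        names = [c.column for c in rg.chunks]
        # Exactly one chunk per column: same names, no extras, no duplicates.
        if sorted(names) != sorted(f.columns):
            problems.append(f"row group {i}: expected one chunk per column")
        for c in rg.chunks:
            if not c.pages:
                problems.append(f"row group {i}, column {c.column}: no pages")
    return problems
```

A file with columns `a` and `b` passes the check only if every row-group holds one chunk for `a` and one for `b`, each with at least one page.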
    1.21+
    1.22+** format summary
    1.23+#+begin_example
    1.24+  4-byte magic number "PAR1"
    1.25+  <Column 1 Chunk 1>
    1.26+  <Column 2 Chunk 1>
    1.27+  ...
    1.28+  <Column N Chunk 1>
    1.29+  <Column 1 Chunk 2>
    1.30+  <Column 2 Chunk 2>
    1.31+  ...
    1.32+  <Column N Chunk 2>
    1.33+  ...
    1.34+  <Column 1 Chunk M>
    1.35+  <Column 2 Chunk M>
    1.36+  ...
    1.37+  <Column N Chunk M>
    1.38+  File Metadata
    1.39+  4-byte length in bytes of file metadata (little endian)
    1.40+  4-byte magic number "PAR1"
    1.41+#+end_example
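The layout above implies that a reader starts from the end of the file: check the trailing magic, read the 4-byte little-endian length, then step back that many bytes to find the file metadata. Below is a minimal sketch of that step, using only the standard library. The function name `locate_metadata` and the synthetic buffer are assumptions for illustration. The real metadata is Thrift-encoded (see parquet.thrift) and decoding it is out of scope here.

```python
import struct
from typing import Tuple

MAGIC = b"PAR1"


def locate_metadata(buf: bytes) -> Tuple[int, int]:
    """Return (start, end) byte offsets of the file metadata within buf."""
    # Smallest possible file: leading magic + length field + trailing magic.
    if len(buf) < 12:
        raise ValueError("buffer too short to be a Parquet file")
    if buf[:4] != MAGIC or buf[-4:] != MAGIC:
        raise ValueError("missing PAR1 magic number")
    # The 4 bytes before the trailing magic hold the metadata length.
    (meta_len,) = struct.unpack("<I", buf[-8:-4])
    end = len(buf) - 8
    start = end - meta_len
    if start < 4:
        raise ValueError("metadata length exceeds file size")
    return start, end


# Synthetic buffer that follows the layout; the payloads are placeholders.
fake_meta = b"not-real-thrift"
blob = (
    MAGIC
    + b"column chunk bytes"
    + fake_meta
    + struct.pack("<I", len(fake_meta))
    + MAGIC
)
start, end = locate_metadata(blob)
```

Here `blob[start:end]` is the placeholder metadata. With a real file, those bytes would be handed to a Thrift decoder to obtain the row-group and column-chunk offsets.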
     2.1--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     2.2+++ b/q-notes.org	Sat Jul 27 02:45:34 2024 -0400
     2.3@@ -0,0 +1,33 @@
     2.4+* Queries
     2.5+Q --- Query languages
     2.6+
     2.7+EQL = Event Query Language
     2.8+SQL = Structured Query Language
     2.9+LQL = Logical Query Language
    2.10+GQL = Graph Query Language
    2.11+
    2.12+refs:
    2.13+https://tdop.github.io/
    2.14+https://howqueryengineswork.com/01-what-is-a-query-engine.html
    2.15+https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf
    2.16+https://en.wikipedia.org/wiki/Graph_Query_Language
    2.17+https://code.kx.com/q
    2.18+https://docs.xtdb.com/reference/main/xtql/queries.html
    2.19+https://neo4j.com/docs/cypher-manual
    2.20+https://eql.readthedocs.io/en/latest/
    2.21+https://clojure.github.io/clojure-contrib/doc/datalog.html
    2.22+https://www.researchgate.net/publication/2850953_Soft_Stratification_for_Magic_Set_Based_Query_Evaluation_in_Deductive_Databases
    2.23+http://sieve.info/
    2.24+https://www.researchgate.net/publication/221542994_SARI-SQL_Event_query_language_for_event_analysis
    2.25+https://www.elastic.co/guide/en/elasticsearch/reference/current/eql.html
    2.26+https://www.youtube.com/watch?v=8XUutFBbUrg
    2.27+https://www.scryer.pl/
    2.28+https://github.com/inconvergent/cl-grph
    2.29+https://jakewheat.github.io/sql-overview/
    2.30+https://github.com/ronsavage/SQL
    2.31+https://web.csulb.edu/colleges/coe/cecs/dbdesign/dbdesign.php
    2.32+https://github.com/nikodemus/screamer
    2.33+https://github.com/defunkydrummer/cl-gambol
    2.34+https://www.lispworks.com/products/knowledgeworks.html
    2.35+https://namin.seas.harvard.edu/files/namin/files/sql2c_jfp.pdf
    2.36+https://scala-lms.github.io/tutorials/query.html