# HG changeset patch
# User Richard Westhaver
# Date 1722062734 14400
# Node ID 6ac37a61456a27fbb4474ace491e9a8eedbbf122
# Parent  d543f73892d3a6d21cb7a7b31c9ba6ef67bed7b6
bump

diff -r d543f73892d3 -r 6ac37a61456a parquet-parsing.org
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/parquet-parsing.org	Sat Jul 27 02:45:34 2024 -0400
@@ -0,0 +1,38 @@
+* DAT/PARQUET
+https://github.com/apache/parquet-format
+https://github.com/apache/parquet-format/blob/master/src/main/thrift/parquet.thrift
+https://github.com/apache/parquet-testing
+https://github.com/apache/parquet-java
+** glossary
+- block :: same as HDFS block
+- file :: file metadata is required, data is not
+- row-group :: a logical horizontal partitioning of the data into
+  rows. no physical rep is guaranteed for row-group
+- column-chunk :: a chunk of the data for a particular column
+- page :: column chunks are divided into pages. a page is conceptually
+  indivisible in terms of compression/encoding. multiple page types
+  can be interleaved in a column chunk.
+
+Files consist of 1+ row-groups. A row-group contains exactly one
+column chunk per column. Column chunks contain one or more pages.
+
+** format summary
+#+begin_example
+  4-byte magic number "PAR1"
+
+
+  ...
+
+
+  ...
+
+  ...
+
+
+  ...
+
+  File Metadata
+  4-byte length in bytes of file metadata (little endian)
+  4-byte magic number "PAR1"
+#+end_example
diff -r d543f73892d3 -r 6ac37a61456a q-notes.org
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/q-notes.org	Sat Jul 27 02:45:34 2024 -0400
@@ -0,0 +1,33 @@
+* Queries
+Q --- Query languages
+
+EQL = Event Query Language
+SQL = Structured Query Language
+LQL = Logical Query Language
+GQL = Graph Query Language
+
+refs:
+https://tdop.github.io/
+https://howqueryengineswork.com/01-what-is-a-query-engine.html
+https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf
+https://en.wikipedia.org/wiki/Graph_Query_Language
+https://code.kx.com/q
+https://docs.xtdb.com/reference/main/xtql/queries.html
+https://neo4j.com/docs/cypher-manual
+https://eql.readthedocs.io/en/latest/
+https://clojure.github.io/clojure-contrib/doc/datalog.html
+https://www.researchgate.net/publication/2850953_Soft_Stratification_for_Magic_Set_Based_Query_Evaluation_in_Deductive_Databases
+http://sieve.info/
+https://www.researchgate.net/publication/221542994_SARI-SQL_Event_query_language_for_event_analysis
+https://www.elastic.co/guide/en/elasticsearch/reference/current/eql.html
+https://www.youtube.com/watch?v=8XUutFBbUrg
+https://www.scryer.pl/
+https://github.com/inconvergent/cl-grph
+https://jakewheat.github.io/sql-overview/
+https://github.com/ronsavage/SQL
+https://web.csulb.edu/colleges/coe/cecs/dbdesign/dbdesign.php
+https://github.com/nikodemus/screamer
+https://github.com/defunkydrummer/cl-gambol
+https://www.lispworks.com/products/knowledgeworks.html
+https://namin.seas.harvard.edu/files/namin/files/sql2c_jfp.pdf
+https://scala-lms.github.io/tutorials/query.html