GH-590: Allow VARIANT value to be omitted #591
@alamb @emkornfield @zeroshade Please take a look. I found that arrow-rs already allows the value array to be omitted: https://github.com/apache/arrow-rs/blob/7616e10f12894232eacb807993e55427a067d061/parquet-variant-compute/src/variant_array.rs#L858 |
|
Could you help review this? @aihuaxu @steveloughran |
|
I don't think this is correct. The test cases are the ones that we use in Parquet Java and Iceberg to ensure that we can read variant data as long as it can be interpreted correctly. Being liberal about what we accept does not mean that it is a good idea for writers to produce values that way. We don't need to adjust the spec; this is just a reasonable choice for readers so they don't needlessly fail. |
|
Thanks @rdblue for the explanation! Then I agree that this spec doesn't need any change. Perhaps we have two follow-ups to avoid future confusion:
|
Rationale for this change
Allow VARIANT value to be omitted. There are some cases in parquet-testing (https://github.com/apache/parquet-testing/tree/master/shredded_variant) which don't have a value field.
What changes are included in this PR?
Updated the descriptions in the variant-related Markdown files.
Do these changes have PoC implementations?
No.
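For illustration only, here is a minimal Python sketch of the asymmetry raised in the review: a reader can tolerate a missing value field, while a writer still emits it. The function names and the dict-of-columns layout are hypothetical and are not taken from Parquet Java, Iceberg, or arrow-rs. The writer-side check assumes, as the discussion suggests, that the spec stays unchanged and writers keep producing value.

```python
from typing import Any, Dict, List


def read_value_column(columns: Dict[str, List[Any]], num_rows: int) -> List[Any]:
    """Lenient reader (hypothetical): an absent `value` column is read as all-null."""
    if "value" not in columns:
        # Assumption: the rest of the group (e.g. typed_value) still lets the
        # data be interpreted, so the reader does not need to fail here.
        return [None] * num_rows
    return list(columns["value"])


def check_writer_fields(field_names: List[str]) -> None:
    """Strict writer (hypothetical): refuse to write a group without `value`."""
    if "value" not in field_names:
        raise ValueError("writer should emit a `value` field")


# Reader side: a file without `value` is still readable.
print(read_value_column({"typed_value": [1, 2]}, num_rows=2))

# Writer side: the same layout is rejected before anything is written.
try:
    check_writer_fields(["metadata", "typed_value"])
except ValueError as err:
    print("rejected:", err)
```

Keeping the two checks separate mirrors the point made in the thread: accepting such files on read does not make them a recommended output for writers.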