When you want to vertically merge more than one repeating container, where you would otherwise GROUP BY a single expression, use the AGGREGATE keyword.
FROM <url>
|> AGGREGATE GROUPS <css_selector>, <css_selector>
|> SELECT <css_selector>, <css_selector>
For example, a page may have two different repeating containers that contain the same field selectors as children:
┌──────────────────────┐
│.container_selector_a │
│ │
│ ┌──────────────────┐│
│ │.field_selector_a ││
│ └──────────────────┘│
│ ┌──────────────────┐│
│ │.field_selector_b ││
│ └──────────────────┘│
└──────────────────────┘
┌──────────────────────┐
│.container_selector_b │
│ │
│ ┌──────────────────┐│
│ │.field_selector_a ││
│ └──────────────────┘│
│ ┌──────────────────┐│
│ │.field_selector_b ││
│ └──────────────────┘│
└──────────────────────┘
┌──────────────────────┐
│.container_selector_a │
│ │
│ ┌──────────────────┐│
│ │.field_selector_a ││
│ └──────────────────┘│
│ ┌──────────────────┐│
│ │.field_selector_b ││
│ └──────────────────┘│
└──────────────────────┘
┌──────────────────────┐
│.container_selector_b │
│ │
│ ┌──────────────────┐│
│ │.field_selector_a ││
│ └──────────────────┘│
│ ┌──────────────────┐│
│ │.field_selector_b ││
│ └──────────────────┘│
└──────────────────────┘
The corresponding query to select all instances of .field_selector_a and .field_selector_b would look like the following:
container_a <| .container_selector_a |
container_b <| .container_selector_b |
field_a <| .field_selector_a |
field_b <| .field_selector_b |
FROM <url>
|> AGGREGATE GROUPS container_a, container_b
|> SELECT field_a, field_b
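To make the idea of a vertical merge concrete, here is a minimal Python sketch of the behavior described above. It is not the query engine's implementation. It assumes that containers are matched in document order and that each matched container yields one row; the `aggregate_groups` function and the sample data are hypothetical.

```python
# Hypothetical model of the page: containers in document order,
# each with a CSS class and the text of its child fields.
page = [
    {"class": "container_selector_a", "field_selector_a": "a1", "field_selector_b": "b1"},
    {"class": "container_selector_b", "field_selector_a": "a2", "field_selector_b": "b2"},
    {"class": "container_selector_a", "field_selector_a": "a3", "field_selector_b": "b3"},
    {"class": "container_selector_b", "field_selector_a": "a4", "field_selector_b": "b4"},
]


def aggregate_groups(containers, group_classes, fields):
    """Assumed semantics: keep every container that matches any group class,
    in document order, and emit one row of the selected fields per container."""
    rows = []
    for container in containers:
        if container["class"] in group_classes:
            rows.append({field: container.get(field) for field in fields})
    return rows


rows = aggregate_groups(
    page,
    group_classes={"container_selector_a", "container_selector_b"},
    fields=["field_selector_a", "field_selector_b"],
)
for row in rows:
    print(row)
```

Under these assumptions, both container types contribute rows to a single result, which is the sense in which the merge is "vertical".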
In a given Wikipedia article, section headers sit between paragraphs and carry CSS classes like the following:
┌──────────────────────┐
│.mw-heading2 │
└──────────────────────┘
┌──────────────────────┐
│.mw-heading3 │
└──────────────────────┘
┌──────────────────────┐
│.mw-heading2 │
└──────────────────────┘
┌──────────────────────┐
│.mw-heading3 │
└──────────────────────┘
To obtain the section headers alone, we can use the following query with AGGREGATE GROUPS:
wiki <| https://en.wikipedia.org/wiki/Memex |
heading_a <| .mw-heading2 |
heading_b <| .mw-heading3 |
heading <| div |
FROM wiki
|> AGGREGATE GROUPS heading_a, heading_b
|> SELECT heading
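For comparison, the same result can be approximated outside the query language. The sketch below uses only Python's standard `html.parser` module and collects the text of every `div` whose class list includes `.mw-heading2` or `.mw-heading3`, in document order. The `SAMPLE_HTML` markup and the `HeadingCollector` class are hypothetical simplifications; real article markup is more involved, and this is not how the query above is executed internally.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup standing in for a real article.
SAMPLE_HTML = """
<div class="mw-heading mw-heading2"><h2>Development</h2></div>
<p>Paragraph text.</p>
<div class="mw-heading mw-heading3"><h3>Details</h3></div>
<p>More text.</p>
<div class="mw-heading mw-heading2"><h2>Legacy</h2></div>
"""

HEADING_CLASSES = {"mw-heading2", "mw-heading3"}


class HeadingCollector(HTMLParser):
    """Collects (class, text) pairs for matching heading containers."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # div nesting depth inside a matched container
        self.matched = None  # class that matched the current container
        self.current = []    # text pieces of the container being read
        self.headings = []   # results in document order

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        if self.depth > 0:
            # Nested div inside a container that already matched.
            self.depth += 1
            return
        classes = set((dict(attrs).get("class") or "").split())
        hit = classes & HEADING_CLASSES
        if hit:
            self.depth = 1
            self.matched = sorted(hit)[0]
            self.current = []

    def handle_endtag(self, tag):
        if tag == "div" and self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                text = "".join(self.current).strip()
                self.headings.append((self.matched, text))

    def handle_data(self, data):
        if self.depth > 0:
            self.current.append(data)


collector = HeadingCollector()
collector.feed(SAMPLE_HTML)
for css_class, text in collector.headings:
    print(css_class, text)
```

The paragraphs between headings are skipped because they sit outside every matched container, which mirrors how the two heading groups are merged into one list while the rest of the page is ignored.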