
Attributes

HTML elements often contain valuable data in their attributes (like URLs in href or file paths in src). In CSS, the @ symbol introduces at-rules (such as @media) but never appears in selectors, so LSD uses it to extract these values. The symbol also doubles as a mnemonic for getting an attribute.

Definition

To obtain an attribute value from an element, add @ followed by the attribute name after the CSS selector for the element you’re interested in. The general form is:

FROM <url>
|> SELECT <css_selector>@<attribute> AS <your_label>

Delimiter

The delimiter is the symbol that separates the CSS selector from the attribute. In LSD, we use @.

(element).(class)@(attribute) 

For example:

a@href        # Gets href attribute from anchor tags
img@src       # Gets src attribute from image tags
div@class     # Gets class attribute from div tags

Digging deeper, you can also use the delimiter with a more specific selector, such as an element qualified by a class. For example:

a.external@href       # Gets href from anchor tags with class "external"
img.avatar@src        # Gets src from img tags with class "avatar"
div.post@data-id      # Gets data-id from div elements with class "post"
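
To make the role of the delimiter concrete, the following Python sketch splits an expression such as a.external@href into its CSS selector and attribute name. It is only an illustration of the idea, not LSD's actual parser; the function name split_attribute_selector is hypothetical, and the sketch assumes the selector itself contains no @ character.

```python
from typing import Optional, Tuple


def split_attribute_selector(expression: str) -> Tuple[str, Optional[str]]:
    """Split 'selector@attribute' into (selector, attribute).

    Hypothetical helper for illustration only. Returns (expression, None)
    when the expression has no @ delimiter, i.e. it is a plain selector.
    """
    # rpartition splits on the last "@", so everything before it is
    # treated as the CSS selector and everything after as the attribute.
    selector, delimiter, attribute = expression.rpartition("@")
    if not delimiter:
        return expression, None
    return selector, attribute


examples = ["a@href", "img.avatar@src", "div.post@data-id", "a.domain"]
parsed = [split_attribute_selector(e) for e in examples]
```

The last expression has no delimiter, so it stays a plain selector with no attribute attached.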

Example

Let’s take the following query, which lists the posts on the front page of Lobsters.

post_container <| div.details |
post <| a |
domain <| a.domain |
author <| a.u-author.h-card |

FROM https://lobste.rs/
|> GROUP BY post_container
|> SELECT post, domain, author



However, if you want not only the post title provided by the author but also the link itself, you need the href attribute of the anchor tag behind post. To get it, add post@href to the SELECT clause.

post_container <| div.details |
post <| a |
domain <| a.domain |
author <| a.u-author.h-card |

FROM https://lobste.rs/
|> GROUP BY post_container
|> SELECT post, post@href, domain, author
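
As a rough picture of what selecting both post and post@href yields, the Python sketch below pairs each anchor's text with its href attribute using only the standard library. It is not how LSD works internally; the class name AnchorCollector and the sample markup are assumptions made for illustration, and the markup is not the real Lobsters page.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects a (text, href) pair for every <a> element (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.rows = []           # finished (text, href) pairs
        self._href = None        # href of the anchor currently open
        self._text = []          # text fragments inside that anchor
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._in_anchor = True
            self._href = dict(attrs).get("href")  # None if href is absent
            self._text = []

    def handle_data(self, data):
        if self._in_anchor:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self.rows.append(("".join(self._text).strip(), self._href))
            self._in_anchor = False


# Hypothetical markup, loosely shaped like a list of posts.
SAMPLE_HTML = """
<div class="details"><a href="https://example.com/one">First post</a></div>
<div class="details"><a href="https://example.com/two">Second post</a></div>
"""

collector = AnchorCollector()
collector.feed(SAMPLE_HTML)
collector.close()
rows = collector.rows
```

Each entry in rows holds the element's text (what post returns) next to its attribute value (what post@href returns).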



For readability, the same query can instead define a separate variable, post_link, which applies the attribute selector to the CSS selector a itself rather than to the variable post.

post_container <| div.details |
post <| a |
post_link <| a@href |
domain <| a.domain |
author <| a.u-author.h-card |

FROM https://lobste.rs/
|> GROUP BY post_container
|> SELECT post, post_link, domain, author



