
CSS Selectors

CSS selectors are patterns that match HTML elements. In LSD, selectors play the role that columns play in traditional SQL: each one picks out specific data from a web page. This lets you target elements with familiar CSS syntax while keeping SQL’s querying capabilities.


Definition

In an ordinary SQL SELECT statement, the columns are fields in a table:

SELECT <column> FROM <table>;

In LSD SQL, the columns are CSS selectors:

FROM url
|> SELECT <css_selector>

For information on getting the value of an attribute and not the text content, see our documentation on attributes.

Examples

Simple

Suppose you wanted to get the tagline on our website. The CSS selector for that would be (spoiler alert!) the h1 tag.

FROM https://lsd.so
|> SELECT h1



Or you could write the same statement with a variable, which makes the output easier to read:

tagline <| h1 |

FROM https://lsd.so
|> SELECT tagline



Complex

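When you need finer-grained matching, you can use more advanced selectors, such as combinators and attribute selectors, to match elements that follow a particular pattern. One example is the "lobotomized owl" selector, which matches any element that immediately follows a sibling element: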

* + *
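To make the owl selector's behavior concrete, here is a minimal Python sketch of which elements `* + *` would match. It is not how LSD evaluates selectors; it only mimics the adjacent-sibling rule using the standard library's `html.parser`. The class name `OwlMatcher`, the helper `owl_matches`, and the sample markup are hypothetical, and the sketch assumes well-formed markup with explicit closing tags.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they never open a new nesting level.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class OwlMatcher(HTMLParser):
    """Records the tag name of every element that has a preceding sibling element."""

    def __init__(self):
        super().__init__()
        # One counter per open element: how many child elements it has seen so far.
        self.child_counts = [0]
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # `* + *` matches when the parent has already seen at least one child element.
        if self.child_counts[-1] > 0:
            self.matches.append(tag)
        self.child_counts[-1] += 1
        if tag not in VOID_TAGS:
            self.child_counts.append(0)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.child_counts) > 1:
            self.child_counts.pop()


def owl_matches(html):
    matcher = OwlMatcher()
    matcher.feed(html)
    return matcher.matches


# Hypothetical sample markup: only the two <p> elements follow a sibling.
SAMPLE_HTML = "<article><h2>Title</h2><p>First</p><p>Second</p></article>"
print(owl_matches(SAMPLE_HTML))
```

In the sample, `article` and `h2` are each the first child of their parent, so only the two `p` elements are reported.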

For example, suppose we wanted links from the front page of Hacker News, but only when they point to something on GitHub. We could do that using a prefix attribute selector with the expression a[href^="https://github.com"].

hn <| https://news.ycombinator.com |
link_on_github <| a[href^="https://github.com"] |
link <| a@href |

FROM hn
|> GROUP BY link_on_github
|> SELECT link
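For readers less familiar with the `^=` operator, the following Python sketch shows which links a prefix attribute selector like the one above would keep. It is an illustration of the matching rule only, not of LSD's internals. The names `PrefixLinkCollector` and `github_links` and the sample markup are hypothetical; the sketch relies only on the standard library's `html.parser`.

```python
from html.parser import HTMLParser


class PrefixLinkCollector(HTMLParser):
    """Collects href values of <a> tags whose href starts with a given prefix."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # [href^="..."] matches only when the attribute value begins with the prefix.
        if href is not None and href.startswith(self.prefix):
            self.links.append(href)


def github_links(html, prefix="https://github.com"):
    collector = PrefixLinkCollector(prefix)
    collector.feed(html)
    return collector.links


# Hypothetical sample markup standing in for a page of links.
SAMPLE_HTML = """
<a href="https://github.com/example/repo">A repo</a>
<a href="https://example.com/post">A blog post</a>
<a href="https://gist.github.com/example">A gist</a>
"""

print(github_links(SAMPLE_HTML))
```

Note that the gist link is skipped: a prefix match compares the start of the whole attribute value, so a URL on a subdomain such as gist.github.com does not begin with https://github.com.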



