Database |> Language

LSD SQL is an internet-first programming language and database that lets you query web data directly using familiar SQL syntax, with no schema definitions required. Our database provides SQL without ontology: the grammar of a SELECT statement already describes the desired output structure, so no CREATE TABLE is needed beforehand. Simply specify a URL as your “table,” and use CSS selectors to determine which HTML elements to extract.

Key Differences from PostgreSQL

LSD SQL adapts traditional SQL syntax to work with web data.

  • URLs as Tables: Replace traditional table names with any web URL. LSD fetches and parses the content automatically.
  • CSS selectors as Columns: Use any valid CSS selector to extract data from the HTML elements. No table definitions required - the web page is the table.
  • Time-based Filtering (coming soon): Apply temporal filters at the query level to get historical or real-time data. Time becomes a first-class query parameter rather than just another column.

Example

For example, get Hacker News posts and links with a simple SELECT statement:

FROM
  https://news.ycombinator.com
SELECT
  a AS post
, a@href AS link
GROUP BY
  span.titleline;

Or, using our pipe operator syntax:

FROM https://news.ycombinator.com
|> SELECT a AS post, a@href AS link
|> GROUP BY span.titleline
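
To make the mapping from query to output concrete, here is a rough Python sketch of what the query above conceptually asks for. It is not LSD's implementation. It assumes that GROUP BY span.titleline yields one row per matching container and that each column takes the first matching a element inside it; the sample markup, the class TitlelineRows, and the function select_posts are hypothetical names used only for illustration.

```python
from html.parser import HTMLParser

# Inline sample standing in for a fetched page; the markup is illustrative only.
SAMPLE_HTML = """
<table>
  <tr><td><span class="titleline"><a href="https://example.com/a">First post</a></span></td></tr>
  <tr><td><span class="titleline"><a href="https://example.com/b">Second post</a>
      <span class="sitebit"> (<a href="from?site=example.com">example.com</a>)</span></span></td></tr>
</table>
"""


class TitlelineRows(HTMLParser):
    """Collect one row per <span class="titleline">, using its first <a>."""

    def __init__(self):
        super().__init__()
        self.rows = []            # finished rows: {"post": ..., "link": ...}
        self._span_depth = 0      # > 0 while inside a titleline span
        self._row = None          # row being built for the current group
        self._in_anchor = False   # True while reading the group's first <a>

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span":
            if self._span_depth > 0:
                # Nested span inside the group: track depth so we know when the group ends.
                self._span_depth += 1
            elif "titleline" in (attrs.get("class") or "").split():
                # Start of a new group, i.e. a new output row (assumed GROUP BY semantics).
                self._span_depth = 1
                self._row = {"post": None, "link": None}
        elif tag == "a" and self._span_depth > 0 and self._row["post"] is None:
            # First <a> in the group: `a@href AS link`, and start collecting `a AS post`.
            self._row["link"] = attrs.get("href")
            self._row["post"] = ""
            self._in_anchor = True

    def handle_data(self, data):
        if self._in_anchor:
            self._row["post"] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_anchor = False
        elif tag == "span" and self._span_depth > 0:
            self._span_depth -= 1
            if self._span_depth == 0:
                self.rows.append(self._row)
                self._row = None


def select_posts(html):
    """Return a list of {"post", "link"} rows, one per titleline group."""
    parser = TitlelineRows()
    parser.feed(html)
    parser.close()
    return parser.rows


if __name__ == "__main__":
    for row in select_posts(SAMPLE_HTML):
        print(row)
```

The sketch reads from an inline string so that it runs without network access. The point of the comparison is that the two-column SELECT replaces all of the traversal bookkeeping above.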


Keywords and Concepts

Concepts:

For querying:

For data manipulation:

  • MAP - Return sitemap and interactive elements
  • ASSIGN - Define variables
  • RUN - Execute commands
  • SCAN - (coming soon)
  • ZIP - Combine multiple queries

For navigation:

  • CRAWL - Return sitemap
  • DIVE - Navigate nested content
  • ENTER - Enter text entry field

For browser automation:

  • CLICK - Interact with elements
  • TYPE - (coming soon)
  • HOVER - Mouse over elements

For data extraction:

  • HTML - Get raw HTML
  • TEXT - Get readable text from a page
  • MARKDOWN - Parse markdown
  • PDF - Extract from PDFs
  • URL - Obtain resolved URLs
