
Quickstart

Eager to get started? Start by authenticating. Once authenticated, you can run LSD lang in the workbench or from Python.

If you’re non-technical, explore public queries on the nexus or build your own using our browser (download) (docs).

Example 1

Let’s take a look at a simple example to extract data from a web page:

FROM https://news.ycombinator.com
|> SELECT a AS post
|> GROUP BY span.titleline



So what did that code do?

  1. We start by telling LSD we’d like to pull data FROM Hacker News.
  2. We then SELECT the a tag (post title) to be extracted.
  3. We GROUP the a tags BY the span.titleline class.
  4. On running the query, LSD accesses the page, extracts the data, and makes a table of the a tags (titles) in a column called post.

FROM, SELECT, and GROUP BY are keywords, each with its own role in controlling the browser and wrangling data on the web.
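To make the SELECT and GROUP BY steps concrete, here is a rough Python analogy using only the standard library. It is not how LSD works internally; it only mimics the result (one row per span.titleline, with the a text in a post column) on a small hypothetical HTML snippet rather than the live Hacker News page. The class name and sample markup are made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical sample markup imitating a titles list; nothing is fetched from the web.
SAMPLE_HTML = """
<span class="titleline"><a href="https://example.com/a">First post</a></span>
<span class="titleline"><a href="https://example.com/b">Second post</a></span>
<a href="/login">login</a>
"""


class TitlelineLinks(HTMLParser):
    """Collect the text of every <a> nested inside a span.titleline."""

    def __init__(self):
        super().__init__()
        self.rows = []          # one dict per matched link, like one table row
        self._in_group = False  # currently inside a span.titleline?
        self._in_link = False   # currently inside an <a> within that span?

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "titleline" in classes:
            self._in_group = True
        elif tag == "a" and self._in_group:
            self._in_link = True
            self.rows.append({"post": ""})

    def handle_data(self, data):
        if self._in_link:
            self.rows[-1]["post"] += data

    def handle_endtag(self, tag):
        # Simplification: any closing </span> ends the group (no nested spans).
        if tag == "a":
            self._in_link = False
        elif tag == "span":
            self._in_group = False


parser = TitlelineLinks()
parser.feed(SAMPLE_HTML)
for row in parser.rows:
    print(row)
```

The login link is skipped because it sits outside any span.titleline, which is the effect grouping has in the query above.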

To run this code from Python, you can use the following script:

# The Postgres adapter used to connect to LSD
import psycopg2

# Set up a connection to LSD via Postgres
conn = psycopg2.connect("host='lsd.so' dbname='andrea@lsd.so' password='<api key>'")

with conn.cursor() as curs:
  curs.execute("""
    PORTAL <| OPEN |

FROM https://news.ycombinator.com
|> SELECT a AS post
|> GROUP BY span.titleline
  """)
  rows = curs.fetchall()
  for row in rows:
    print(row)

Remember to install psycopg2 and fill in your API key (available from your profile). Notice that the query syntax in Python is the same as the code we ran in the workbench.
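If you would rather not paste the API key into the script, one option is to read it from an environment variable and wrap the connection logic in a small helper. The sketch below assumes the same connection string shape as the script above; the function names and the LSD_API_KEY variable name are hypothetical, not part of LSD.

```python
import os


def lsd_dsn(user, api_key, host="lsd.so"):
    """Build a connection string in the same shape as the script above."""
    return f"host='{host}' dbname='{user}' password='{api_key}'"


def run_query(query, user, api_key):
    """Execute an LSD query over the Postgres connection and return all rows."""
    import psycopg2  # imported lazily so lsd_dsn() works without the package

    conn = psycopg2.connect(lsd_dsn(user, api_key))
    try:
        with conn.cursor() as curs:
            curs.execute(query)
            return curs.fetchall()
    finally:
        conn.close()


# Example usage (needs network access, psycopg2, and a real key):
# rows = run_query(
#     "FROM https://news.ycombinator.com\n"
#     "|> SELECT a AS post\n"
#     "|> GROUP BY span.titleline",
#     user="andrea@lsd.so",
#     api_key=os.environ["LSD_API_KEY"],  # hypothetical variable name
# )
```

Closing the connection in a finally block keeps the helper safe to call repeatedly, even when a query fails.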

Example 2

PORTAL <| OPEN |

post_container <| div.details |
post <| a |
domain <| a.domain |
author <| a.u-author.h-card |

FROM https://lobste.rs/
|> GROUP BY post_container
|> SELECT post, domain, author



So what did that code do?

  1. We start by telling LSD we’d like to open the PORTAL (a window to see our SQL execute). In this case, it’s a screenshot of the page we’re pulling data from.
  2. We then ASSIGN variables ‘post_container’, ‘post’, ‘domain’, and ‘author’ to their respective tags ‘div.details’, ‘a’, ‘a.domain’, and ‘a.u-author.h-card’.
  3. Finally, we GROUP the results BY post_container (div.details), much as we grouped by span.titleline in the first example, and SELECT the post, domain, and author columns.

This example is similar to the first, but it shows how to activate the portal, assign variables to tags, and pull multiple columns from the page (in a repeating structure). The next example digs deeper and shows you how to programmatically control a browser.
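When calling LSD from Python, the variable assignments can be generated from a dictionary instead of being written out by hand. The following is a minimal sketch of that idea; build_query is a hypothetical helper, and it only produces the query text shown in Example 2 without running it.

```python
def build_query(url, aliases, group_by, select, portal=True):
    """Assemble an LSD query string with `name <| selector |` assignments."""
    lines = []
    if portal:
        lines += ["PORTAL <| OPEN |", ""]
    # One assignment line per alias, in insertion order.
    lines += [f"{name} <| {selector} |" for name, selector in aliases.items()]
    lines += [
        "",
        f"FROM {url}",
        f"|> GROUP BY {group_by}",
        f"|> SELECT {', '.join(select)}",
    ]
    return "\n".join(lines)


query = build_query(
    url="https://lobste.rs/",
    aliases={
        "post_container": "div.details",
        "post": "a",
        "domain": "a.domain",
        "author": "a.u-author.h-card",
    },
    group_by="post_container",
    select=["post", "domain", "author"],
)
print(query)
```

The resulting string can be passed to curs.execute() in the same way as the query in the first example.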

Example 3

PORTAL <| OPEN |

calculators <| https://www.smooth-on.com/support/calculators/ |
pour_on_mold <| div[data-calcid="pour-mold"] |
product_dropdown <| #pour-prod |
dropdown_value <| "24.7" |
model_volume_input <| #pour-model-volume |
model_volume <| "12" |
box_volume_input <| #pour-box-volume |
box_volume <| "20" |
calculate_button <| #pour-calculate |
estimate <| #pour-results |

FROM calculators
|> CLICK ON pour_on_mold
|> CHOOSE IN product_dropdown dropdown_value
|> ENTER INTO model_volume_input model_volume
|> ENTER INTO box_volume_input box_volume
|> CLICK ON calculate_button
|> SELECT estimate



So what did that code do?

Ignoring the portal opening and variable assignments (though you’ll notice the portal is a video this time), the query performs these steps:

  1. Navigate to the web page.
  2. Click on the ‘Pour on Mold Estimator’ button.
  3. Choose ‘24.7’ (the option value corresponding to ‘Compact 45’) from the dropdown.
  4. Enter ‘12’ into the ‘Model Volume’ input.
  5. Enter ‘20’ into the ‘Box Volume’ input.
  6. Click on the ‘Calculate’ button.
  7. Select the ‘Estimate’ element (#pour-results) so its contents are returned.
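The steps above map one-to-one onto the piped lines of the query, so from Python they can be kept as a list and joined into the pipeline text. This is only a sketch of that idea: build_pipeline is a hypothetical helper, and it assumes the variables (calculators, pour_on_mold, and so on) are assigned earlier in the same query, as in Example 3.

```python
def build_pipeline(source, steps):
    """Join a FROM source and a list of (keyword, *operands) steps with |>."""
    lines = [f"FROM {source}"]
    lines += [f"|> {' '.join(step)}" for step in steps]
    return "\n".join(lines)


# Each tuple is one browser action: the keyword first, then its operands.
steps = [
    ("CLICK ON", "pour_on_mold"),
    ("CHOOSE IN", "product_dropdown", "dropdown_value"),
    ("ENTER INTO", "model_volume_input", "model_volume"),
    ("ENTER INTO", "box_volume_input", "box_volume"),
    ("CLICK ON", "calculate_button"),
    ("SELECT", "estimate"),
]

pipeline = build_pipeline("calculators", steps)
print(pipeline)
```

Keeping the actions in a list makes it easy to reorder them or insert an extra step before the query is sent.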
