What is this?

We build tools that bring the internet to life

What is LSD?

Conceptual overview

LSD is a fork of PostgreSQL designed to work as intended by the paper that introduced the relational model underlying SQL. One line from that paper's abstract is worth highlighting:

Activities of users at terminals and most application programs should
remain unaffected when the internal representation is changed and even
when some aspects of the external representation are changed.

As such, extracting information from a web page should work the same way regardless of whether the underlying CSS changes, hence the flexibility of the language and browser.

What can you do with LSD?

Note: when connecting with a driver such as psycopg2, you will need to set autocommit mode to True.

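import psycopg2

# Connect to the hosted LSD instance
conn = psycopg2.connect("dbname='lsd' host='lsd.so'")
# Autocommit mode must be enabled, as noted above
conn.set_session(autocommit=True)
with conn.cursor() as curs:
    curs.execute("<query>")  # replace <query> with your SQL statement
    many_rows = curs.fetchall()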

In LSD, tables don't need to be defined beforehand, because a SELECT statement already implies the desired structure of the output. Take, for example, the following expression:

$ psql -h lsd.so -U you
you=> SELECT
          a
      FROM
          https://news.ycombinator.com
      GROUP BY
          span.titleline;

Here the URL is treated as a table identifier, and the "a" column being selected is the CSS selector for an anchor tag (a link on the page). You don't have to declare "a" before running the query. The GROUP BY clause says that, rather than grabbing only the first element that matches the CSS selector, LSD should turn each "span.titleline" element into a row and then query for the anchor tag inside each of those containers.
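
As a rough illustration of the row-per-group behaviour described above, the sketch below emulates the query in plain Python on a tiny, made-up page. It is not how LSD is implemented; the markup, the helper names, and the use of the standard library's ElementTree (which only handles well-formed markup) are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the Hacker News front page.
PAGE = """
<html><body>
  <span class="titleline"><a href="https://a.example">First post</a></span>
  <span class="titleline"><a href="https://b.example">Second post</a></span>
</body></html>
"""


def has_class(element, name):
    """Return True if the element's class attribute contains `name`."""
    return name in (element.get("class") or "").split()


def select_a_group_by_titleline(markup):
    """Emulate: SELECT a FROM <page> GROUP BY span.titleline."""
    root = ET.fromstring(markup.strip())
    rows = []
    for group in root.iter("span"):
        if not has_class(group, "titleline"):
            continue
        # Each matching container becomes one row; take its first anchor.
        anchor = group.find(".//a")
        rows.append({"a": anchor.text if anchor is not None else None})
    return rows


rows = select_a_group_by_titleline(PAGE)
```

Each "span.titleline" container yields exactly one row, which is the point of the GROUP BY clause: without it, only the first matching anchor on the page would be returned.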

If no matching element is found within the designated group, LSD checks whether the group itself is the element you intended to match. Consider the following query, which gets all email links from a page:

$ psql -h lsd.so -U you
you=> SELECT
          a@href AS email
      FROM
          <url>
      WHERE
          a@href LIKE 'mailto:%'
      GROUP BY
          a;

Since the "@" symbol isn't used in CSS selectors, LSD uses it as a mnemonic for "attribute". Here it selects the "href" attribute of each matched anchor tag and aliases it into a column named "email".
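
The email query above can be approximated in a few lines of Python to show the two ideas at work: falling back to the group element itself when nothing inside it matches, and reading an attribute rather than the element's text. This is only a sketch under assumptions: the page markup and the helper names are invented for illustration, and it says nothing about LSD's actual internals.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with one email link and one ordinary link.
PAGE = """
<html><body>
  <p><a href="mailto:hello@example.com">Email us</a></p>
  <p><a href="https://example.com/about">About</a></p>
</body></html>
"""


def first_match(group, tag):
    """Return the first descendant named `tag`, else the group itself if it matches."""
    found = group.find(f".//{tag}")
    if found is None and group.tag == tag:
        return group
    return found


def email_links(markup):
    """Emulate: SELECT a@href AS email ... WHERE a@href LIKE 'mailto:%' GROUP BY a."""
    root = ET.fromstring(markup.strip())
    emails = []
    for group in root.iter("a"):              # GROUP BY a
        match = first_match(group, "a")       # no nested <a>, so the group is used
        href = match.get("href", "") if match is not None else ""  # a@href
        if href.startswith("mailto:"):        # LIKE 'mailto:%'
            emails.append({"email": href})
    return emails


emails = email_links(PAGE)
```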

Because the anchor tag is both the selected CSS selector and the GROUP BY target, this query grabs the href attribute of every anchor tag on the page whose href starts with "mailto:", which amounts to every email link on the page. Now consider the following query:

$ psql -h lsd.so -U you
you=> SELECT
          a:nth-child(7) AS jobs
      FROM
          https://news.ycombinator.com
      GROUP BY
          span.titleline;

Here, none of the "titleline" containers has a 7th anchor tag child, so this query returns just one row: the "jobs" link at the top of the page.

Examples

Each of these queries returns live data, as it appears on the respective website.

Football headlines

SELECT
    h3.title.h5 AS headline
    , span.date_time__KhlCV.time AS how_long_ago
    FROM
    https://www.goal.com/en-us/category/transfers/1/k94w8e1yy9ch14mllpf4srnks
    GROUP BY
    div.content-wrapper;

Sequoia's Fintech investments

SELECT
    td.company-listing__cell-wide.company-listing__text.u-md-hide AS company_tagline
    , th.company-listing__cell-wide.company-listing__head AS company_name
    , td.u-lg-hide AS company_stage
    , li AS investor
    FROM
    https://www.sequoiacap.com/our-companies/?_categories=fintech&_sort=stage_current-asc#all-panel
    GROUP BY
    tr.aos-init.aos-animate;

Sequoia companies that have gotten acquired

SELECT
    th.company-listing__cell-wide.company-listing__head AS company_name
    , td.company-listing__cell-wide.company-listing__text.u-md-hide AS company_function
    , td.u-lg-hide AS company_status
    , li AS investor
    FROM
    https://www.sequoiacap.com/our-companies/?_stage_current=acquired&_sort=stage_current-asc#all-panel
    GROUP BY
    tr.aos-init;

Hummingbird VC portfolio company names

SELECT
    p.paragraph.break-spaces.hide-tablet AS what_does_it_do
    , p.text-sm.hide-tablet AS location
    , p.text-sm AS name
    FROM
    https://www.hummingbird.vc/portfolio
    GROUP BY
    div.grid-item-row;

Propel VC portfolio company names

SELECT
    a.company-t AS portco
    FROM
    https://www.propel.vc/investments
    GROUP BY
    div.company_link_cover;

Abstract VC portfolio company names

SELECT
    a AS portco
    FROM
    https://www.abstractvc.com/companies
    WHERE a != '' AND a != 'About'
    GROUP BY
    div.MuiGrid-item;

Abstract VC's portfolio company logos

SELECT
    img.featured-logo AS portco
    FROM
    https://abstractvc.com/companies
    GROUP BY
    div.featured-inner;

Obligatory demo getting posts from Hacker News

SELECT
    a AS post_title,
    a@href AS post_link
    FROM
    https://news.ycombinator.com
    GROUP BY
    span.titleline;

Scraping a list of web scraping services from a scraping service website

SELECT
    a@href AS scraping_tool
    FROM
    https://www.octoparse.com/blog/top-30-free-web-scraping-software
    GROUP BY
    strong;

What is the (REST) API?

Conceptual overview

If you are looking to access the SQL database from a client-side application, then you may be interested in the general /api endpoint.

What can you do with the (REST) API?

Simply provide the SQL statement that you'd like to run on your database in the query parameter (the example below is for JavaScript):

fetch(
  `https://lsd.so/api?query=${
    encodeURIComponent(
      'SELECT a FROM https://news.ycombinator.com GROUP BY span.titleline;'
    )
  }`
);
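
For readers calling the endpoint from Python rather than JavaScript, a roughly equivalent sketch using only the standard library might look like the following. The endpoint and query parameter come from the example above; the helper names, the timeout, and the choice to return the raw response text (the response format is not documented here, and UTF-8 is assumed) are assumptions.

```python
from urllib.parse import quote
from urllib.request import urlopen

API_ENDPOINT = "https://lsd.so/api"  # endpoint taken from the JavaScript example


def build_api_url(sql):
    """Percent-encode the SQL statement, much like encodeURIComponent does."""
    return f"{API_ENDPOINT}?query={quote(sql, safe='')}"


def run_query(sql, timeout=30):
    """Send the query and return the raw response body as text.

    The response format is not assumed here, so no parsing is attempted.
    """
    with urlopen(build_api_url(sql), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = build_api_url(
    "SELECT a FROM https://news.ycombinator.com GROUP BY span.titleline;"
)
```

Encoding matters because the statement itself contains a URL; without it, the ":" and "/" characters would be read as part of the request address.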

What is /knawledge?

Conceptual overview

Knawledge lets you use natural language to obtain information about what's on a page.

What can you do with the knawledge API?

To make a request for knawledge, simply hit the knawledge API with the natural language query in the query parameter (the below example is for JavaScript)

fetch(
  `https://lsd.so/knawledge?query=${
    encodeURIComponent(
      'give me every post and link on hacker news'
    )
  }`
);

This works by attempting to extract the following pattern from the input string:

<noun (field)> <noun (field)> ... <preposition> <group of nouns (source)>
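
To make the pattern above concrete, here is a deliberately naive sketch that splits a request at its first preposition, treating the words before it as fields and the words after it as the source. It is a toy illustration only, not the knawledge implementation: the preposition list, the filler-word list, and the function name are all assumptions.

```python
# Assumed word lists; the real service presumably uses proper language analysis.
PREPOSITIONS = {"on", "from", "in", "at"}
FILLER_WORDS = {"give", "me", "get", "every", "all", "the", "a", "and"}


def split_request(text):
    """Return (fields, source) for '<noun> <noun> ... <preposition> <source>'."""
    words = text.lower().split()
    for index, word in enumerate(words):
        if word in PREPOSITIONS:
            # Words before the preposition, minus filler, are the fields.
            fields = [w for w in words[:index] if w not in FILLER_WORDS]
            # Everything after the preposition names the source.
            source = " ".join(words[index + 1:])
            return fields, source
    # No preposition found: treat the remaining words as fields with no source.
    return [w for w in words if w not in FILLER_WORDS], None


fields, source = split_request("give me every post and link on hacker news")
```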

What are the aliases?

Conceptual overview

The aliases bank can be seen as the developing bibliography for the web. If you've ever spent time digging through DevTools to find the exact CSS selector, then this is what you'd be interested in.

What can I do with the aliases API?

Provide either the URL or the title of the page you're interested in as a query parameter to get a list of the aliases that have been labeled for that page (the example below is for JavaScript):

fetch(
  `https://lsd.so/aliases?url=${
    encodeURIComponent('https://news.ycombinator.com')
  }`
);
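
A Python sketch of building the same request is shown below. The `url` parameter appears in the example above; the `title` parameter name and the rule that exactly one of the two must be supplied are assumptions inferred from the description, and the helper name is hypothetical.

```python
from urllib.parse import quote

ALIASES_ENDPOINT = "https://lsd.so/aliases"  # endpoint taken from the example above


def build_aliases_url(url=None, title=None):
    """Build an aliases request for a page identified by URL or by title."""
    # Assumption: exactly one of the two identifiers should be given.
    if (url is None) == (title is None):
        raise ValueError("provide exactly one of url or title")
    # "title" as a parameter name is assumed; only "url" is shown in the docs.
    name, value = ("url", url) if url is not None else ("title", title)
    return f"{ALIASES_ENDPOINT}?{name}={quote(value, safe='')}"


aliases_url = build_aliases_url(url="https://news.ycombinator.com")
```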

What is the Bicycle?

Conceptual overview

The Bicycle is an early version of what will be the modern equivalent of a memex.

What can you do with the Bicycle?

In its present state, the Bicycle is a single-page web browser that, when "activated" via Control-K, provides a highlighter so that you can hover and click on the elements in front of you that you are interested in.

The guiding principle: if information exists on the page in front of you, you should be able to grab it in the structured format of your liking.

What is Lucy?

Conceptual overview

A natural language iMessage interface to interact with LSD

What can you do with Lucy?

Announcement: We made our self-hosted iMessage Python client open source!

Users can send links with notes to Lucy to add to their “Me” tab and to be shown on Bicycle mobile. Users can also make natural language requests to LSD by using the command “Lucy” before the request. To start interacting with Lucy, go to the login page

Bicycle (mobile)

Conceptual overview

Bicycle for mobile is a companion app to the Bicycle desktop browser. It lets users see the links that they and others have shared, along with what people say about each link in its comments.

What can you do with Bicycle for mobile?

Bicycle for mobile has two tabs. The first tab, “Explore”, shows a chronological view of the latest links that everyone has shared through Lucy. When a link is clicked, an expanded view shows the link and all comments left on it. The second tab, “Your Links”, shows the same view but specifically for links that you have shared

Need help?

Contact

If you need support with any of LSD's products, contact us and we'll do our best to resolve your issues.