opengraph-extractor

Extract Open Graph tags from an HTML document and return them in a simple JSON data structure. Specifically, look for the canonical site, title, URL, summary, and image.

Twitter Cards

This uses the Twitter Cards markup, taken from tags that look like <meta name="twitter:site" content="@minorthoughts">.

  • twitter:site
  • twitter:title
  • twitter:url
  • twitter:description
  • twitter:image
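
As an illustration of how these tags can be read, here is a minimal sketch using only Python's standard-library html.parser. It is not necessarily how extract.py does it; the names TwitterCardParser and extract_twitter_tags are hypothetical, and keeping the first occurrence of a repeated tag is an assumption.

```python
from html.parser import HTMLParser

# The five Twitter Cards tags listed above.
TWITTER_TAGS = (
    "twitter:site",
    "twitter:title",
    "twitter:url",
    "twitter:description",
    "twitter:image",
)


class TwitterCardParser(HTMLParser):
    """Hypothetical helper: collects <meta name="twitter:..."> tags."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        if name in TWITTER_TAGS and content is not None:
            # Assumption: the first occurrence of a tag wins.
            self.tags.setdefault(name, content)


def extract_twitter_tags(html):
    parser = TwitterCardParser()
    parser.feed(html)
    return parser.tags


sample = '<head><meta name="twitter:site" content="@minorthoughts"></head>'
print(extract_twitter_tags(sample))
```

Self-closing tags such as <meta ... /> go through the same handler, because HTMLParser routes them to handle_starttag by default.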

Facebook Open Graph

This uses the Open Graph protocol, created by Facebook. It's taken from tags that look like <meta property="og:site_name" content="Minor Thoughts"/>.

  • og:site_name
  • og:title
  • og:url
  • og:description
  • og:image
  • og:image:url
  • og:image:secure_url
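
Because three of these tags can carry the image, a reader has to choose one. The sketch below shows one possible approach with the standard-library html.parser. The names OpenGraphParser and pick_image are hypothetical, the example.com URLs are placeholders, and the preference order (secure URL first) is an assumption rather than something this project documents.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Hypothetical helper: collects <meta property="og:..."> tags."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        content = attributes.get("content")
        if prop.startswith("og:") and content is not None:
            # Assumption: the first occurrence of a property wins.
            self.tags.setdefault(prop, content)


def pick_image(tags):
    # Assumed preference order: HTTPS variant, then og:image, then og:image:url.
    for key in ("og:image:secure_url", "og:image", "og:image:url"):
        if tags.get(key):
            return tags[key]
    return None


sample = """
<meta property="og:site_name" content="Minor Thoughts"/>
<meta property="og:image" content="http://example.com/a.png"/>
<meta property="og:image:secure_url" content="https://example.com/a.png"/>
"""
parser = OpenGraphParser()
parser.feed(sample)
print(parser.tags["og:site_name"])
print(pick_image(parser.tags))
```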

Google+ / Schema.org

This uses the Article schema. It's taken from tags that look like <meta itemprop="name" content="New Prosecutors Are Reopening Old Cases Against Police Officers : Minor Thoughts"/>.

  • publisher
  • name
  • headline
  • description
  • image
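
To show how tags from the three vocabularies could be reduced to the five fields named at the top (site, title, URL, summary, image), here is a rough sketch. It assumes the raw tags were already collected into one dictionary keyed by tag name. The output key names, the FALLBACKS table, the summarize function, and the precedence order (Open Graph, then Twitter Cards, then Schema.org) are all assumptions for illustration, not the project's documented behavior.

```python
import json

# Hypothetical precedence table: for each output field, the raw tag names to
# try in order. Both the field names and the ordering are assumptions.
FALLBACKS = {
    "site": ("og:site_name", "twitter:site", "publisher"),
    "title": ("og:title", "twitter:title", "headline", "name"),
    "url": ("og:url", "twitter:url"),
    "summary": ("og:description", "twitter:description", "description"),
    "image": (
        "og:image:secure_url",
        "og:image",
        "og:image:url",
        "twitter:image",
        "image",
    ),
}


def summarize(raw_tags):
    """Return the first non-empty candidate for each field, or None."""
    result = {}
    for field, candidates in FALLBACKS.items():
        result[field] = next(
            (raw_tags[key] for key in candidates if raw_tags.get(key)), None
        )
    return result


raw = {
    "twitter:site": "@minorthoughts",
    "og:site_name": "Minor Thoughts",
    "name": "New Prosecutors Are Reopening Old Cases Against Police Officers : Minor Thoughts",
}
print(json.dumps(summarize(raw), indent=2))
```

Fields with no matching tag come out as null in the JSON, which keeps the shape of the result the same for every page.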