opengraph-extractor
Extract Open Graph tags from an HTML document and return them in a simple JSON data structure. Specifically, look for the canonical site, title, URL, summary, and image.
Twitter Cards
This uses the Twitter Cards markup. It's taken from tags looking like <meta name="twitter:site" content="@minorthoughts">.
- twitter:site
- twitter:title
- twitter:url
- twitter:description
- twitter:image
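As an illustration of how these tags can be read, here is a minimal sketch that uses only Python's standard `html.parser` module. It is not necessarily how `extract.py` works; the names `TwitterCardParser` and `extract_twitter_cards` are hypothetical, and keeping the first occurrence of a repeated tag is an assumption.

```python
from html.parser import HTMLParser

# The Twitter Cards names listed above.
TWITTER_TAGS = (
    "twitter:site",
    "twitter:title",
    "twitter:url",
    "twitter:description",
    "twitter:image",
)


class TwitterCardParser(HTMLParser):
    """Collect the content of <meta name="twitter:..."> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.cards = {}

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = dict(attrs)
        name = attr.get("name")
        content = attr.get("content")
        if name in TWITTER_TAGS and content is not None:
            # Assumption: if a tag repeats, the first occurrence wins.
            self.cards.setdefault(name, content)


def extract_twitter_cards(html):
    """Return a dict mapping Twitter Cards names to their content values."""
    parser = TwitterCardParser()
    parser.feed(html)
    return parser.cards
```

Calling `extract_twitter_cards(page_source)` returns a plain dictionary containing only the tags that were present in the page.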
Facebook Open Graph
This uses the Open Graph protocol, created by Facebook. It's taken from tags looking like <meta property="og:site_name" content="Minor Thoughts"/>.
- og:site_name
- og:title
- og:url
- og:description
- og:image
- og:image:url
- og:image:secure_url
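Open Graph tags use the `property` attribute rather than `name`, and there are three possible image tags. The sketch below, again built on the standard `html.parser` module, shows one way to collect them and choose a single image. The helper names are hypothetical, and the preference order (`og:image:secure_url`, then `og:image:url`, then `og:image`) is an assumption rather than documented behaviour of `extract.py`.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..."> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content:
            # Assumption: keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of all og:* properties found in the document."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags


def pick_image(tags):
    """Return one image URL, preferring the HTTPS variant (assumed order)."""
    for key in ("og:image:secure_url", "og:image:url", "og:image"):
        if key in tags:
            return tags[key]
    return None
```

`pick_image` returns `None` when the page declares no image, so callers can fall back to another markup family.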
Google+ / Schema.org
This uses the Article schema. It's taken from tags looking like <meta itemprop="name" content="New Prosecutors Are Reopening Old Cases Against Police Officers : Minor Thoughts"/>.
- publisher
- name
- headline
- description
- image
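To show how the three markup families could be folded into the site, title, URL, summary, and image fields mentioned at the top, here is a minimal sketch. The output key names, the `FIELDS` table, the `summarize` function, and above all the precedence order (Open Graph first, then Twitter Cards, then Schema.org) are assumptions made for illustration; the actual script may resolve conflicts differently.

```python
import json
from html.parser import HTMLParser

# Output field -> candidate (attribute, value) pairs, tried in order.
# The ordering is an assumption, not documented behaviour.
FIELDS = {
    "site": [("property", "og:site_name"), ("name", "twitter:site"),
             ("itemprop", "publisher")],
    "title": [("property", "og:title"), ("name", "twitter:title"),
              ("itemprop", "headline"), ("itemprop", "name")],
    "url": [("property", "og:url"), ("name", "twitter:url")],
    "summary": [("property", "og:description"), ("name", "twitter:description"),
                ("itemprop", "description")],
    "image": [("property", "og:image:secure_url"), ("property", "og:image:url"),
              ("property", "og:image"), ("name", "twitter:image"),
              ("itemprop", "image")],
}


class MetaParser(HTMLParser):
    """Index every <meta> tag by (attribute, value), e.g. ("itemprop", "name")."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        content = attr.get("content")
        if content is None:
            return
        for key in ("property", "name", "itemprop"):
            if attr.get(key):
                self.found.setdefault((key, attr[key]), content)


def summarize(html):
    """Return a JSON string with one value (or null) per output field."""
    parser = MetaParser()
    parser.feed(html)
    result = {}
    for field, candidates in FIELDS.items():
        # Take the first candidate present in the page, else None.
        result[field] = next(
            (parser.found[c] for c in candidates if c in parser.found), None
        )
    return json.dumps(result, indent=2)
```

Fields with no matching tag are emitted as `null`, so the JSON always has the same five keys.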