Do y'all know about Textise? I didn't see it mentioned in a quick search. https://www.textise.net/
It can be used with the DuckDuckGo bang !textise
It also works over Tor, where I can use it as a proxy to avoid Cloudflare checkpoints.
I don't think it is open source, but I'm not completely sure.
Copy from the site intro:
Textise is a new way of looking at the Web. It’s an internet tool that removes everything from a web page except for its text. In practice, this means that images, forms, scripts, adverts, they all go, leaving plain text. Find out more here… (https://textise.wordpress.com/about-textise/)
How to use this page
- Type or paste the URL of a web page into the box below and click "Textise". A text only version of the web page will be displayed.
- Type a search term into the box, select a search engine from the drop-down list, and click "Search". You will be taken to a text only version of the search results.
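To give a rough idea of what "removes everything from a web page except for its text" can mean in practice, here is a minimal Python sketch using only the standard library. It is not Textise's actual implementation, which I haven't seen. The names TextOnly and textise_like are made up for illustration, and the list of skipped tags is my own assumption.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text nodes, ignoring the contents of non-text elements."""

    # Assumption: these are the elements whose contents should never be shown.
    SKIP = {"script", "style", "noscript", "template"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep non-empty text that is not inside a skipped element.
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.parts.append(text)


def textise_like(html):
    """Return the text of an HTML string, one text node per line."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.parts)


sample = (
    '<body><script>alert(1)</script><style>p{color:red}</style>'
    '<h1>Hello</h1><img src="a.png"><p>World &amp; more</p></body>'
)
print(textise_like(sample))
```

Images disappear simply because they carry no text nodes. Form controls such as inputs vanish for the same reason, although label and button text would still come through in this sketch.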
Does Textise support what Reader mode doesn't? If Reader mode can't determine the central content, does Textise have more logic to do so?
Given the wording, I also want to point out that a website doesn't have to actively or explicitly support reader mode. It only has to follow HTML standards for marking up its content, which is a general accessibility practice too.
Technically, you’re correct.
However, many websites don’t follow the appropriate HTML standards and just abuse h1 and p.
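To illustrate the difference, here is a small Python sketch of the simplest possible "find the central content" rule: take the text inside article or main elements if the page has them, otherwise fall back to all of the text. This is only an assumption about how such detection could work. Real reader modes use more elaborate heuristics, and the names MainContent and central_text are hypothetical.

```python
from html.parser import HTMLParser


class MainContent(HTMLParser):
    """Separate text inside <article>/<main> from the rest of the page."""

    LANDMARKS = {"article", "main"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.depth = 0          # nesting depth inside landmark elements
        self.inside = []        # text found inside a landmark
        self.everything = []    # all text, used as the fallback

    def handle_starttag(self, tag, attrs):
        if tag in self.LANDMARKS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in self.LANDMARKS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Script/style filtering is left out to keep the sketch short.
        text = data.strip()
        if not text:
            return
        self.everything.append(text)
        if self.depth > 0:
            self.inside.append(text)


def central_text(html):
    """Prefer landmark content; fall back to all text if none is marked."""
    parser = MainContent()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.inside or parser.everything)


marked = "<nav>Home</nav><article><h1>Title</h1><p>Body</p></article><footer>Contact</footer>"
unmarked = "<div>Home</div><div><h1>Title</h1><p>Body</p></div>"
print(central_text(marked))    # navigation and footer are dropped
print(central_text(unmarked))  # nothing is marked, so everything is kept
```

On the page built only from div, h1 and p, the rule has nothing to go on and the navigation text stays in the output.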
I just tried it with Google.com and it seems to remove all HTML markup, leaving only the text.
It’s useful in some cases, such as one-page WordPress websites that present their story, mission, products, etc…