I know about utilities like html2text and BeautifulSoup, but the issue is that they also extract the JavaScript on the page and mix it into the text, which makes the two hard to separate.
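from bs4 import BeautifulSoup
htmlDom = BeautifulSoup(webPage, "html.parser")
htmlDom.findAll(text=True)  # returns every text node, including the contents of <script> tags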
Alternately,
from stripogram import html2text
extract = html2text(webPage)
Both of these also extract all the JavaScript on the page, which is not what I want.
I just want the readable text, the text you could copy from the page in your browser.
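To make the goal concrete, here is a minimal sketch of the kind of output I am after. It uses only the standard library's html.parser instead of either library above, and the names VisibleTextParser and visible_text are made up for illustration. It only approximates what a browser shows: for example, it does not account for elements hidden by CSS.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped element
        self.chunks = []      # non-empty text fragments, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep the fragment only if it is non-empty and outside script/style.
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def visible_text(html):
    """Return the text fragments of an HTML string, one per line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)
```

Calling visible_text(webPage) on the page source would then give the text without the script and style contents, one fragment per line.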