Trafilatura: Discover and Extract Text Data on the Web
install
uv tool install trafilatura
usage
trafilatura -u <url>
trafilatura -u <url> > _.txt
sed "s/$/\n/" _.txt > _2.md
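The sed step above puts an empty line after every extracted line, so that a Markdown renderer treats each line as its own paragraph. The following is a minimal sketch of the same pipeline wrapped in shell functions. The names `double_space` and `url_to_md` are hypothetical and not part of trafilatura. It uses `sed G` in place of `s/$/\n/` on the assumption that `\n` in a replacement is only reliable in GNU sed, while `G` behaves the same in GNU and BSD sed.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical, not part of trafilatura.

# Append an empty line after every input line (same effect as the sed step above).
# `sed G` appends the (empty) hold space, preceded by a newline, to each line.
double_space() {
    sed G
}

# Fetch a page with the trafilatura CLI and write double-spaced text to a file.
# $1 = URL, $2 = output file (defaults to _2.md, as in the commands above).
url_to_md() {
    trafilatura -u "$1" | double_space > "${2:-_2.md}"
}

# Usage (requires network access and trafilatura on PATH):
#   url_to_md <url> page.md
```

Piping directly into `double_space` skips the intermediate `_.txt` file; keep the two-step form if the raw extraction is also needed.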