WebScraper uses the Integrity v8 engine to quickly scan a website, and can output the extracted data as CSV or JSON. It can also download images to a folder.
Easy to scan a site - just enter the starting URL and press "Go"
Easy to export - choose the columns you want
Plenty of extraction options, including HTML elements with certain classes or IDs, regular expressions, or the entire content in a number of formats (HTML, plain text, Markdown)
'Helper' utilities within the app make it easy to find a suitable class / ID or produce a regular expression (regex) to extract the data you want
Since v4.1, can download all discovered images to a folder
Configuration of various limits on the crawl and the output file size
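
To illustrate what extracting by class, extracting by regular expression, and exporting chosen columns as CSV mean in practice, here is a minimal Python sketch. It is not WebScraper's own implementation or API: the names ClassTextExtractor, extract_by_class, to_csv and the SAMPLE markup are hypothetical, and the parser assumes reasonably well-formed HTML.

```python
import csv
import io
import re
from html.parser import HTMLParser

# Void elements never have a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Hypothetical helper: collect the text of elements with a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.depth = 0      # greater than 0 while inside a matching element
        self.buffer = []
        self.results = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth or self.class_name in classes:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # The matching element has closed: store its accumulated text.
            self.results.append("".join(self.buffer).strip())
            self.buffer = []

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def extract_by_class(html, class_name):
    """Return the text of every element whose class list contains class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.results


def to_csv(header, rows):
    """Serialise the chosen columns as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return out.getvalue()


# Made-up sample page, used only for this illustration.
SAMPLE = """
<div class="product"><h2 class="title">Blue Mug</h2><span class="price">$4.50</span></div>
<div class="product"><h2 class="title">Red Mug</h2><span class="price">$5.00</span></div>
"""

titles = extract_by_class(SAMPLE, "title")          # extraction by class
prices = re.findall(r"\$(\d+\.\d{2})", SAMPLE)      # extraction by regex
csv_text = to_csv(["title", "price"], zip(titles, prices))
print(csv_text)
```

The class-based route suits data that sits in consistently marked-up elements, while the regex route is useful when the value has a recognisable textual pattern but no convenient class or ID.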
What's new in WebScraper
Adds the setting 'Legacy webview'. The new default is to use the up-to-date WebKit webview for rendering; however, the legacy version may work better in some cases and so is retained as an option.
The setting 'Attempt authentication' has been relabeled 'handle cookies' (the button's function remains unchanged) because it's sometimes advantageous to have cookie handling switched on regardless of whether you're attempting to authenticate, and the new label describes what the button actually does.