This function scrapes a web page for all links (<a> tags) and extracts both the URLs and the link text.
Source
The source code of this function was taken from this gist.
Value
A tibble with two columns: link_text, containing the text of each link, and url, containing the absolute URL of each link. The tibble is sorted by URL and then by link text, and only unique links are included.
Examples
ScrapLinks("https://github.com/")
#> # A tibble: 123 × 2
#> link_text url
#> <chr> <chr>
#> 1 "" https:/github.com/
#> 2 "Reload" https:/github.com/
#> 3 "Jump to footnote 1" https:/github.com/#footnote-1
#> 4 "Jump to footnote 2" https:/github.com/#footnote-2
#> 5 "" https:/github.com/#footnote-ref-1
#> 6 "" https:/github.com/#footnote-ref-2
#> 7 "" https:/github.com/#hero
#> 8 "Skip to content" https:/github.com/#start-of-content
#> 9 "About" https:/github.com/about
#> 10 "Inclusion" https:/github.com/about/diversity
#> # ℹ 113 more rows
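
The behaviour described under Value (two columns, absolute URLs, unique rows, sorted by URL and then by link text) can be approximated with rvest, xml2, tibble, and dplyr. The following is only a minimal sketch under stated assumptions, not the package's actual implementation: the helper name scrape_links_sketch, its arguments, and the example.com page are hypothetical, and the sketch keeps only <a> tags that carry an href attribute.

```r
library(rvest)   # html_elements(), html_attr(), html_text2(), read_html()
library(xml2)    # url_absolute()
library(tibble)
library(dplyr)

# Hypothetical helper for illustration; not the source of ScrapLinks().
# `page` is a parsed HTML document, `base_url` is used to resolve relative links.
scrape_links_sketch <- function(page, base_url) {
  # Assumption: anchors without an href are skipped so no NA URLs appear.
  anchors <- html_elements(page, "a[href]")

  links <- tibble(
    link_text = html_text2(anchors),
    url = url_absolute(html_attr(anchors, "href"), base_url)
  )

  # Keep unique rows only, then sort by URL and by link text.
  links <- distinct(links)
  arrange(links, url, link_text)
}

# A small in-memory page, so the sketch runs without network access.
page <- read_html(
  '<html><body>
     <a href="/about">About</a>
     <a href="/about">About</a>
     <a href="#top"></a>
   </body></html>'
)

links <- scrape_links_sketch(page, "https://example.com/")
print(links)
```

For a live page, the same helper could be called with the result of read_html() on the page's URL, passing that URL as base_url.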