Am I using this right? I keep getting "Fetch failed 'not_fetched_yet?', …" debug messages, one for each resource on a page.
Interactive Elixir (1.7.3) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Crawler.crawl("http://elixir-lang.org", max_depths: 2)
{:ok,
 %{
   assets: [],
   depth: 0,
   encode_uri: false,
   html_tag: "a",
   interval: 0,
   max_depths: 2,
   modifier: Crawler.Fetcher.Modifier,
   parser: Crawler.Parser,
   queue: #PID<0.256.0>,
   retrier: Crawler.Fetcher.Retrier,
   save_to: nil,
   scraper: Crawler.Scraper,
   timeout: 5000,
   url: "http://elixir-lang.org",
   url_filter: Crawler.Fetcher.UrlFilter,
   user_agent: "Crawler/1.0.0 (https://github.com/fredwu/crawler)",
   workers: 10
 }}
iex(2)>
23:18:58.487 [debug] "Fetch failed 'not_fetched_yet?', with opts: %{assets: [], content_type: \"text/html\", depth: 1, encode_uri: false, headers: [{\"Server\", \"GitHub.com\"}, {\"Content-Type\", \"text/html; charset=utf-8\"}, {\"Last-Modified\", \"Wed, 05 Sep 2018 18:30:34 GMT\"}, {\"ETag\", \"\\\"5b9020ca-4e80\\\"\"}, {\"Access-Control-Allow-Origin\", \"*\"}, {\"Expires\", \"Fri, 07 Sep 2018 02:57:17 GMT\"}, {\"Cache-Control\", \"max-age=600\"}, {\"X-GitHub-Request-Id\", \"CBCE:36A8:66F1A1:8881F9:5B91E6B4\"}, {\"Content-Length\", \"20096\"}, {\"Accept-Ranges\", \"bytes\"}, {\"Date\", \"Fri, 07 Sep 2018 03:18:58 GMT\"}, {\"Via\", \"1.1 varnish\"}, {\"Age\", \"0\"}, {\"Connection\", \"keep-alive\"}, {\"X-Served-By\", \"cache-cmh8820-CMH\"}, {\"X-Cache\", \"MISS\"}, {\"X-Cache-Hits\", \"0\"}, {\"X-Timer\", \"S1536290339.583748,VS0,VE24\"}, {\"Vary\", \"Accept-Encoding\"}, {\"X-Fastly-Request-ID\", \"6be5015e3bd8a7dbc5292078f32e40a48f6fe0ce\"}], html_tag: \"a\", interval: 0, max_depths: 2, modifier: Crawler.Fetcher.Modifier, parser: Crawler.Parser, queue: #PID<0.256.0>, referrer_url: \"http://elixir-lang.org\", retrier: Crawler.Fetcher.Retrier, save_to: nil, scraper: Crawler.Scraper, timeout: 5000, url: \"http://elixir-lang.org\", url_filter: Crawler.Fetcher.UrlFilter, user_agent: \"Crawler/1.0.0 (https://github.com/fredwu/crawler)\", workers: 10}."
Any suggestions?
Michael