Curling for web sites
I wanted information about ISO 3166-1 alpha-2 country codes. Google found me the definitive link (http://www.iso.org/iso/country_codes/iso_3166_code_lists.htm) but clicking on it showed the ISO website to be temporarily down for maintenance.
Rather than check back every few minutes or hunt for stale information in the Google cache, I got curl and bash to notify me when the site went live.
$ url=http://www.iso.org/iso/country_codes/iso_3166_code_lists.htm
$ curl -I $url
HTTP/1.1 302 Found
Date: Tue, 27 May 2008 08:00:44 GMT
Server: BIG-IP
Location: http://www.iso.org/error/sitedown.html
Via: 1.1 www.iso.org
Connection: close
Content-Type: text/html
curl -I fetches the page headers only, which in this case carry a 302 status code temporarily redirecting clients to the sitedown.html page. Using this information I wrote a simple while loop to poll the site every minute and determine when this status changed.
$ http_status() { curl -I -s $1 | head -1 | cut -d " " -f 2; }
$ while [ $(http_status $url) == 302 ]; do sleep 60; done; open $url
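As a quick sanity check, the helper can be pointed at any healthy page, where it should print 200 (illustrative output; example.com is just a stand-in URL):
$ http_status http://www.example.com/
200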
open is an OS X thing: when the loop completes, open just opens the web page in a browser tab.
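On Linux, xdg-open (from the xdg-utils package) plays the same role, so a one-line shim keeps the loop portable, assuming that package is installed:
$ alias open=xdg-open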
To run this command in the background, wrap it in parentheses, so the loop and open form a single job, and & it.
$ (while [ $(http_status $url) == 302 ]; do sleep 60; done; open $url)&
[1] 808
Here, the job has a handle of 1 and a process ID of 808. You can recover this information using jobs.
$ jobs
[1]+ Running ( while [ $(http_status $url) == 302 ]; do
    sleep 60;
done; open $url ) &
If you need to kill the job, kill %1 does the trick.
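If this comes up often, the whole pipeline can be rolled into a function. watchsite is a name I'm inventing here, a sketch rather than anything canonical; it reuses the http_status helper defined above:
$ watchsite() { while [ "$(http_status $1)" == 302 ]; do sleep 60; done; open $1; }
$ watchsite $url &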