Nicolas Bouliane

Python 3 urllib examples

This article is the missing manual for Python 3’s urllib. It shows you how to do basic things that are not clearly described in the official documentation.

The requests library is much easier to use than urllib. Only use urllib if you want to avoid external dependencies.

Request a URL, read response content

To make an HTTP request or download a page with urllib, call urllib.request.urlopen().

import urllib.request

response = urllib.request.urlopen('https://nicolasbouliane.com')
response_content = response.read()

print(response_content)
# b'<!doctype html>\n<html...'

A few notes:
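One detail worth spelling out: response.read() returns bytes, not str, so the content usually has to be decoded before you treat it as text. The following is a minimal sketch, not the only way to do it. The fetch_text helper and its fallback_encoding parameter are hypothetical names, and falling back to UTF-8 when the server sends no charset is an assumption that will not hold for every site.

```python
import urllib.request


def fetch_text(url, fallback_encoding='utf-8'):
    """Download a URL and return its body as str (hypothetical helper)."""
    # The with block closes the connection when we are done with it
    with urllib.request.urlopen(url) as response:
        raw_bytes = response.read()  # read() returns bytes, not str
        # Use the charset from the Content-Type header if the server sent one;
        # otherwise fall back to UTF-8 (an assumption, not a guarantee)
        encoding = response.headers.get_content_charset() or fallback_encoding
    return raw_bytes.decode(encoding)


# Example call (needs network access):
# page = fetch_text('https://nicolasbouliane.com')
```

Wrapping urlopen() in a with statement is optional, but it saves you from calling response.close() yourself.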

Get response status code

The response object exposes the status code as response.status. For error statuses, urlopen() raises an HTTPError instead of returning a response, and the code is available as error.code.

from urllib.error import HTTPError
import urllib.request

try:
    response = urllib.request.urlopen('https://nicolasbouliane.com')
    response_status = response.status  # 200, 204, etc (redirects are followed automatically)
except HTTPError as error:
    response_status = error.code  # 404, 500, etc

A few notes:
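HTTPError is a subclass of URLError. An HTTPError means the server answered with an error status, while a plain URLError means no HTTP response arrived at all (for example, the host name did not resolve or the connection was refused). A rough sketch of handling both cases is shown here. The get_status helper and its 10-second default timeout are hypothetical choices, and returning None when there is no response is just one possible convention.

```python
import urllib.request
from urllib.error import HTTPError, URLError


def get_status(url, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived.

    Hypothetical helper, not part of urllib.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except HTTPError as error:
        # The server answered, but with an error status (404, 500, etc)
        return error.code
    except URLError:
        # No HTTP response at all: DNS failure, refused connection, etc
        return None


# Example call (needs network access):
# status = get_status('https://nicolasbouliane.com')
```

The order of the except clauses matters: because HTTPError inherits from URLError, catching URLError first would swallow the HTTP errors too.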

Get response headers

For HTTP and HTTPS URLs, urllib.request.urlopen() returns an http.client.HTTPResponse object. You get all headers by calling response.getheaders(), or a single header by calling response.getheader(header_name).

import urllib.request

response = urllib.request.urlopen('https://nicolasbouliane.com')
headers = response.getheaders()
content_type = response.getheader('Content-Type')

print(headers)
# [('Content-Type', 'text/html; charset=utf-8'), ('Transfer-Encoding', 'chunked'), ...]

print(content_type)
# "text/html; charset=utf-8"

A few notes:
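getheaders() returns a list of (name, value) pairs rather than a dict. The same header can appear more than once (Set-Cookie is a common case), so converting the list with dict() would keep only the last value. If you want dict-style lookups, one possible approach is to group the values by lowercased header name. The group_headers helper is a hypothetical name, and the cookie values in the sample data are made up for illustration.

```python
from collections import defaultdict


def group_headers(header_pairs):
    """Group (name, value) pairs by lowercased header name (hypothetical helper)."""
    grouped = defaultdict(list)
    for name, value in header_pairs:
        grouped[name.lower()].append(value)
    return dict(grouped)


# Sample pairs shaped like the output of response.getheaders()
# (the cookie values are invented for this example)
pairs = [
    ('Content-Type', 'text/html; charset=utf-8'),
    ('Set-Cookie', 'a=1'),
    ('Set-Cookie', 'b=2'),
]

print(group_headers(pairs))
# {'content-type': ['text/html; charset=utf-8'], 'set-cookie': ['a=1', 'b=2']}
```

With a real response, you would call group_headers(response.getheaders()).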