requests - Documentation

What is the Requests Library?

Requests is a popular and elegant Python HTTP library. It allows you to send HTTP requests (like GET, POST, PUT, DELETE, etc.) easily and efficiently. It handles the complexities of making HTTP requests, such as encoding data, handling headers, and managing cookies, making it much simpler than using Python’s built-in urllib library. It’s designed to be user-friendly and intuitive, letting you focus on the application logic rather than the low-level details of HTTP communication.

Why Use Requests?

Requests offers several key advantages over lower-level alternatives:

Installation and Setup

Requests is easily installed using pip, the Python package installer:

pip install requests

This command will download and install the Requests library and its dependencies. No further configuration is usually required.

Basic Request Example

The simplest way to make a GET request is:

import requests

response = requests.get('https://www.example.com')

# Check the status code (200 means success)
print(response.status_code)

# Access the content of the response
print(response.text)  #Get the response body as text
print(response.json()) #Get the response body as a json object (if applicable)

This code snippet first imports the requests library. Then, it makes a GET request to https://www.example.com. The response object contains the server’s response, including the status code (e.g., 200 for OK) and the content. The example then prints the status code and the response text. If the response is JSON, you can use response.json() for convenient parsing. Remember to handle potential exceptions (like requests.exceptions.RequestException) in production code.

Making GET Requests

Simple GET Requests

The most basic GET request is made using the requests.get() function, passing the URL as an argument:

import requests

response = requests.get('https://www.example.com')

print(response.status_code)  # Check the status code (e.g., 200)
print(response.text)       # Access the response content as text

This sends a GET request to the specified URL. The response object contains details about the request, including the status code and the response content. Always check the status code to ensure the request was successful (200 OK is a common success code).

Handling Query Parameters

Query parameters are appended to the URL after a question mark (?). They are used to pass additional information to the server. Requests handles these easily:

import requests

params = {'key1': 'value1', 'key2': 'value2'}
response = requests.get('https://api.example.com/data', params=params)

print(response.url) # Print the full URL with query parameters
print(response.json()) # Parse JSON response if applicable

The params dictionary is automatically converted into a query string.

Working with Headers

HTTP headers provide additional information about the request. You can set custom headers using the headers parameter:

import requests

headers = {'User-Agent': 'My Custom User Agent', 'Accept': 'application/json'}
response = requests.get('https://api.example.com/data', headers=headers)

print(response.headers) # Access response headers
print(response.json()) # Parse JSON response if applicable

This example sets a custom User-Agent header, which helps identify your application to the server. Setting the Accept header specifies the desired response content type.

Accessing Response Data

The response object provides several ways to access the data:

Handling Different Content Types

Requests automatically handles various content types. However, you might need to specify the expected content type using the headers parameter’s Accept header or handle the response accordingly (e.g. by checking the Content-Type header in the response). For example, to ensure you get JSON back:

import requests

headers = {'Accept': 'application/json'}
response = requests.get('https://api.example.com/data', headers=headers)

try:
    data = response.json()
    print(data)
except requests.exceptions.JSONDecodeError:
    print("Response is not valid JSON")

Error Handling and Status Codes

Always check the response.status_code to ensure the request was successful. Non-2xx status codes indicate errors. You can use try...except blocks to handle potential exceptions:

import requests

try:
    response = requests.get('https://api.example.com/data')
    response.raise_for_status()  # Raise an exception for bad status codes (4xx or 5xx)
    print(response.json())
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

response.raise_for_status() conveniently raises an exception for bad status codes, simplifying error handling. The requests.exceptions.RequestException handles various network and HTTP errors.

Making POST Requests

Sending Data with POST Requests

POST requests are used to send data to the server. The data is typically sent in the request body. The basic structure is similar to GET requests, but using requests.post():

import requests

data = {'key1': 'value1', 'key2': 'value2'}
response = requests.post('https://api.example.com/submit', data=data)

print(response.status_code)
print(response.text)

This sends a POST request with the data provided in the data dictionary. The server will process this data.

Different Data Encoding Types (form-data, json, etc.)

Requests supports different ways to encode the data sent in the request body.

import requests

data = {'key1': 'value1', 'key2': 'value2'}
response = requests.post('https://api.example.com/submit', json=data)

print(response.status_code)
print(response.json()) #Parse the JSON response.

This sends the data as a JSON object. The server should expect JSON data.

Handling POST Responses

Handling POST responses is similar to GET requests. Check the response.status_code to verify success. Access the response content using response.text, response.content, or response.json() depending on the expected content type. Remember to handle potential errors using try...except blocks. For example:

import requests

try:
    response = requests.post('https://api.example.com/submit', json={'key': 'value'})
    response.raise_for_status() # Raise HTTPError for bad responses (4xx or 5xx)
    print(response.json())
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

Working with Files in POST Requests

To upload files, use the files parameter with a dictionary where keys are the field names and values are file-like objects (e.g., opened files or file paths):

import requests

files = {'file': open('my_file.txt', 'rb')}  # 'rb' for binary read mode
response = requests.post('https://api.example.com/upload', files=files)

print(response.status_code)
print(response.text)

This uploads my_file.txt. Remember to close the file after the request using files['file'].close(). Alternatively, you can use with open(...) as f: to manage file closing automatically. The server-side needs to be configured to handle file uploads. You might also need to adjust the Content-Type header appropriately, depending on the server’s requirements. For example, a multipart/form-data content type is common for file uploads. Requests usually handles this automatically when using the files parameter.

Handling Responses

Response Status Codes

HTTP status codes indicate the outcome of a request. The response.status_code attribute provides this information. Codes in the 2xx range generally indicate success, while 4xx codes represent client-side errors (like 404 Not Found), and 5xx codes indicate server-side errors (like 500 Internal Server Error). Always check the status code to ensure your request was successful. For example:

import requests

response = requests.get('https://www.example.com')

if response.status_code == 200:
    print("Request successful!")
elif response.status_code == 404:
    print("Not Found")
else:
    print(f"Request failed with status code: {response.status_code}")

Accessing Response Headers

Response headers provide additional information about the response. Access them using response.headers, which is a dictionary-like object:

import requests

response = requests.get('https://www.example.com')

content_type = response.headers.get('Content-Type')
print(f"Content-Type: {content_type}")

server = response.headers.get('Server')
print(f"Server: {server}")

response.headers.get('Header-Name') is used to safely access a header, returning None if the header isn’t present.

Working with Response Content (text, json)

The response content can be accessed in several ways:

import requests

response = requests.get('https://api.example.com/data')

if response.status_code == 200:
    try:
        data = response.json()
        print(data)
    except requests.exceptions.JSONDecodeError:
        print("Invalid JSON response")

Handling Errors and Exceptions

Requests raises exceptions for various errors, including network issues and HTTP errors. Use try...except blocks to handle these:

import requests

try:
    response = requests.get('https://api.example.com/data')
    response.raise_for_status() #Raises HTTPError for bad responses (4xx or 5xx)
    #Process the successful response here
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")
except requests.exceptions.HTTPError as e:
    print(f"HTTP Error: {e}")
except requests.exceptions.ConnectionError as e:
    print(f"Connection Error: {e}")
except requests.exceptions.Timeout as e:
    print(f"Timeout Error: {e}")

response.raise_for_status() is helpful for explicitly raising exceptions for bad status codes.

Checking for Redirects

Requests automatically handles redirects (3xx status codes). To access information about redirects:

import requests

response = requests.get('https://shorturl.at/somelink', allow_redirects=True) #allow_redirects is True by default

print(response.url) #The final URL after redirects
print(response.history) #List of intermediate responses during redirects.

Setting allow_redirects=False will prevent automatic redirection and allow handling redirects manually.

Working with Cookies

Requests handles cookies automatically. To access or manipulate cookies:

import requests

session = requests.Session() # Use a session for managing cookies across requests.

response = session.get('https://www.example.com/login', data={'username':'user', 'password':'pass'})
cookies = session.cookies

response2 = session.get('https://www.example.com/profile') #Cookies from the login will be sent automatically
#access cookies using cookies.get_dict() etc.

print(response2.text)

Using a requests.Session() object is recommended for managing cookies across multiple requests. You can directly access and modify cookies through the session.cookies attribute for more advanced control.

Advanced Usage

Authentication Methods (Basic, OAuth, etc.)

Requests supports various authentication methods.

import requests

response = requests.get('https://api.example.com/data', auth=('username', 'password'))
print(response.text)
import requests

access_token = "YOUR_ACCESS_TOKEN"  #Obtain this through the OAuth flow.
headers = {'Authorization': f'Bearer {access_token}'}
response = requests.get('https://api.example.com/data', headers=headers)
print(response.text)

The specific implementation depends on the OAuth provider.

Using Proxies

To use a proxy server, use the proxies parameter with a dictionary specifying the proxy for each protocol (HTTP and HTTPS):

import requests

proxies = {
    'http': 'http://user:password@proxy.example.com:8080',
    'https': 'http://user:password@proxy.example.com:8080',
}

response = requests.get('https://www.example.com', proxies=proxies)
print(response.text)

Replace placeholders with your proxy server details. If you don’t need authentication, omit the user:password@ part.

Setting Timeouts

To prevent requests from hanging indefinitely, set timeouts using the timeout parameter (in seconds):

import requests

try:
    response = requests.get('https://www.example.com', timeout=5)  # 5-second timeout
    print(response.text)
except requests.exceptions.Timeout:
    print("Request timed out")
except requests.exceptions.RequestException as e:
    print(f"An error occurred: {e}")

You can specify a tuple for separate connect and read timeouts: timeout=(connect_timeout, read_timeout).

Handling SSL Certificates

For self-signed or untrusted certificates, you may need to disable SSL verification or provide a custom certificate bundle:

import requests

# Disable SSL verification (INSECURE - ONLY FOR DEVELOPMENT/TESTING!)
response = requests.get('https://selfsigned.example.com', verify=False)

#Use a custom CA bundle (more secure than verify=False)
response = requests.get('https://selfsigned.example.com', verify='/path/to/my/ca.pem')

print(response.text)

Warning: Disabling SSL verification (verify=False) is highly discouraged in production environments as it compromises security. Use a custom CA bundle whenever possible for secure handling of self-signed certificates.

Working with Sessions

requests.Session() objects persist cookies, headers, and other settings across multiple requests, making them efficient for interacting with websites or APIs that require authentication or state management.

import requests

session = requests.Session()
session.auth = ('username', 'password')

response1 = session.get('https://api.example.com/login')
response2 = session.get('https://api.example.com/data') # Authentication from response1 is reused.

print(response2.text)

The session automatically handles cookies and other persistent data.

Using Custom Adapters

Requests uses adapters to handle connections. You can create custom adapters for special needs (e.g., handling specific protocols or proxies):

import requests

class MyAdapter(requests.adapters.HTTPAdapter):
    def send(self, request, **kwargs):
        # Add custom logic here (e.g., modify the request before sending)
        return super().send(request, **kwargs)

session = requests.Session()
session.mount('http://', MyAdapter()) #Mount the adapter for http:// URLs
session.mount('https://', MyAdapter())#Mount the adapter for https:// URLs


response = session.get('https://www.example.com')
print(response.text)

Custom adapters provide powerful control over the underlying HTTP connection handling. This is advanced usage and requires a good understanding of HTTP and Requests’ internals.

Best Practices and Security

Security Considerations

Error Handling and Logging

Example using Python’s logging module:

import logging
import requests

logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

try:
    response = requests.get('https://api.example.com/data')
    response.raise_for_status()
    logging.info(f"Request successful. Status code: {response.status_code}")
    #Process successful response
except requests.exceptions.RequestException as e:
    logging.error(f"Request failed: {e}")

Efficient Request Handling

Rate Limiting and Throttling

Many APIs implement rate limiting to prevent abuse. If you exceed the rate limit, your requests might be temporarily blocked.

Example of adding a delay to handle rate limiting:

import time
import requests

rate_limit = 1  # requests per second
time_between_requests = 1/rate_limit

for i in range(10):
    response = requests.get('https://api.example.com/data')
    if response.status_code == 200:
        #Process the response
        pass
    else:
        print(f"Request failed with status code: {response.status_code}")
    time.sleep(time_between_requests)

Working with Different APIs

RESTful APIs

RESTful APIs are the most common type of web API. They use standard HTTP methods (GET, POST, PUT, DELETE) to interact with resources. Requests is well-suited for working with RESTful APIs.

Example: Fetching data from a REST API using a GET request:

import requests

url = "https://api.example.com/users/123"
response = requests.get(url)

if response.status_code == 200:
    data = response.json()
    print(data)
else:
    print(f"Error: {response.status_code}")

This example makes a GET request to retrieve user data. The response is assumed to be JSON. Error handling is essential; always check the status code. For POST, PUT, and DELETE requests, use the appropriate requests method (requests.post, requests.put, requests.delete), providing the data in the request body as needed (often as JSON using the json parameter). Remember to handle different response status codes and potential errors. The specific details depend on the API’s documentation.

GraphQL APIs

GraphQL APIs use a query language to specify exactly the data needed. Requests can be used with GraphQL by sending a POST request with the query in the request body. You’ll typically need to handle the JSON response appropriately.

Example: Making a GraphQL query:

import requests

url = "https://api.example.com/graphql"
query = """
    query GetUser($id: ID!) {
        user(id: $id) {
            id
            name
            email
        }
    }
"""
variables = {"id": "123"}

response = requests.post(url, json={"query": query, "variables": variables})

if response.status_code == 200:
    data = response.json()
    print(data['data']['user'])
else:
    print(f"Error: {response.status_code}")

This example sends a GraphQL query to fetch user data. The query and variables are sent as JSON. The response contains the data in the data field. Error handling and checking the response structure are vital. Always consult the GraphQL API’s documentation for specific query structures and response formats.

Other API types

Requests can be used with other API types as well, though the specifics will depend on the API’s communication protocol and data format. This might include:

For any API type beyond the most common RESTful APIs, referring to the API’s specific documentation is crucial to understand the correct request format (headers, body, authentication, etc.) and how to correctly parse the response. Requests provides the fundamental HTTP capabilities, but often, additional libraries or techniques are needed for seamless interaction with more specialized APIs.

Appendix

Glossary of Terms

Common Error Messages and Solutions