How to Use a Proxy with Python Requests
Are you scraping data without a proxy? You’re likely already hitting roadblocks such as CAPTCHAs, IP bans, geo-restrictions, and more.
Scaling a web scraping operation is difficult if websites keep blocking you. To get around these blocking mechanisms and stay secure while extracting data, you need proxies.
In this tutorial, you’ll learn how to use Python’s requests library with a proxy server.
💡 For top-performing proxies that work efficiently with requests or any other library, check out the following article:
Prerequisites and Setup
This guide assumes you are familiar with Python and want to scrape content from the web using the requests module behind a proxy.
Requirements:
- Python installed: Make sure Python is installed. If not, you can download Python for your operating system.
- Basic Python knowledge: You should be comfortable with basic Python code. You can also check out the requests documentation to learn more.
- The requests library: First, check whether requests is installed by running:
pip freeze
If requests isn't listed, install it with:
pip install requests
- A proxy service: For the purposes of this guide, you'll be using Bright Data's proxies.
👉 Check out the different proxy types offered by Bright Data:
First, you'll look at a basic requests script that uses your local machine's IP. After that, you'll learn how to use proxies, set up sessions, and rotate proxies. Then you'll acquire an ISP proxy from Bright Data and integrate it into your script.
Finally, you’ll test the script to confirm whether you’re sending a request from Bright Data’s servers. The output you receive should reflect one of Bright Data’s IP addresses rather than your local machine’s.
Let’s begin.
How to Use Python Requests
Let's start by making a simple GET request using the requests library.
import requests
url = "https://httpbin.org/ip"
response = requests.get(url)
# Output the response in JSON format
print(response.json())
If all goes well, here’s what you should get back. This displays your origin IP address.
{'origin': '171.76.87.42'}
Now, let’s learn how to use a proxy to mask or change this IP.
How to Use a Proxy with Python Requests
To use a proxy with requests, you'll need to pass a proxies dictionary when making the request.
import requests
# Define proxies for HTTP and HTTPS
proxies = {
    'http': 'http://204.185.204.64:8080',
    'https': 'https://204.185.204.64:8080',
}
url = "https://httpbin.org/ip"
response = requests.get(url, proxies=proxies)
# Print the new IP address from the proxy
print(response.json())
Here, the proxies dictionary contains two entries, one for HTTP and one for HTTPS, both pointing to the same proxy server on port 8080. When you make the request, requests routes it through the applicable proxy, masking your real IP.
Now, you’ll notice that you get a different response back.
{'origin': '204.185.204.64'}
Now, the output shows the origin IP address coming from the proxy server, not your actual IP. Congratulations, you have proxies working!
How to Use Proxy Authentication
Typically, quality proxies from reputable providers require authentication. You can include your username and password directly in the proxy URL.
proxies = {
    'http': 'http://username:password@204.185.204.64:8080',
    'https': 'https://username:password@204.185.204.64:8080',
}
The syntax remains the same; simply include your credentials in the URL.
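Note that if your username or password contains special characters such as @ or :, they can break the proxy URL. Here's a minimal sketch that percent-encodes the credentials first using Python's standard library (the credentials shown are placeholders):
import requests
from urllib.parse import quote
# Placeholder credentials; percent-encode them so characters
# like '@' or ':' don't break the proxy URL
username = quote("user@example", safe="")
password = quote("p@ss:word", safe="")
proxies = {
    'http': f'http://{username}:{password}@204.185.204.64:8080',
    'https': f'https://{username}:{password}@204.185.204.64:8080',
}
response = requests.get("https://httpbin.org/ip", proxies=proxies)
print(response.json())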
Using Sessions with Proxies
To maintain a session across multiple requests, you can use the requests.Session() object.
import requests
# Create a session
session = requests.Session()
# Assign proxies to the session
session.proxies = {
    'http': 'http://204.185.204.64:8080',
    'https': 'https://204.185.204.64:8080',
}
url = "https://httpbin.org/ip"
response = session.get(url)
# Print the response
print(response.json())
Using a session allows you to persist settings (like proxies) across multiple requests without redefining them every time.
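Because the configuration lives on the session, every subsequent call reuses it, and the session can also reuse the underlying connection. A minimal sketch, continuing with the session from the snippet above:
# Both requests go through the proxy configured on the session;
# no proxies= argument is needed per call
first = session.get("https://httpbin.org/ip")
second = session.get("https://httpbin.org/headers")
print(first.json())
print(second.status_code)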
Setting Proxies as Environment Variables
If you’re frequently using the same proxy for multiple requests, it’s more efficient to define your proxies as environment variables rather than passing them in code every time. Here’s how to do that:
Set Proxy Environment Variables:
For Linux/macOS, use:
export HTTP_PROXY='http://204.185.204.64:8080'
export HTTPS_PROXY='https://204.185.204.64:8080'
For Windows:
set HTTP_PROXY=http://204.185.204.64:8080
set HTTPS_PROXY=https://204.185.204.64:8080
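If you're using PowerShell rather than the classic Command Prompt, the syntax differs slightly:
$env:HTTP_PROXY = 'http://204.185.204.64:8080'
$env:HTTPS_PROXY = 'https://204.185.204.64:8080'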
Now you don't need to hardcode the proxy in your script; you can fetch it from the environment instead, using the os module.
import requests
import os
# Fetch proxies from environment variables
http_proxy = os.getenv('HTTP_PROXY')
https_proxy = os.getenv('HTTPS_PROXY')
# Raise an error if they're not set
if not http_proxy or not https_proxy:
    raise EnvironmentError("HTTP_PROXY and HTTPS_PROXY environment variables must be set.")
# Build the proxies dictionary and use it as before
proxies = {
    'http': http_proxy,
    'https': https_proxy,
}
# ... rest of the code
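In fact, requests reads these environment variables on its own (the trust_env setting on a session defaults to True), so once they're exported you can even skip the proxies argument entirely:
import requests
# With HTTP_PROXY/HTTPS_PROXY set in the environment, requests
# picks them up automatically; no explicit proxies= argument needed
response = requests.get("https://httpbin.org/ip")
print(response.json())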
Rotating Proxies with Python Requests
If you’re scraping data from websites that frequently block or limit requests, rotating proxies is a good way to avoid detection. With rotating proxies, each request comes from a different IP address, reducing the chance of getting blocked.
Here’s how to rotate through a list of proxies:
import random
import requests
# List of proxies to rotate through
proxies_list = [
    "198.59.191.234:8080",
    "165.154.233.164:8080",
    "45.79.90.143:44554",
    "71.86.129.131:8080",
    "45.79.158.235:1080"
]
url = "https://httpbin.org/ip"
# Keep trying random proxies until one succeeds
while True:
    try:
        # Select a random proxy from the list
        proxy = random.choice(proxies_list)
        proxies = {
            "http": f"http://{proxy}",
            "https": f"https://{proxy}"
        }
        # Make the request using the chosen proxy; time out rather than hang
        response = requests.get(url, proxies=proxies, timeout=10)
        print(f"Response: {response.json()}")
        print(f"Proxy currently being used: {proxy}\n")
        break  # Stop once a request succeeds
    except Exception:
        print("Error, trying another proxy...\n")
Here’s what you should get back.
Response: {'origin': '165.154.233.164'}
Proxy currently being used: 165.154.233.164:8080
This script randomly selects a proxy from the list and retries with another proxy if one fails. It’s a practical approach for large-scale scraping tasks.
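You could take this a step further and drop proxies that fail so they aren't retried. A minimal sketch of that idea, reusing the proxies_list from above:
import random
import requests

def fetch_with_rotation(url, proxy_pool, timeout=10):
    """Try random proxies, discarding dead ones, until one succeeds."""
    candidates = list(proxy_pool)
    while candidates:
        proxy = random.choice(candidates)
        proxies = {"http": f"http://{proxy}", "https": f"https://{proxy}"}
        try:
            return requests.get(url, proxies=proxies, timeout=timeout)
        except requests.RequestException:
            candidates.remove(proxy)  # Don't retry a proxy that just failed
    raise RuntimeError("All proxies failed")

response = fetch_with_rotation("https://httpbin.org/ip", proxies_list)
print(response.json())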
Now that you know the basics of integrating proxies into your requests script, it's worth understanding, before jumping into a project, which proxies to avoid so your scraping operation stays secure and efficient.
Should You Use Free Proxies?
While free proxies may be appealing due to their zero cost, they come with significant drawbacks:
- Unreliable performance: Frequent downtime and slow speeds due to high usage.
- Poor security: Vulnerable to data breaches and malicious activities like traffic monitoring.
- Easy to detect and block: Often overused, increasing the chance of being blacklisted by websites.
- Lack of anonymity: Shared IP addresses reduce privacy and effectiveness for sensitive tasks.
For critical tasks like scraping large datasets or automating requests on sensitive websites, paid proxies are worth the investment. They provide better speed, reliability, and, most importantly, security.
Paid proxies from reputable providers ensure stable connections, higher success rates, and better anonymity — saving you from costly downtime or blocked requests.
💡 Pro Tip: Ensure your provider complies with data protection laws like GDPR and CCPA. Using a non-compliant proxy service can leave you with proxies that have poor reputations, leading to frequent blocks while scraping. Opt for a reputable provider like Bright Data.
Now that you know why you shouldn’t use free proxies, head over to the next section to learn how you can integrate a secure and reliable proxy from Bright Data.
How to Integrate Bright Data’s Proxy with Python Requests
Bright Data — Proxy Services
Bright Data is one of the most popular and reliable proxy providers offering residential, ISP, datacenter, and mobile proxies for web scraping and other automation tasks.
In the following steps, you will learn how to integrate Bright Data's proxy with Python's requests library.
Step 1: Sign Up for a Bright Data Account
First, sign up for a Bright Data account if you haven’t already.
For the purpose of testing, you can just opt for the free trial.
💡Presently, Bright Data is offering to match up to $500 of your initial deposit. Check out their pricing plans for more details:
After signing up, log in and then click on ‘User Dashboard’ (highlighted in red).
Bright Data — Homepage
From the dashboard, click on 'Get proxy products' under 'Proxies & Scraping Infrastructure'.
Bright Data — Dashboard (Select Product)
You’ll also need to set up a proxy zone, which is used to manage your proxy configurations, including user credentials.
Your zones should appear as shown in the screenshot below:
Bright Data — Dashboard (Zones)
For this guide, we’ll be going with the ISP zone.
Step 2: Get Your Proxy Credentials
Once you’ve created the zone, Bright Data provides you with the following details:
- Proxy endpoint (IP and port)
- Username and password
The blurred fields represent the credentials that you’ll need to use for connecting to Bright Data’s proxy servers.
Bright Data — Dashboard (Proxy Access Parameters)
Next, you can configure the proxy based on your requirements. For this guide, you’ll only be using one IP as shown below:
Bright Data — Dashboard (Proxy Configuration)
Click on the Download button (highlighted in red) to download the list of IPs allocated to you. Each entry contains your proxy URL, port, username, and password separated by colons (:), in the format proxy_url:port:username:password.
Bright Data — Dashboard (Download Allocated IP List and Advanced Settings)
You can also apply further customizations, such as country targeting (for a specific region), a caching proxy, 100% uptime, and more.
After you’re happy with the settings, hit ‘Save’.
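Since the downloaded list follows the proxy_url:port:username:password format, you can turn it into requests-ready proxy URLs with a few lines. A minimal sketch, assuming the list was saved to a file named brightdata_ips.txt (a hypothetical filename):
# Parse the downloaded IP list into requests-style proxy URLs.
# Assumes one entry per line: proxy_url:port:username:password
proxy_urls = []
with open("brightdata_ips.txt") as f:  # hypothetical filename
    for line in f:
        line = line.strip()
        if not line:
            continue
        host, port, username, password = line.split(":")
        proxy_urls.append(f"http://{username}:{password}@{host}:{port}")
print(proxy_urls[0])  # Sanity check the first entry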
Step 3: Use Bright Data’s Proxy in Python
Now that you have the proxy details and credentials, you can set them up in your Python script. Below is a Python script that integrates Bright Data's proxy with the requests library.
import requests
# Step 1: Your Bright Data proxy credentials
proxy_username = "your_username"
proxy_password = "your_password"
# Step 2: These host and port values must be included for your proxies
proxy_host = "brd.superproxy.io"
proxy_port = "22225"
# Step 3: Construct your proxy URL from the above values
proxy_url = f"http://{proxy_username}:{proxy_password}@{proxy_host}:{proxy_port}"
# Step 4: Define a proxies dictionary for both HTTP and HTTPS
proxies = {
    "http": proxy_url,
    "https": proxy_url,
}
# Target URL to scrape
url = "https://httpbin.org/ip"
# Step 5: Send the request using Bright Data's proxy
response = requests.get(url, proxies=proxies)
# Output the response
print(response.json())
Run the script, and you should see your proxy’s IP in the output, confirming that the request went through Bright Data’s proxy.
{'origin': 'your_bright_data_proxy_ip'}
And of course, if you want to take this further, you could get a set of proxies from Bright Data suited to your use case, configure each one as we did here, store them in a dictionary or in environment variables, and rotate through them with random as before.
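For illustration, here's a minimal sketch of that idea, assuming you've collected several authenticated Bright Data proxy URLs (the entries below are placeholders):
import random
import requests
# Placeholder Bright Data proxy URLs; substitute your real credentials
brightdata_proxies = [
    "http://username1:password1@brd.superproxy.io:22225",
    "http://username2:password2@brd.superproxy.io:22225",
]
proxy_url = random.choice(brightdata_proxies)
proxies = {"http": proxy_url, "https": proxy_url}
response = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=10)
print(response.json())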
But that, along with other Bright Data features like geo-targeting and advanced session controls, is best handled with their open-source Proxy Manager.
💡Manage proxies with ease using the Proxy Manager:
Conclusion
Using proxies with Python's requests library is a powerful way to ensure anonymity, bypass restrictions, and protect your IP when web scraping or making multiple HTTP requests. From basic proxy setup to authentication and rotating proxies, the techniques in this tutorial cover a wide range of scenarios.
You can further bolster your connection's security and reliability by integrating Bright Data into your requests script.
A top-tier proxy service like Bright Data will make your web scraping operation more robust, reduce the risk of IP blocking, and boost your success rate, making your operation more scalable.
Before committing to a plan, take advantage of the free trial to see if the service is the right fit for you.