How to Use a Proxy in Puppeteer
To use a proxy in Puppeteer, launch the browser with the --proxy-server=HOST:PORT option so that all requests go through the proxy’s address. If the proxy requires authentication, provide the username and password via page.authenticate() before loading any pages. This ensures Puppeteer’s traffic is routed through the proxy’s IP, masking your real address.
What Is a Proxy in Puppeteer?
A proxy acts as a middleman between your Puppeteer script and the target website. In Puppeteer, using a proxy means your browser traffic is routed through a different server/IP address. The target site sees the proxy’s IP instead of yours, which hides your real address and can reduce IP-based blocks and rate limits from anti-scraping measures.
Setting Up a Proxy in Puppeteer
To configure Puppeteer to use a proxy server, follow these steps:
- Obtain a Proxy Server URL: Get the address (IP or hostname and port) of a proxy. This can be a residential proxy from a provider like Proxys.io or any HTTP/SOCKS proxy you have. For example, Proxys.io provides proxies in the format IP:Port:Username:Password which you can find in your account dashboard.
- Launch Puppeteer with the Proxy: Start a Puppeteer browser instance and include the proxy server address in the launch options. Use the Chrome flag --proxy-server= with your proxy’s host and port. Be sure to specify the correct protocol (e.g. use http:// for HTTP or socks5:// for SOCKS5 proxies) in the URL.
- Verify the Proxy is Working: Once the browser is launched with the proxy, all page requests should go through the proxy. You can navigate to a site that shows your IP (such as https://httpbin.org/ip) and confirm that the IP reported matches the proxy’s IP.
For example, here is a simple Puppeteer script using a proxy:
const puppeteer = require('puppeteer');

(async () => {
  // Proxy server address (HTTP proxy in this example)
  const proxyServer = 'http://123.45.67.89:8000'; // replace with your proxy IP:Port
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxyServer}`] // launch Chrome with proxy
  });
  const page = await browser.newPage();
  await page.goto('https://httpbin.org/ip'); // open a page to check IP
  const bodyText = await page.evaluate(() => document.body.textContent);
  console.log(bodyText); // should show the proxy's IP address
  await browser.close();
})();
In the code above, we launch Puppeteer with a proxy URL. The httpbin.org/ip page returns JSON containing the origin IP, which should be the proxy’s IP (confirming that Puppeteer is routing traffic through the proxy).
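The manual check above can also be automated. The following is a minimal sketch, assuming the page body has the shape {"origin": "..."} and that the proxy exits from the same IP you connect to. The helper name originMatchesProxy is hypothetical and not part of Puppeteer.

```javascript
// Hypothetical helper: compare the IP reported by httpbin.org/ip with the proxy host.
function originMatchesProxy(bodyText, proxyServer) {
  let origin;
  try {
    // Assumed response shape: {"origin": "123.45.67.89"}
    origin = JSON.parse(bodyText).origin;
  } catch (err) {
    return false; // body was not usable JSON, e.g. a proxy error page
  }
  if (typeof origin !== 'string') return false;

  // Strip an optional scheme and the port to get the bare proxy host
  const proxyHost = proxyServer.replace(/^[a-z0-9]+:\/\//i, '').split(':')[0];

  // The origin field may list several comma-separated addresses
  return origin.split(',').map((ip) => ip.trim()).includes(proxyHost);
}
```

In the script above you could call originMatchesProxy(bodyText, proxyServer) right after page.evaluate(). With rotating or gateway-style proxies the exit IP usually differs from the gateway address, so in that case it makes more sense to check that the reported IP differs from your own.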
Proxy Authentication (Username/Password)
Many premium proxies, especially residential proxies, require a username and password. For instance, Proxys.io’s residential proxies come with login credentials. Chrome’s --proxy-server flag does not accept credentials embedded in the proxy URL (e.g. http://user:pass@IP:port), so you must handle proxy authentication within Puppeteer.
To use an authenticated proxy in Puppeteer, launch the browser with the proxy host and port as before, then call page.authenticate() to supply the credentials. Make sure to do this before navigating to any page. Below is an example using a Proxys.io residential proxy (replace the placeholder values with your actual proxy details):
const puppeteer = require('puppeteer');

(async () => {
  // Proxys.io residential proxy details (example)
  const proxyURL = 'http://123.45.67.89:8000'; // Proxy IP and port
  const proxyUsername = 'your_username'; // Proxy auth username
  const proxyPassword = 'your_password'; // Proxy auth password
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${proxyURL}`]
  });
  const page = await browser.newPage();
  // Provide proxy authentication credentials
  await page.authenticate({
    username: proxyUsername,
    password: proxyPassword
  });
  // Now navigate to a site through the proxy
  await page.goto('https://httpbin.org/ip');
  const ip = await page.evaluate(() => document.body.textContent);
  console.log(ip); // should show proxy IP if auth was successful
  await browser.close();
})();
In this script, we first launch the browser with the proxy’s address (without credentials in the URL). Then page.authenticate() supplies the username/password for the proxy server. After that, all requests from page.goto (and other browsing actions) will use the proxy as long as the credentials are correct.
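Note: If the credentials are wrong or missing, the proxy will respond with HTTP 407 “Proxy Authentication Required” and Puppeteer’s navigation will typically fail with a network error such as net::ERR_HTTP_RESPONSE_CODE_FAILURE (the exact error code can vary with the Chrome version and with whether the target is HTTP or HTTPS). Always double-check that your proxy username and password are valid.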
Tip: If your proxy provider allows it, you can whitelist your IP address to skip authentication altogether. For example, Proxys.io lets you authorize your computer’s IP in their dashboard, after which you can connect to the proxy without credentials. In such cases, you would launch Puppeteer with --proxy-server=IP:PORT and omit the page.authenticate() step, since the proxy recognizes your IP.
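Since providers often hand out proxies as a single IP:Port:Username:Password string, a small helper can split it into the two pieces Puppeteer needs. This is a sketch under assumptions: the function name parseProxyString is hypothetical, the scheme defaults to http, and a two-part IP:Port string is treated as an IP-whitelisted proxy with no credentials.

```javascript
// Hypothetical helper: split "IP:Port" or "IP:Port:Username:Password"
// into a --proxy-server value and optional page.authenticate() credentials.
function parseProxyString(proxyString, scheme = 'http') {
  const parts = proxyString.trim().split(':');
  if (parts.length === 3 || parts.length < 2) {
    throw new Error('Expected IP:Port or IP:Port:Username:Password');
  }
  const [host, port, username] = parts;
  // Re-join the remainder so a password that contains ":" survives the split
  const password = parts.slice(3).join(':');
  return {
    server: `${scheme}://${host}:${port}`,
    // null means no credentials were given (e.g. the IP is whitelisted)
    credentials: parts.length >= 4 ? { username, password } : null,
  };
}
```

Pass the returned server to --proxy-server, and call page.authenticate(credentials) only when credentials is not null.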
Rotating Proxies in Puppeteer
If you make many requests in a row, using a single proxy IP can still lead to blocks. To avoid this, you can use rotating proxies, which means switching the proxy IP periodically or per request. By rotating through a pool of IP addresses, Puppeteer can mimic multiple users and reduce the chance of detection or rate-limiting by the target site.
There are two common ways to implement proxy rotation in Puppeteer:
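- Manual Rotation with a Proxy List: Obtain a list of proxy server addresses (for example, a list of residential IPs from your provider). In your script, you can randomly pick one proxy from the list for each browser launch. The --proxy-server flag is fixed for the lifetime of a browser instance, so switching to a different IP this way means launching a new browser. For instance: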
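const puppeteer = require('puppeteer');

(async () => {
  // Placeholder addresses; replace with proxies from your provider
  const proxies = [
    'http://111.11.111.11:8000',
    'http://222.22.222.22:8000',
    'http://333.33.333.33:8000',
    // ... more proxy addresses
  ];
  // Pick one proxy at random for this browser launch
  const randomProxy = proxies[Math.floor(Math.random() * proxies.length)];
  const browser = await puppeteer.launch({
    args: [`--proxy-server=${randomProxy}`]
  });
  // ... then use browser as usual
  await browser.close();
})();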
The code above selects a random proxy from the list and launches Puppeteer with it. On the next run (or next iteration), a different IP is likely to be chosen. By cycling proxies this way, each browser session can go out from a new IP address. Note: Managing a large list of free proxies can be error-prone (they often die or get banned quickly), so using a reliable proxy list from a provider is recommended.
- Using a Rotating Proxy Service: Some providers offer a single endpoint that automatically rotates IPs for you. For example, a service might give you a domain like proxy.yourprovider.com:PORT which internally rotates through millions of residential IPs. In this case, you simply use that endpoint as your --proxy-server (and still call page.authenticate() if credentials are required). The provider then assigns the exit IP, usually a new one per connection or per session depending on the plan, so you don’t have to manage the list of IPs yourself. Proxys.io offers both static and rotating residential proxy plans, so you can choose a rotating plan to get a new IP on each connection or session automatically.
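For the manual approach, random selection can pick the same proxy twice in a row and keeps returning proxies that have stopped working. One possible alternative is a small round-robin pool that skips proxies you have marked as failed. The ProxyPool class below is a hypothetical sketch, not part of Puppeteer, and the hostnames are placeholders.

```javascript
// Hypothetical ProxyPool: round-robin rotation that skips failed proxies.
class ProxyPool {
  constructor(proxies) {
    if (!Array.isArray(proxies) || proxies.length === 0) {
      throw new Error('ProxyPool needs at least one proxy');
    }
    this.proxies = [...proxies];
    this.failed = new Set();
    this.index = 0;
  }

  // Return the next proxy in order, skipping any marked as failed
  next() {
    for (let i = 0; i < this.proxies.length; i++) {
      const proxy = this.proxies[this.index];
      this.index = (this.index + 1) % this.proxies.length;
      if (!this.failed.has(proxy)) return proxy;
    }
    throw new Error('All proxies are marked as failed');
  }

  // Call this when a launch or navigation through the proxy fails
  markFailed(proxy) {
    this.failed.add(proxy);
  }
}

// Placeholder addresses; replace with proxies from your provider
const pool = new ProxyPool([
  'http://proxy-a.example:8000',
  'http://proxy-b.example:8000',
]);
const proxyForThisLaunch = pool.next();
```

Each call to pool.next() gives the value for the next --proxy-server launch argument. Calling pool.markFailed(proxy) in a catch block takes a dead proxy out of rotation for the rest of the run.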
Using rotating residential proxies is especially useful for web scraping at scale, because spreading requests across many IPs makes it harder for a site to rate-limit or block you by address. With a high-quality residential proxy network, each request appears to come from a real user in a different location.
Choosing the Best Proxies for Puppeteer
Not all proxies are equal. Free proxies or public lists often have poor reliability and are already blocked by many websites. For serious Puppeteer usage, you’ll want premium proxies. In particular, residential proxies are generally the most effective for scraping and automation, because their IP addresses are assigned by real ISPs to home internet connections, so they appear as ordinary user traffic. Websites are less likely to flag or ban residential IPs compared to datacenter IPs.
Residential proxies come in two flavors: static residential (the same IP each time, useful if you need a consistent identity) and rotating residential (IP changes on every connection or request). Rotating residential proxies provide the highest anonymity and success rates for large scraping jobs, while static ones might be useful for tasks like managing a long-term session. All Proxys.io residential proxies support both HTTP and SOCKS5 protocols, so they can be used with Puppeteer without issues.
When choosing a proxy provider for Puppeteer, look for a large pool of residential IPs across various regions, high uptime, and integration support for automation tools. For example, Proxys.io offers millions of residential IPs in 70+ geolocations, with easy integration steps (as shown above) to get you started quickly. A high-quality residential proxy network ensures your Puppeteer scripts can scrape data with minimal blocks.
Troubleshooting Puppeteer Proxy Issues
Even with the correct setup, you may encounter some issues when using proxies in Puppeteer. Here are a few tips to troubleshoot common problems:
- Test Without Proxy: If your Puppeteer script isn’t working with the proxy, try running it without the proxy to see if the issue lies with your code or the proxy. If it works without the proxy but fails with it, double-check the proxy address and launch arguments. Make sure you passed the --proxy-server option correctly (and the proxy is live and reachable). Typos in the host or port can cause connection failures.
- Authentication Errors (407): An HTTP 407 error means the proxy authentication failed. Ensure that the username and password you provided are correct (and that your account with the proxy provider is active). If you’re not using page.authenticate() for a proxy that requires it, the request will be denied.
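- Correct Proxy Protocol: Ensure you used the proper protocol in the proxy URL. For example, if you have a SOCKS5 proxy, the address should start with socks5:// rather than http://. If the scheme is omitted, Chrome treats the proxy as an HTTP proxy. Using the wrong scheme typically causes connection errors or timeouts rather than a working proxy session. Also note that Chrome does not support username/password authentication for SOCKS5 proxies, so use the HTTP protocol or IP whitelisting when the proxy requires credentials.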
- IP Whitelist Issues: If you have IP authentication set up but the proxy still isn’t working, verify that your current machine’s IP is indeed whitelisted with the provider. A change in your network IP might require updating the whitelist in the proxy dashboard.
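Several of the problems above (typos in the host or port, a wrong scheme, credentials left in the URL) can be caught before the browser launches. Below is a minimal sketch of such a pre-flight check. The function name validateProxyServer and the list of accepted schemes are assumptions for illustration, and the pattern does not handle IPv6 literals.

```javascript
// Hypothetical pre-flight check for a --proxy-server value.
// Assumption: only these schemes are accepted; adjust the list for your setup.
const ALLOWED_SCHEMES = ['http', 'https', 'socks4', 'socks5'];

function validateProxyServer(input) {
  // Expect [scheme://]host:port with no credentials, path, or whitespace
  const pattern = /^(?:([a-z0-9]+):\/\/)?([^\s:\/@]+):(\d{1,5})$/i;
  const match = pattern.exec(String(input).trim());
  if (!match) {
    return { ok: false, reason: 'expected [scheme://]host:port without credentials' };
  }
  const scheme = (match[1] || 'http').toLowerCase(); // no scheme means HTTP proxy
  const host = match[2];
  const port = Number(match[3]);
  if (!ALLOWED_SCHEMES.includes(scheme)) {
    return { ok: false, reason: `unsupported scheme: ${scheme}` };
  }
  if (port < 1 || port > 65535) {
    return { ok: false, reason: `port out of range: ${port}` };
  }
  // Normalized value that can be passed to --proxy-server
  return { ok: true, server: `${scheme}://${host}:${port}` };
}
```

If the result has ok set to false, log reason and stop before calling puppeteer.launch(). Otherwise pass server to the --proxy-server argument.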
By following the above guidelines, you should be able to successfully route Puppeteer through a proxy. Setting up a proxy in Puppeteer is straightforward: launch with the proxy address and handle any credentials as needed. With residential proxies in particular, you can significantly improve your web scraping success rate, as these proxies make your automated traffic look like it’s coming from real users. Using a trusted provider like Proxys.io for residential proxies will ensure you have a reliable pool of IPs to work with, helping your Puppeteer scripts run smoothly without getting blocked.