How and why to use proxies with any API

This article explains how to use proxies when working with APIs. We will show how to add a proxy to your code, for example to change your apparent geolocation.

What are APIs and why do you need them?

An API (application programming interface) is like a menu in a restaurant. Instead of cooking yourself, you choose from the menu and the kitchen (the service behind the API) does all the work.

APIs allow programs to share data and functions. For example, an online store can process payments through APIs without needing a deep understanding of how payment systems work.

 

Why use proxies to work with APIs?

Proxies help in several ways:

  • Preserving anonymity. A proxy hides your IP address from the API server, which protects your personal information and makes your requests appear to come from the proxy's location.
  • Avoiding restrictions. Some APIs limit the number of requests from a single IP. Proxies let you distribute requests across several IP addresses and stay within such limits.

Steps for working with APIs and proxies

API selection and code preparation

In our example, we use an API for real-time parsing of Amazon product data. We took it from RapidAPI, a library of different APIs where you can find an API for almost any task. Many of them are available for free.

How to add a proxy to the code

We will use Python and the requests library.

How to write proxy data in code

Each proxy is written as a single string and then converted into the form that the requests library expects.

Proxy format

We write proxies in the format:

IP:port:login:password

For example:

181.214.117.124:3686:user94438:nyp6os
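
Because every later step depends on this string being well formed, it can help to validate it up front. Below is a minimal sketch of such a check; the Proxy class and the parse_proxy function are illustrative names that are not part of the original script, and the rule of letting extra colons fall into the password is an assumption.

```python
from typing import NamedTuple


class Proxy(NamedTuple):
    """One proxy entry; the class and field names are illustrative."""
    ip: str
    port: int
    login: str
    password: str


def parse_proxy(line: str) -> Proxy:
    """Parse 'IP:port:login:password' and fail early on malformed input."""
    # Split into at most 4 fields, so any extra colons stay in the password
    parts = line.strip().split(":", 3)
    if len(parts) != 4 or not all(parts):
        raise ValueError(f"expected IP:port:login:password, got {line!r}")
    ip, port, login, password = parts
    if not port.isdecimal() or not 1 <= int(port) <= 65535:
        raise ValueError(f"invalid port in {line!r}")
    return Proxy(ip, int(port), login, password)


example = parse_proxy("181.214.117.124:3686:user94438:nyp6os")
print(example.ip, example.port)
```

A malformed entry then raises ValueError when the list is loaded rather than failing later in the middle of a request.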

Adding a proxy to the code

In the code, you need to add a list of proxies and a function to convert the data into a format that the requests library understands. Here are the steps to do it:

Create a list of proxies

First, create a list of proxy strings in the IP:port:login:password format. For example:
proxy_list = [
    "181.214.117.124:3686:user94438:nyp6os",
    "109.196.175.126:4800:user94438:nyp6os"
]

Create a function for proxy conversion

Create a function get_proxy_dict that converts a proxy string into a dictionary suitable for use with requests. Function example:

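def get_proxy_dict(proxy):
    # Split "IP:port:login:password" into its four fields
    ip, port, user, password = proxy.split(':')
    # The scheme says how to connect to the proxy itself. Most HTTP proxies
    # expect http:// here even for HTTPS targets; use https:// only if your
    # proxy accepts TLS connections.
    proxy_url = f"http://{user}:{password}@{ip}:{port}"
    # The dictionary keys are the schemes of the target URLs
    return {"http": proxy_url, "https": proxy_url}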

This function takes a proxy string, parses it into components, and returns a dictionary that the requests library can use to send requests through the proxy.
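
One detail worth handling: if a login or password contains characters such as "@" or ":", inserting it into the URL unchanged makes the URL ambiguous. The sketch below percent-encodes the credentials with the standard library before building the URL. The build_proxy_url helper and its scheme parameter are hypothetical additions for illustration, and the password shown is invented.

```python
from urllib.parse import quote


def build_proxy_url(ip, port, user, password, scheme="http"):
    """Build a proxy URL with percent-encoded credentials."""
    # safe="" encodes every reserved character, including "/", "@" and ":"
    user_enc = quote(user, safe="")
    password_enc = quote(password, safe="")
    return f"{scheme}://{user_enc}:{password_enc}@{ip}:{port}"


print(build_proxy_url("181.214.117.124", 3686, "user94438", "p@ss:word"))
# prints http://user94438:p%40ss%3Aword@181.214.117.124:3686
```

The resulting string can be used as a value in the proxies dictionary in the same way as the URLs built by get_proxy_dict.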

Using proxies in requests

In the main code, define the URL, the request parameters, and the headers. Then, in a loop, take the next proxy, convert it with get_proxy_dict, and pass the result to the request:

for _ in range(len(proxy_list)):
    proxy = next(proxy_cycle)
    proxies = get_proxy_dict(proxy)

    try:
        # The timeout keeps a dead proxy from blocking the loop indefinitely
        response = requests.get(url, headers=headers, params=querystring, proxies=proxies, timeout=10)
        print(response.json())
        break
    except requests.RequestException as e:
        print(f"Error while using proxy {proxy}: {e}")

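Here proxy_cycle is an itertools.cycle iterator over proxy_list that returns the proxies one after another. get_proxy_dict(proxy) converts the proxy string into a format that can be passed to the proxies parameter of the requests.get function. The loop stops at the first proxy that returns a response; if a request fails, the error is printed and the next proxy is tried.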
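
To see this failover behavior without touching the network, the loop can be wrapped in a function that receives the sending step as an argument. This is only a sketch: first_success and fake_send are hypothetical names, and fake_send simply pretends that the first proxy is unreachable.

```python
from itertools import cycle


def first_success(proxy_list, send):
    """Try each proxy once, in order, and return the first successful result.

    `send` is a hypothetical callable that takes a proxy string and either
    returns a result or raises an exception.
    """
    proxy_cycle = cycle(proxy_list)
    errors = {}
    for _ in range(len(proxy_list)):
        proxy = next(proxy_cycle)
        try:
            return proxy, send(proxy)
        except Exception as exc:  # a real script would catch requests.RequestException
            errors[proxy] = exc
    raise RuntimeError(f"all proxies failed: {errors}")


def fake_send(proxy):
    """Stand-in for a network call: the first proxy is treated as dead."""
    if proxy.startswith("181."):
        raise ConnectionError("proxy unreachable")
    return {"status": "ok"}


proxy_list = [
    "181.214.117.124:3686:user94438:nyp6os",
    "109.196.175.126:4800:user94438:nyp6os",
]
used, result = first_success(proxy_list, fake_send)
print(used.split(":")[0], result)
# prints 109.196.175.126 {'status': 'ok'}
```

In a real script, send would call requests.get with the proxies dictionary for the given proxy string.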

Now you know how to specify and use proxies in code. This will help you bypass restrictions and change geolocation when working with the API.

Here is the simplified code with our API and the proxies added:

import requests
from itertools import cycle

# Proxies in the format IP:port:login:password
proxy_list = [
    "181.214.117.124:3686:user94438:nyp6os",
    "109.196.175.126:4800:user94438:nyp6os"
]

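def get_proxy_dict(proxy):
    # Split "IP:port:login:password" into its four fields
    ip, port, user, password = proxy.split(':')
    # The scheme says how to connect to the proxy itself. Most HTTP proxies
    # expect http:// here even for HTTPS targets; use https:// only if your
    # proxy accepts TLS connections.
    proxy_url = f"http://{user}:{password}@{ip}:{port}"
    # The dictionary keys are the schemes of the target URLs
    return {"http": proxy_url, "https": proxy_url}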

proxy_cycle = cycle(proxy_list)
url = "https://real-time-amazon-data.p.rapidapi.com/search"
querystring = {"query":"Phone","page":"1","country":"US","sort_by":"RELEVANCE","product_condition":"ALL","brand":"Apple"}
headers = {
    "x-rapidapi-key": "YOUR_API_KEY",
    "x-rapidapi-host": "real-time-amazon-data.p.rapidapi.com"
}

# Try each proxy once and stop at the first one that returns a response
for _ in range(len(proxy_list)):
    proxy = next(proxy_cycle)
    proxies = get_proxy_dict(proxy)
    try:
        # The timeout keeps a dead proxy from blocking the loop indefinitely
        response = requests.get(url, headers=headers, params=querystring, proxies=proxies, timeout=10)
        print(response.json())
        break
    except requests.RequestException as e:
        print(f"Error while using proxy {proxy}: {e}")

Replace "YOUR_API_KEY" with your own API key (token).
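
If you would rather not store the key in the file itself, one common option is to read it from an environment variable. The sketch below assumes a variable named RAPIDAPI_KEY; both that name and the build_headers helper are choices made for this example, not something the API requires.

```python
import os


def build_headers(env=os.environ):
    """Build the request headers, taking the API key from the environment.

    RAPIDAPI_KEY is a variable name chosen for this sketch.
    """
    api_key = env.get("RAPIDAPI_KEY")
    if not api_key:
        raise RuntimeError("Set the RAPIDAPI_KEY environment variable first")
    return {
        "x-rapidapi-key": api_key,
        "x-rapidapi-host": "real-time-amazon-data.p.rapidapi.com",
    }


# Demonstration with an explicit mapping instead of the real environment
print(build_headers({"RAPIDAPI_KEY": "demo-key"})["x-rapidapi-host"])
# prints real-time-amazon-data.p.rapidapi.com
```

In the script, headers = build_headers() would then replace the hard-coded dictionary.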

Code launch

Install the requests library using the pip install requests command.

Place the code in a text file with a .py extension (e.g., parser.py).

Open a command prompt, navigate to the directory with the file, and run the command with the name of your file, for example:
python parser.py

After the code runs, you will get the data. You may need to filter out unnecessary information, but that is a topic for another article.

Now you know how to add proxy data to your code. By following these steps, you can configure a proxy to work with any API or with a program you have written yourself.