Parsing in VBA: How to collect data through Excel
Imagine: you have an assistant that collects data faster than you can finish your coffee. Want to monitor prices, analyze reviews, or automate routine tasks without learning programming languages? No problem.
The familiar Excel will help. Remember when you were told that you don't know all its features? For example, you don't know about VBA.
What is VBA?
VBA (Visual Basic for Applications) is a built-in programming language for automating tasks in Excel, Access, Word and others. In it you can create macros - automators that perform repetitive tasks: process data, send e-mails or, as in our case, parse pages.
How VBA works for parsing
VBA is the translator between you and the Internet. The main parsing tool is Windows' built-in XMLHTTP, which sends requests to a site and retrieves data. That data is then parsed using an HTML parser such as HTMLDocument.
Here's what the basic algorithm looks like:
- Send a site request.
- Get the HTML code of the page. This is the “raw material” from which you will pull data.
- Find the elements you need. For example, product prices are inside HTML tags, which you can find using the code inspector in your browser.
- Record the data in Excel. And further - analyze, sort, visualize.
No complex libraries or terminals - all actions take place inside the familiar Excel interface. This is a tool for those who want to get down to business quickly, rather than spend weeks learning programming.
If Python is a Swiss knife for programmers, VBA is a surgeon's precision tool. It's simple, suitable for point tasks and the result of parsing is immediately available in Excel. Want to collect price data, build a graph and send a report to your colleagues? With VBA, it's done in a couple of clicks.
Besides, VBA is suitable for parsing small amounts of data. If you don't need a complex script for millions of strings, but need to monitor a dozen sites daily, VBA is the best choice.
Features of parsing with VBA
- Easy to customize: You work in a single Excel window. No need to install third-party development environments.
- Instant integration: Data goes straight into an Excel spreadsheet and can be filtered, sorted, and reported on.
- Minimum requirements: Even the oldest Windows computer will do.
How to connect proxies and why do I need them?
Parsing has a problem: sites may notice you're collecting data and restrict access. Proxies are your disguise. They hide your IP address, allowing you to parsing without risk.
How to get started?
When people talk about parsing, they think of standard tasks: collecting prices, reviews or article titles. But VBA is capable of so much more. Did you know you can use Excel to:
- Analyze stock exchange rates in real time? Connect to financial analytics sites, pull fresh data, and build dynamic charts.
- Collect flight schedules? For example, to plan a vacation or create an automatic system for monitoring the cheapest tickets.
- Track changes on job sites? Want to be the first to know about a new dream job? VBA can help you.
How to do all this? We won't give you a step-by-step solution, but we will show you the general outline on the example of a stock exchange rate parser.
Track stock exchange rates in real time
With VBA, you can parse currency, stock or crypto rates from Yahoo Finance or CoinMarketCap, for example. And then build dynamic charts that update every minute.
Here is how the code for parsing EUR/USD exchange rates might look like:
Sub GetExchangeRate()
Dim http As Object, html As Object
Dim url As String, exchangeRate As String
' URL of the course page
url = "https://www.x-rates.com/table/?from=EUR&amount=1"
' Site request
Set http = CreateObject("MSXML2.XMLHTTP")
http.Open "GET", url, False
http.Send
' Reading HTML
Set html = CreateObject("HTMLFile")
html.body.innerHTML = http.responseText
' Find rate (example for a table with class “rateTable”)
exchangeRate = html.getElementsByClassName("rateTable")(0).getElementsByTagName("tr")(1).Cells(1).innerText
MsgBox "Курс EUR/USD: " & exchangeRate
End Sub
Why does it work?
VBA is about realizing what's useful quickly and easily. You get a tool that:
- Saves time.
- Simplifies routines.
- Gives you data that you used to get manually or couldn't get at all.
So now you are ready not only to try parsing with VBA, but also to build your own solutions that will surprise you. And if you need to protect yourself from blocking - remember about proxies. It's an easy way to make VBA even more powerful.
Want to move from theory to practice? Open Excel, press Alt + F11, paste the code and try to collect the first data. Don't forget to test with proxies - their simple integration makes working with VBA even more convenient. Here's how to do it.
Using proxy when parsing with VBA
Let's understand how to use proxies with VBA for parsing. We will work with the ready parser code and show how to add a proxy. Let's review the process from the beginning.
Step 1: Prepare the proxy
For example, you have rented several server proxies. They might be as follows:
- IP: 192.168.0.1
- Port: 8080
- Login: user123
- Password: pass123
Step 2: Open the VBA editor
- In Excel, press Alt + F11 to open the VBA editor. Or, if it doesn't work, add “Show Developer tab on the Ribbon” in the options.
Next, in the Developer tab that appears, select Visual Basic.
- In the editor, click Insert -> Module to create a new module for the code.
Step 3: Paste the parsing code from the proxy
Now let's configure the parser to work through the proxy. We will use the MSXML2.XMLHTTP object for parsing because it is one of the easiest ways.
Example code for parsing with proxy:
Sub ParseDataWithProxy()
Dim http As Object
Dim url As String
Dim proxyAddress As String
Dim proxyPort As String
Dim proxyUser As String
Dim proxyPass As String
' URL for parsing
url = "http://example.com" ' Replace with the URL you want to parse
' Proxy settings (replace with your own data)
proxyAddress = "192.168.0.1" ' Your proxy address
proxyPort = "8080" ' Proxy port
proxyUser = "user123" ' Proxy login (if required)
proxyPass = "pass123" ' Proxy password (if required)
' Object creation MSXML2.XMLHTTP
Set http = CreateObject("MSXML2.XMLHTTP")
' Setting up a proxy
http.setProxy 2, proxyAddress & ":" & proxyPort, proxyUser, proxyPass ' 2 — is the value for using a proxy
' Sending GET request through proxy
http.Open "GET", url, False
http.Send
' Retrieving data from the website
MsgBox http.responseText ' Shows the HTML code of the page
End Sub
Step 4: Code Explanation
- URL for parsing:
- Replace the string url = “http://example.com” with the site you want to parse.
- Proxy Settings:
- proxyAddress is the IP address of your proxy server.
- proxyPort is the port through which your proxy server connects (for example, 8080).
- proxyUser and proxyPass are the login and password for the proxy, if they are required.
- Proxy Setting:
- In the line http.setProxy 2, proxyAddress & “:” & proxyPort, proxyUser, proxyPass we specify the proxy parameters.
- 2 is the setting to use an HTTP proxy.
- After that we add the address of the proxy and its port.
- If your proxy requires authentication, specify login and password.
- Sending a request:
- http.Open “GET”, url, False - open a GET request to the specified URL.
- http.Send - send the request.
- MsgBox http.responseText - display a message with HTML code of the site response (or other data) in the window.
Step 5: Start parsing
- Once the code is inserted into the editor, press F5 to run the macro.
- You will get the HTML code of the page in a message box (MsgBox). This data can be further processed, analyzed and recorded in Excel.
Step 6: Using proxy servers from the list
If you have multiple proxies, you can set up a loop to use them. For example, if you want to alternate using different proxies for requests, you can do it like this:
Sub ParseDataWithMultipleProxies()
Dim http As Object
Dim url As String
Dim proxyList As Variant
Dim i As Integer
' URL for parsing
url = "http://example.com" ' Replace with the desired URL
' Proxy list (IP:PORT)
proxyList = Array( _
Array("192.168.0.1", "8080", "user123", "pass123"), _
Array("192.168.0.2", "8080", "user456", "pass456") _
)
' Proxy cycle
For i = LBound(proxyList) To UBound(proxyList)
' Object creation MSXML2.XMLHTTP
Set http = CreateObject("MSXML2.XMLHTTP")
' Setting up a proxy
http.setProxy 2, proxyList(i)(0) & ":" & proxyList(i)(1), proxyList(i)(2), proxyList(i)(3)
' Sending GET request through proxy
http.Open "GET", url, False
http.Send
' Output of the received data
MsgBox "Data from proxy " & proxyList(i)(0) & ": " & vbCrLf & http.responseText
Next i
End Sub
Step 7: What should I do with the resulting data?
Now that you have configured parsing from the proxy, the resulting data (HTML code) can be processed using VBA. For example:
- Use regular expressions to extract the required data.
- Write the data into Excel for further analysis.
- Save the results to text files.
Now you have a rough idea of how to parse with VBA, as well as why they do it.