Proxychains is a great tool to have on your PC, whether or not you plan to use it daily. If you plan on using Beagle Scraper for e-commerce scraping, you will need proxychains, with public or private proxies, to route your requests through various IPs. This way, you avoid getting your machine’s IP banned by the target website(s).
In this article, you will learn:
- Why proxychains is better with private proxies
- Windows alternatives to proxychains
- How to install proxychains
- How to set up proxychains [the complete config file is at the end of this article]
- How to use proxychains
Why proxychains is better with private proxies
I’m not trying to oversell private proxies, but here is a short list of reasons to consider them over public ones:
- High uptime – providers usually advertise figures such as 99.99%, so your IPs are available almost all the time
- High speed – you are the sole user of these proxies, so you don’t share the server’s speed and bandwidth with hundreds of other users
- Clean IPs – private IPs are far less likely to be already blocked by the websites you want to reach
For scraping, I recommend looking into SEO proxies. They are cheaper, so you get more IPs for your budget.
Windows alternatives to proxychains
If you have a Windows PC and need to do your scraping from it, you have two alternatives, because proxychains supports only Unix-like systems.
Proxifier
Proxifier does the same job on a Windows PC as proxychains does on Linux or Mac. However, it is paid software, so you will need to pay a small fee to use it.
VirtualBox with a Linux virtual machine
The free alternative to Proxifier is to use VirtualBox and create a virtual Linux machine on your PC. Basically, you install VirtualBox, create a Linux virtual machine and use proxychains inside it. VirtualBox is easy to install, and there are plenty of tutorials on how to create a virtual machine.
How to install proxychains
Proxychains is an open source project and you can download the source code from GitHub. However, if you just plan on using proxychains for scraping or any other low-security web automation or browsing, there is no need to read the source code or the documentation.
All you have to do is install it and then set it up.
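Installing on Mac with Homebrew
Homebrew packages the maintained fork under the name proxychains-ng, and the command it installs is proxychains4, so use that name wherever this article types proxychains.
$ brew install proxychains-ng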
Installing on Ubuntu or any other Debian-based distro with apt-get
$ sudo apt-get install proxychains
That’s all! Proxychains is now installed on your PC. Before using it, you need to set up the config file and insert your proxies’ IPs and login details.
How to set up proxychains
To set up proxychains, you need to open the config file, comment out and uncomment a couple of lines, and paste in your public or private proxies’ IPs, ports and login details.
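A quick way to confirm that the installation worked is to check which proxychains command ended up on your PATH. The snippet below is only a sketch: the function name find_proxychains is made up for this article, and it assumes the binary is called either proxychains4 (the proxychains-ng fork) or proxychains (the classic 3.1 package).

```shell
#!/bin/sh
# Hypothetical helper (not part of proxychains): print the name of the
# first proxychains binary found on PATH, or fail with a message.
find_proxychains() {
    for name in proxychains4 proxychains; do
        if command -v "$name" >/dev/null 2>&1; then
            printf '%s\n' "$name"
            return 0
        fi
    done
    echo "proxychains not found on PATH" >&2
    return 1
}

# Report the result without aborting if nothing is installed.
if bin="$(find_proxychains)"; then
    echo "Found: $bin"
else
    echo "Install proxychains first, then run this check again."
fi
```

If the check prints a name, that is the command to type in the rest of this article.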
Access the config file
After installing, use the same terminal to access the config file.
For Ubuntu, I will use nano with root access because the /etc/ folder is restricted for editing by normal users. The command used in the terminal is:
$ sudo nano /etc/proxychains.conf
Mac users can use vi to open the config file with the following command (on Apple Silicon Macs, Homebrew’s prefix is /opt/homebrew instead of /usr/local):
$ vi /usr/local/etc/proxychains.conf
Voila! The proxychains config file is now open in your terminal. Next, you have to comment out and uncomment some lines.
First, comment out strict_chain. All you have to do is put a # (number sign) in front of strict_chain.
Second, uncomment random_chain by removing the # from the start of its line. The two lines should now read #strict_chain and random_chain.
By now, I think you know what these two lines do. Simply put, you told proxychains not to use your proxies’ IPs in strict order (strict_chain) but to pick among them at random when making connections (random_chain). In other words, you instructed proxychains to behave a bit like Tor and vary the IP it uses.
Watch your DNS setup! Proxychains ships with proxy_dns already uncommented, but double-check that it is. This way, you avoid DNS leaks that could disclose your real IP.
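If you would rather script these edits than make them by hand, the three changes above can be sketched with sed. This is a minimal illustration, not an official tool: enable_random_chain is a name invented for this article, and the demo runs on a throwaway file rather than on your real config.

```shell
#!/bin/sh
# Hypothetical helper: rewrite a proxychains config so that strict_chain is
# commented out and random_chain and proxy_dns are active.
enable_random_chain() {
    conf="$1"
    tmp="$(mktemp)" || return 1
    sed -e 's/^[[:space:]]*strict_chain/#strict_chain/' \
        -e 's/^[[:space:]]*#[[:space:]]*random_chain/random_chain/' \
        -e 's/^[[:space:]]*#[[:space:]]*proxy_dns/proxy_dns/' \
        "$conf" > "$tmp" && cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}

# Demo on a throwaway file so the real config is never touched.
sample="$(mktemp)"
printf '%s\n' '#dynamic_chain' 'strict_chain' '#random_chain' '#proxy_dns' > "$sample"
enable_random_chain "$sample"
cat "$sample"
rm -f "$sample"
```

To apply it for real, run it on a copy of your config, review the result, and then copy it over the original with root access. The sketch does not touch dynamic_chain, so check that line yourself if you have changed it.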
Set up the proxies’ IPs
The final part of setting up proxychains is to give it your proxies’ details. Before you start, make sure you delete or comment out the default socks4 127.0.0.1 9050 line. You won’t need it if you are going to use your own proxies’ IPs.
Afterward, insert each proxy’s details on a separate line, exactly as shown in the examples inside the proxychains config file, in this format: proxy_type ip port user password
http 188.8.131.52 4416 chris password
You can add as many IPs as you have. The more IPs you have, the better, because proxychains will spread connections across them and use each individual IP less often.
Note: You can even mix public and private proxies. To start, I recommend a private/public ratio of 1/1, meaning one public proxy for each private proxy. So if you have 20 private proxies, you can get 20 public proxies and add them to the same proxy list. Make sure the public ones are working, and do not add your private proxies’ username and password to the public proxies’ lines (public proxies only need their type, IP and port).
You can find my proxychains config file at the end of this article. Copy it, insert your proxies’ details and paste it over your default proxychains config file.
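Proxy providers often hand out lists in ip:port:user:password form, so typing each line by hand gets tedious. The sketch below converts such a list into proxychains entries. The function name to_proxylist and the colon-separated input format are assumptions made for this example, and it will not cope with passwords that contain a colon.

```shell
#!/bin/sh
# Hypothetical helper: read "ip:port" or "ip:port:user:password" lines on
# stdin and print them as proxychains [ProxyList] entries.
# $1 is the proxy type (http, socks4 or socks5); it defaults to http.
to_proxylist() {
    awk -F: -v ptype="${1:-http}" '
        NF >= 4 { print ptype, $1, $2, $3, $4; next }  # private proxy with login
        NF >= 2 { print ptype, $1, $2 }                # public proxy, no login
    '
}

# Example input: addresses come from the reserved documentation range.
printf '%s\n' '192.0.2.10:4416:chris:password' '192.0.2.20:8080' | to_proxylist http
```

The first input line becomes a private entry with credentials and the second a public entry with only type, IP and port. Paste the output under [ProxyList] in your config file.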
How to use proxychains
To use proxychains, all you need to do is type the proxychains command in a terminal, in front of the name of the app you plan to run. The format is
$ proxychains app_name
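Before pointing a scraper at your proxies, it helps to confirm that requests really leave through them. The sketch below asks an IP echo service for the address it sees, several times in a row. Everything named here is an assumption for illustration: the functions proxied_ip and check_rotation, the variables PROXYCHAINS_BIN and IP_ECHO_URL, the use of curl, and the choice of api.ipify.org as the echo service.

```shell
#!/bin/sh
# Hypothetical helper: print the public IP a remote service sees when the
# request goes through proxychains.
# PROXYCHAINS_BIN and IP_ECHO_URL are made-up variables for this sketch.
proxied_ip() {
    "${PROXYCHAINS_BIN:-proxychains}" curl -s "${IP_ECHO_URL:-https://api.ipify.org}"
}

# Hypothetical helper: repeat the check $1 times (default 3), one result per line.
check_rotation() {
    count="${1:-3}"
    i=0
    while [ "$i" -lt "$count" ]; do
        proxied_ip
        echo
        i=$((i + 1))
    done
}
```

After loading these functions into your shell, run check_rotation 5. With random_chain enabled and several proxies in the list, the printed IPs should vary and none of them should be your own.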
Proxychains with Beagle Scraper
To route your scraping requests through proxychains, navigate to the folder where you downloaded Beagle Scraper and run this command in the terminal:
$ proxychains python start_scraper.py
This command instructs proxychains to start Beagle Scraper and route all of its requests through your proxies.
Proxychains with web browser
If you want to browse through your new Tor-like proxychains setup, open a terminal and start the browser through proxychains so that its requests go through your proxies’ IPs. For example, to use Mozilla Firefox with proxychains, run this command in a terminal:
$ proxychains firefox
To wrap up
Only a few simple steps are required to set up proxychains for Beagle Scraper, for web browsing, or for any other script or app whose requests you want to route through various IPs.
Here’s the complete proxychains config file
# proxychains.conf VER 3.1
#
# HTTP, SOCKS4, SOCKS5 tunneling proxifier with DNS.
#

# The option below identifies how the ProxyList is treated.
# only one option should be uncommented at time,
# otherwise the last appearing option will be accepted
#
#dynamic_chain
#
# Dynamic - Each connection will be done via chained proxies
# all proxies chained in the order as they appear in the list
# at least one proxy must be online to play in chain
# (dead proxies are skipped)
# otherwise EINTR is returned to the app
#
#strict_chain
#
# Strict - Each connection will be done via chained proxies
# all proxies chained in the order as they appear in the list
# all proxies must be online to play in chain
# otherwise EINTR is returned to the app
#
random_chain
#
# Random - Each connection will be done via random proxy
# (or proxy chain, see chain_len) from the list.
# this option is good to test your IDS :)

# Make sense only if random_chain
#chain_len = 2

# Quiet mode (no output from library)
#quiet_mode

# Proxy DNS requests - no leak for DNS data
proxy_dns

# Some timeouts in milliseconds
tcp_read_time_out 15000
tcp_connect_time_out 8000

# ProxyList format
# type host port [user pass]
# (values separated by 'tab' or 'blank')
#
#
# Examples:
#
# socks5 192.168.67.78 1080 lamer secret
# http 192.168.89.3 8080 justu hidden
# socks4 192.168.1.49 1080
# http 192.168.39.93 8080
#
#
# proxy types: http, socks4, socks5
# ( auth types supported: "basic"-http "user/pass"-socks )
#
[ProxyList]
# add proxy here ...
# meanwile
# defaults set to "tor"
#socks4 127.0.0.1 9050
http 184.108.40.206 21237 username password
http 220.127.116.11 21263 username password
http 18.104.22.168 21267 username password