How does URL pattern matching work for proxy sources?
Can you show examples for each pattern matching field?
I am asking about fetching the main page, then extracting the sub-page URLs from it, so those pages can be crawled and the proxies listed on them collected
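To make the question concrete, here is a rough sketch of the behaviour I have in mind. It is not how the software works internally (I don't know that); the function name, the sample HTML, the example.com address and the regular expression are all made up for illustration.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_sub_pages(html, base_url, url_pattern):
    """Return absolute sub-page URLs that match url_pattern, without duplicates."""
    collector = LinkCollector()
    collector.feed(html)
    matcher = re.compile(url_pattern)
    seen, result = set(), []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        if matcher.search(absolute) and absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


# Hypothetical main page, for illustration only.
sample_html = """
<a href="/list/page1.html">Page 1</a>
<a href="/list/page2.html">Page 2</a>
<a href="/about.html">About</a>
"""
sub_pages = extract_sub_pages(
    sample_html, "http://example.com/", r"/list/page\d+\.html$"
)
print(sub_pages)
```

Only the two list pages should come back; the "About" link does not match the pattern and is skipped. Each returned URL would then be fetched and scanned for proxies.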
Ty
Also, what does the text-only option do?
@Sven
Comments
Also, it would be very helpful to show which pages it is fetching and processing right now
For example, I am fetching 21 pages of http://www.mfqqx.com/daili/index_%page%.html with the same mask as yours, and it doesn't show me anything
I only see "searching for new proxies". I have no idea whether my mask is working, whether unnecessary pages are getting fetched and processed, etc.
There could be another window that shows which URLs are being fetched and parsed
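Something like the following is what I would expect to see in such a window. This is only my own sketch of how a %page% mask could be expanded and logged; the function name is made up, and I am assuming the pages are numbered 1 to 21, which may not match how the software counts.

```python
def expand_page_mask(mask, first_page, last_page, placeholder="%page%"):
    """Replace the placeholder with each page number in the given range."""
    if placeholder not in mask:
        return [mask]  # nothing to expand, a single URL
    return [
        mask.replace(placeholder, str(number))
        for number in range(first_page, last_page + 1)
    ]


# Assumed numbering: pages 1 through 21.
urls = expand_page_mask("http://www.mfqqx.com/daili/index_%page%.html", 1, 21)
for position, url in enumerate(urls, start=1):
    print(f"[{position}/{len(urls)}] would fetch {url}")
```

Even a plain list like this would make it obvious whether the mask expands to the pages I intended.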
By the way, there are thousands of proxies (at least 150 pages) here and the software can't extract them: http://www.proxylists.net/us_0.html
It also cannot parse the proxies listed on this site, which is a default site in GSA SER: http://www.proxytm.com/public-http-proxy-server-lists/type-distorting.htm
It cannot parse the proxies listed here either, and there are thousands of them: http://www.proxz.com/proxy_list_high_anonymous_0.html
Another one that cannot be parsed: https://premproxy.com/socks-list/01.htm . They use some sort of special span class to print the ports
This one cannot be parsed either, and it is also in the official list: http://www.cybersyndrome.net/pla6.html
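To illustrate what I think goes wrong on pages like these, here is a small sketch. The markup is invented by me and is not copied from any of the sites above. It shows that a plain ip:port pattern misses a proxy whose port sits inside its own tag, while stripping the tags first recovers it. If a site fills in the ports with a script or stylesheet, this approach would still not find them.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")
# Four dot-separated number groups, a colon, then a 2 to 5 digit port.
PROXY_RE = re.compile(r"\b((?:\d{1,3}\.){3}\d{1,3})\s*:\s*(\d{2,5})\b")


def extract_proxies(html):
    """Strip HTML tags, then collect every ip:port pair from the remaining text."""
    text = TAG_RE.sub("", html)
    return [f"{ip}:{port}" for ip, port in PROXY_RE.findall(text)]


# Hypothetical markup: the first port is wrapped in its own span.
sample = (
    '<td>10.0.0.1</td><td>:<span class="p1">8080</span></td> '
    "<p>192.168.1.5:3128</p>"
)

raw_matches = [f"{ip}:{port}" for ip, port in PROXY_RE.findall(sample)]
stripped_matches = extract_proxies(sample)
print(raw_matches)
print(stripped_matches)
```

Matching on the raw HTML finds only the second proxy; matching after the tags are removed finds both.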
Where are all of these settings saved, so I can back them up?
I mean the location on the hard drive
Certainly I could code this myself, because I have written many crawlers for my own purposes
But right now my only aim is getting more proxies and increasing LPM
I do not think anyone can easily compete with your software right now, as yours has been in development for many years. I know that developing decent software alone takes many years
By the way, if you ask my opinion, the biggest weakness of your software is that it is 32-bit and supports a maximum of about 2 GB of RAM. But I know that this is an intentional marketing practice on your part
By the way, I have collected over 1 million proxies by fine-tuning GSA SER, and I am testing how many are working at the moment
Could you at least answer my questions in this thread? Thank you