
Import target URLs: maximum number of URLs

I am running into another problem. I have lists of URLs with more than 27 million links (file size over 2.5 GB). When I attempt to import these links into a project, GSA runs for a long time and then reports that the links are imported. If I then try to check the remaining target URLs, GSA throws an out-of-memory exception. If I try to post to the links without checking them, GSA reports that there are no target URLs. I am doing no searching; all links were scraped with ScrapeBox and deduplicated. If I remove duplicate links after an import, no links are removed. The failure appears to be that GSA is a 32-bit application. However, I am not importing anywhere near 2^32 - 1 links, so I have to assume that I am simply overrunning the input buffer. These events lead to the question:

What is the maximum number of URLs that GSA will accept when importing target URLs?

Comments

  • Use a file splitter, e.g. TextWedge or Notepad++ (I think a user on the forum also made one).

    Split it into 10 MB files at least (around 500k lines each).

    Import targets one file at a time. How fast you can process those URLs determines when to import the next target file.

    There might be a cool automatic way to do it, but that's how I do mine; a rough splitting sketch follows below.
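    A minimal sketch of the splitting step, assuming a plain-text file with one URL per line; the file names, chunk size of 500k lines, and output folder are illustrative assumptions, not GSA requirements:

    ```python
    # Split a large one-URL-per-line file into ~500k-line chunks.
    # File names and the chunk size below are illustrative, not GSA settings.
    import os

    INPUT_FILE = "scraped_urls.txt"   # hypothetical ScrapeBox export
    CHUNK_LINES = 500_000             # roughly 10 MB for typical URL lengths
    OUTPUT_DIR = "url_chunks"

    def split_url_list(path, chunk_lines, out_dir):
        os.makedirs(out_dir, exist_ok=True)
        part = 0
        out = None
        with open(path, "r", encoding="utf-8", errors="ignore") as src:
            for i, line in enumerate(src):
                # Start a new chunk file every chunk_lines lines.
                if i % chunk_lines == 0:
                    if out:
                        out.close()
                    part += 1
                    out = open(os.path.join(out_dir, f"targets_{part:04d}.txt"),
                               "w", encoding="utf-8")
                out.write(line)
        if out:
            out.close()
        return part

    if __name__ == "__main__":
        parts = split_url_list(INPUT_FILE, CHUNK_LINES, OUTPUT_DIR)
        print(f"Wrote {parts} chunk files to {OUTPUT_DIR}/")
    ```

    Each resulting file can then be imported into the project one at a time, keeping each import well under the size that triggers the out-of-memory error.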