
Target URL cache - number of URLs multiplies in ONE cache within minutes

can be repeated = YES
since when? = many weeks or even a few months

until a FEW weeks ago the bug below happened to the VERY first project ever created
NOW the same happens to the very LAST Tier created

I import target URLs, up to several times every day - usually 1-8 thousand at a time into each unit (project and Tiers).

I noticed that ONE unit (NOW the very LAST Tier, some time ago the VERY first project ever) gets an unnaturally HIGH number of URLs into its cache
unnaturally HIGH = approx. the total SUM of all URLs imported into ALL projects / Ts, found in ONE single unit cache

several hours ago, I emptied the cache of the LAST Tier ever created = some 200'000+ URLs had accumulated there from nowhere within the past 2 or so days

then I imported some 7000 NEW target URLs into each unit (project and Ts)
NOW, just hrs later I find in that LAST T ever created a total of 60'000+ URLs

this behavior is many weeks or even a few months old; before, it always hit the FIRST ever project, now it hits the LAST ever unit (T) added

just before I started writing THIS bug report, the URL number in the above-mentioned cache was 60'000+
now it is 130'000+ (roughly DOUBLE) within just some 20 mins, WITHOUT having added any new URLs anywhere
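
One way to test the "sum of all imports ends up in ONE cache" suspicion would be to compare the oversized cache against the lists that were imported into the other units. The sketch below is only an illustration in Python: it assumes the cache and the imported lists are available as plain text with one URL per line, and the function names and unit names are hypothetical, not anything provided by SER.

```python
def parse_urls(text):
    """Turn text with one URL per line into a set, skipping blank lines."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def overlap_report(cache_urls, imports_by_unit):
    """For each unit, count how many of its imported URLs also sit in the suspect cache.

    Returns {unit_name: (shared_count, imported_count)}.
    """
    report = {}
    for unit, urls in imports_by_unit.items():
        report[unit] = (len(cache_urls & urls), len(urls))
    return report


# Usage idea (file names are hypothetical):
#   cache = parse_urls(open("last_tier_cache.txt", encoding="utf-8").read())
#   imports = {"project1": parse_urls(open("project1_import.txt", encoding="utf-8").read())}
#   print(overlap_report(cache, imports))
```

If nearly every URL imported into the other units shows up in the one large cache, that would support a leak between units rather than URLs generated by the project itself.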

Comments

  • Sven www.GSA-Online.de
    And what do the URLs look like? I'm sure it's the project itself that adds new URLs to the queue after creating accounts.
  • @sven

    100'000 URLs in half an hour added by the project itself ???
    you must be joking
    with 33 threads and just several dozen URLs processed per hr
    ending up with tens of thousands of new URLs ....

    these are regular URLs as added from SB = target URLs
    ALL units get the same type / quality of target URLs = 1 large URL file = randomized = then split into chunks of equal size
    then imported into all units one by one

    the URLs in that ONE T are leaked into that one T from all the others by SER
    leaked, NOT created by the project

    and the point is that while each project / T has received thousands of URLs,
    always a similar / identical number of very similar high-quality ones,

    most of the "NEW URLs" are never submitted or used on THOSE other Ts or projects
    when I run all or several units, the LpM is just about 1+ (less than 2)

    but it appears that ALL or almost all of the good new targets imported into each unit end up in that ONE T

    when I switch OFF all other projects / Ts
    and run only the one with all the targets

    then of course MOST are "already parsed" - maybe 90+%
    but in between the "already parsed" are all the good ones that were supposed to be for ALL the OTHER Ts and projects

    it was a BUG before
    and is one NOW
    just that the "target" has changed from very first project to very last T

    all other projects or Ts are in no way affected and seem to work normally
    currently I have 5 projects and 4 Ts running

    ALL other units have a decreasing number of target URLs in cache = the imported number MINUS the processed number = correct processing

    currently that one T has 218 457 URLs in its cache
    while all other units have between a few dozen and a few thousand (the 8000 recently imported MINUS the ones already processed in the past 12 or so hrs)

    as you can see above, the current number is almost double the last reading from when I posted the bug report above
    and for MOST of that time the T was quiet = OFF, except for the last approx. 1 hr
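
The accounting described in this comment (cache = imported MINUS processed) implies that, when import is the only target source, a cache should never hold more URLs than were imported into it. A minimal Python sketch of that check follows. The 8000 and 218457 figures are taken from the comment; the unit names and the other cache sizes are made up for illustration, and the function name is hypothetical.

```python
def suspicious_units(imported, cached):
    """Return the units whose cache holds more URLs than were ever imported.

    Assumes import is the only target source, so a healthy cache can only
    shrink as URLs are processed.
    """
    return sorted(
        unit for unit, count in cached.items() if count > imported.get(unit, 0)
    )


# 8000 and 218457 come from the report; everything else is illustrative.
imported = {"project1": 8000, "tier1": 8000, "last_tier": 8000}
cached = {"project1": 40, "tier1": 3000, "last_tier": 218457}

print(suspicious_units(imported, cached))  # ['last_tier']
```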
  • Sven www.GSA-Online.de

    So much text and still I don't know what the URLs look like. And what type of settings does this project use to get target URLs? Just by import or also other types? Maybe send me the project backup?

  • to get submissions moving = ALL target URLs deleted and restarted from scratch
    too many OTHER vital problems exist in direct import of target URLs to projects (vs import to global list)

    for example

    1. URLs ending with | (pipe) seem to "block" further use of target URLs - why does SER have a pipe ending the target URLs ?? normally the |PR values are removed and only the plain text URL is found in the target URL cache ...

    2. newest problem = with all SE OFF and all global lists OFF = ONLY imported target URLs
    SER works and the target URLs never seem to run out, as they did some days ago (maybe 1 or 2 upgrades back)
    instead there is a large number of submissions and verifications = 1000+ verifications and the target URL cache is still full ...
    but WHERE do the target URLs come from if ALL SE and global lists are OFF and SER has been running for 12+ hrs

    until several days back I used this method of switching ALL target sources OFF to test particular URL lists directly imported into projects / Tiers
    and after a while all target URL caches were EMPTY (showing 0 URLs) as intended

    3. the above-mentioned number of 218 457 URLs in that cache further increased for a while to 300'000
    then after some 12+ hrs of THAT project being INACTIVE
    AND after a reboot of the whole machine
    ALL = ALL target URLs vanished !! and the target URL cache was empty
    before the reboot all target URLs still existed

    for now ALL www resources are fully busy, but as soon as the same situation occurs again I may send a project backup
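
As a possible workaround for point 1 (target URLs ending with a pipe), the URL file could be cleaned before import. The Python sketch below is a guess at such a pre-processing step, not a description of how SER handles the format: it assumes that a literal | never belongs to the URL itself and that anything after it (such as a PR value) can be dropped. The function names are hypothetical.

```python
def strip_pipe_suffix(line):
    """Drop everything from the first '|' onward and trim whitespace."""
    return line.split("|", 1)[0].strip()


def clean_url_lines(lines):
    """Strip pipe suffixes from every line and drop lines that end up empty."""
    cleaned = (strip_pipe_suffix(line) for line in lines)
    return [url for url in cleaned if url]


# Usage idea: run the list through clean_url_lines() and write the result
# to a new file before importing it into a project or Tier.
```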