
My VPS 100GB Used Up

AlexR Cape Town
edited October 2012 in Need Help
I'm using GSA SER and CS and GSA Indexer.

My computer has used up 100GB of disk space. The list files (verified, identified, etc.) are in 1 folder and total about 1GB (max!)

Where is the other 99gb? (on my other VPS the same programs only use 40GB of the 100GB disk space) There are no other programs so somehow these 3 programs are using a lot of disk space. When I check the three program folders in BOTH appdata and program files, they are each under 50mb. 

Is it possible that either SER or the Indexer tool is getting the URLs it scraped cached by Windows? If so, where can I find them to delete?

OR any ideas what else is using up all the disk space? (It's windows 7 on the VPS)

Comments

  • Most probably some Windows backups and page files. Use one of these tools to investigate.
  • AlexR Cape Town
    A disk cleanup got rid of 200mb. 

    Which tools do you mean? 
  • Ozz
    edited October 2012
    one of those you can check folder size with ;)
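
If no folder-size tool is at hand, the same check can be sketched in a few lines of Python. This is only an illustration, not one of the tools referred to above; the function names `folder_size_bytes` and `largest_subfolders` are made up for this example.

```python
import os

def folder_size_bytes(path):
    """Sum the sizes of all regular files under path, skipping unreadable entries."""
    total = 0
    # os.walk silently skips directories it is not allowed to list.
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or access denied
    return total

def largest_subfolders(path, top=10):
    """Return (size_bytes, folder) pairs for the immediate subfolders, largest first."""
    sizes = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((folder_size_bytes(entry.path), entry.path))
    sizes.sort(reverse=True)
    return sizes[:top]
```

Calling `largest_subfolders("C:\\")` and then repeating it on the biggest result narrows down where the space went. `os.walk` does not filter on the hidden attribute, so hidden folders are counted too.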

  • LeeG Eating your first bourne

    Are you saving unknown captchas?

    Those images, if you're doing a lot of blasting, will soon take up space

  • AlexR Cape Town
    @LeeG - not saving unknown captchas.

    Now - a bigger issue. I updated the program after the 200mb clear up on disk cleanup. BUT I have lost all my data from ALL my projects. 

    I did a backup previously (but on restore from old backup, it only has about 3% of the data in the backup - it seemed it didn't complete it properly)

    What I'd like to know, is the data somewhere in the GSA folder where I can manually reimport it? (I haven't touched any of the GSA folders directly so figure the data should be there somewhere) Basically there are a large number of projects that I had and they aren't coming up at all now. 

    Any help is appreciated...
  • AlexR Cape Town
    Found the file...dropbox cache! 70GB!

    BUT still lost all my data from my projects...
  • Ozz
    edited October 2012
    Maybe you deleted all your cached data (history/target URLs), which caused the different sizes.

    Your data is stored under normal circumstances in ...\AppData\Roaming\GSA Search Engine Ranker\projects

    But I don't think that it will differ from the projects you'll see after the restore. 
  • AlexR Cape Town
    It seems that there are a number of folders that have grown massively!

    1) 58 GB - Dropbox Cache 50GB (But it's hidden and even with "show hidden" in explorer it couldn't be found)
    2) 22GB - C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Projects\SystemIndex\Indexer\CiFiles
    (BUT it can't delete these somehow)

    Total = 80GB!




  • AlexR Cape Town
    @Ozz - I have about 2500 files in the appdata/GSA PROJECT folder.

    How do I import these back into the program? It looks like all the files that have all my data as I can see the various old project names here. 
  • Hopefully you google every folder you want to delete first. No folder should be deleted without knowing what it contains.
  • Ozz
    edited October 2012
    @GlobalGoogler: In my understanding SER will recognize the files that are placed in that folder automatically. But maybe I'm wrong. I was never in the situation to make use of that.

    And IIRC the backup file is just a zipped project folder. Maybe @Sven can give you advice on how to zip and rename them properly so you can manually back up and restore your projects.
  • AlexR Cape Town
    I have gone here: \AppData\Roaming\GSA Search Engine Ranker\projects

    There are about 2500 files - how do I get these back into GSA? They have all project and platform names... I can't do the restore using these files - they are some kind of GSA backup system file with the project data I need?
  • Ozz
    edited October 2012
    As I said, SER should recognize those files and show your projects, in my understanding. Did you restart SER / reboot your system?
    If they still don't show up, then maybe they are corrupted.
  • Sven www.GSA-Online.de
    The backup will of course only archive files that are required by the program; it doesn't take everything from the project folder.
  • AlexR Cape Town
    @Sven - So if the backup file is corrupted. (which it is) 

    Is there any way for me to restore the data from the files in:
    \AppData\Roaming\GSA Search Engine Ranker\projects

    I.e. there is no backup file, but all the files are in the folder, can I use these to reload them into my projects?

    P.s. I am now creating a regular backup procedure. 

  • GiorgosK Greece
    edited October 2012
    @globalgoogler
    I would do the following (but have not tested it !!!)

    1. BACKUP all the files in the projects folder (zip)
    2. move them to another folder
    3. create missing project with exact same name (it will probably create the corresponding project files)
    4. close GSA SER
    5. replace those files with the old but good project files in the project folder
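
Step 1 above (zip the whole projects folder before touching anything) could be sketched like this in Python. It is a plain zip of every file in the folder, not SER's own backup format, and the names `backup_folder` and `verify_backup` are invented for this example. The check at the end matters because an incomplete backup is exactly what went wrong earlier in this thread.

```python
import os
import shutil
import time
import zipfile

def backup_folder(src_dir, dest_dir):
    """Zip everything in src_dir into dest_dir and return the archive path.

    dest_dir should be outside src_dir, otherwise the archive would be
    written into the folder it is archiving.
    """
    os.makedirs(dest_dir, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    base = os.path.join(dest_dir, f"projects-backup-{stamp}")
    return shutil.make_archive(base, "zip", root_dir=src_dir)

def verify_backup(archive_path, src_dir):
    """True if the archive passes its CRC check and holds every file in src_dir."""
    with zipfile.ZipFile(archive_path) as zf:
        if zf.testzip() is not None:  # name of the first corrupt member, if any
            return False
        archived = {n for n in zf.namelist() if not n.endswith("/")}
    expected = set()
    for root, _dirs, files in os.walk(src_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), src_dir)
            expected.add(rel.replace(os.sep, "/"))
    return expected <= archived
```

The source path would be the projects folder mentioned above (...\AppData\Roaming\GSA Search Engine Ranker\projects), with SER closed while the archive is written. `verify_backup` only compares file names and CRCs, so it catches a truncated archive but not a project file that was already damaged before the backup.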

  • AlexR Cape Town
    @Sven - all the files are in the project folder. My backups are broken. Is there any way for me to get the files in the project folder loaded back into the project? 

    At least the project settings. I see all the data is there in the various .txt files. 
  • Sven www.GSA-Online.de
    In txt files? You can back up the whole folder, but I'm wondering what these files are and who created them.
  • AlexR Cape Town
    Yes! 

    I think what happened is that the VPS ran out of disk space because of the hidden dropbox cache file at 60GB and the other file- (22GB) - C:\ProgramData\Microsoft\Search\Data\Applications\Windows\Projects\SystemIndex\Indexer\CiFiles
    (BUT it can't delete these somehow).

    So somehow it wasn't able to update some files. 

    So in the project folder there are 2500 .txt files. .hosts. _keywords _done type files. 
    The data is here but somehow it's not reading it.

    Anyways...It's also lost all the verification and submission stats too....all sitting at 0 & 1. 

    Decided I will just call it a good lesson (I did backup but backup only backed up 3 of my projects), and redo them and run them again. It will pick up any of the links anyway. 

    Also - useful to know that dropbox runs a cache file, so if you use it for your sitelists or GSA files, even though dropbox actually shows 1gb, there can easily be a 60GB hidden cache file. (Not visible in explorer even with hidden files enabled)!

    Maybe you can advise what the other file is that is 22GB? It seems to be a product of running GSA. Some sort of microsoft cache of all the threads?
  • Ozz
    edited October 2012
    Regarding your other files.
    Or google something about the folder destination "SystemIndex\Indexer\CiFiles"

    Are your projects files looking like this: "Project 01.prj.txt", "Project01.static.txt" or "Project01..hosts_done.FD28.txt", .....?
    I'm asking because there shouldn't be any *.txt file in that folder, and you may have somehow renamed those files to txt format. If that's the case, then remove every ".txt" so your files look like this: "Project 01.prj", "Project01.static" or "Project01..hosts_done.FD28"
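
If the files really had picked up a trailing ".txt", removing it from 2500 files by hand would be tedious. A minimal sketch of doing it in bulk with Python follows, assuming the only change needed is dropping that final ".txt"; the function name `strip_txt_suffix` is hypothetical, and it defaults to a dry run that only reports what it would rename.

```python
import os

def strip_txt_suffix(folder, dry_run=True):
    """Rename 'name.ext.txt' to 'name.ext' in folder.

    Returns a list of (old_name, new_name) pairs. With dry_run=True
    nothing is renamed; the list only shows what would happen.
    """
    planned = []
    for name in sorted(os.listdir(folder)):
        old = os.path.join(folder, name)
        if not os.path.isfile(old) or not name.lower().endswith(".txt"):
            continue
        new_name = name[:-4]  # drop the trailing ".txt"
        new = os.path.join(folder, new_name)
        if not new_name or os.path.exists(new):
            continue  # skip rather than overwrite an existing file
        planned.append((name, new_name))
        if not dry_run:
            os.rename(old, new)
    return planned
```

Run it once with the default `dry_run=True`, read the list, and only then call it again with `dry_run=False`, ideally on a copy of the folder.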
  • AlexR Cape Town
    It's like this:
    ExternalBlogs.hosts_done.6F93

    OR
    BD-Website.prj

    I am busy redoing the projects manually, but the data is all there. 

    Basically it has forced me to develop a neat backup system and warned me about how to correctly use dropbox. Will do a post about it later. (For those with more than 1 VPS running) I have a nice very easy system now. :-)
  • Sven www.GSA-Online.de
    You can delete all files with a *.<hexnumber> ... in a very old version these files had not been deleted.
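
To see which files in the projects folder carry such a hex-number suffix before deleting anything, a small Python sketch could list them. The pattern is an assumption based only on the two examples in this thread (6F93 and FD28): exactly four uppercase hex digits after the last dot. The name `find_hex_suffixed_files` is made up for this example.

```python
import os
import re

# Assumption: the suffix is exactly four uppercase hex digits, as in
# "ExternalBlogs.hosts_done.6F93". Adjust if your files look different.
HEX_SUFFIX = re.compile(r"\.[0-9A-F]{4}$")

def find_hex_suffixed_files(folder):
    """Return sorted names of regular files in folder ending in .<4 hex digits>."""
    return sorted(
        name
        for name in os.listdir(folder)
        if HEX_SUFFIX.search(name) and os.path.isfile(os.path.join(folder, name))
    )
```

The function only lists candidates; deleting them is left as a separate, deliberate step after a backup of the folder exists.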