
GSA Image Spider NEW GUI/Version


  • SvenSven www.GSA-Online.de
    new version should fix things
  • It works perfect now, many thanks!
    Have you removed the EXIF removal checkbox, or is it just me who can't find it now?
  • SvenSven www.GSA-Online.de
    that is only added when you save in jpg format or use a filter. saving in original will not touch anything.
  • Cool. I've got some GUI suggestions. Would it be possible to allow deleting images with a key like "del" or "backspace"? Currently it only works through a right-click menu, and it's a nightmare to go through hundreds of images like that. Is there a setting somewhere to limit the number of images per keyword? Also, "check images" doesn't seem to do anything for me. What is that supposed to do?
  • SvenSven www.GSA-Online.de
    delete: added in next update / though you can turn on the viewer and click delete there.
    --
    max images: you have that in next update
    --
    check images checks whether each URL delivers a valid image... however, it doesn't show a message when done, and when you scrape, all images should already be fine. So I will fix that as well in the next update.
  • SvenSven www.GSA-Online.de
    update is out btw.
  • Thanks! Two more things for today.

    Is it possible to add a URL blacklist to "search by keywords", or at least omit the results from pixabay.com when using Google Image Search? I only selected Google Image Search, and that returns quite a few results from Pixabay. The problem is that Image Spider can't download them, or it downloads a grey picture saying "Discover pixabay free images No hotlinking".
    If I use the built-in Pixabay scraper it works fine. It returns different results, but at least those work.

    Thanks for the del function! Could you make the selection bar not disappear, and not jump back to the beginning or the end of the list, after every deleted image?

    Thanks!
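
For anyone wanting to post-filter an exported URL list in the meantime, the blacklist idea can be sketched in a few lines of Python. This is only an illustration, not how Image Spider does it: `BLOCKED_HOSTS`, `is_blocked` and `filter_results` are made-up names, and the only domain taken from the thread is pixabay.com.

```python
from urllib.parse import urlsplit

# Hypothetical blacklist; the thread only names pixabay.com.
BLOCKED_HOSTS = {"pixabay.com"}

def is_blocked(url, blocked=BLOCKED_HOSTS):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked)

def filter_results(urls, blocked=BLOCKED_HOSTS):
    """Keep only URLs whose host is not blacklisted, preserving order."""
    return [u for u in urls if not is_blocked(u, blocked)]
```

Matching on the host (and its subdomains) rather than on a plain substring avoids dropping unrelated URLs that merely contain the word "pixabay" somewhere in the path.
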
  • Also, I am not sure that the "Maximum amount to scrape" setting in the "Search by Keyword" function works as it is supposed to.
    I've got 12 keywords and Google Image Search selected. When I hit OK it returns results for some of the keywords and stops as if there were no more keywords. When I hit search again it scrapes some random keywords again, including some of the already scraped ones.
    Is there any way to switch between licensed and free-to-use images in Google search?

  • SvenSven www.GSA-Online.de
    pixabay: fixed in next update
    ---
    that option is for all images and all keywords. It would stop after adding 10+ images if you enter 10 there even if you have 1000 keywords.
    ---
    google: was that an url parameter?
  • TheGypsyTheGypsy Madrid
    edited March 2018
    Thanks
    -----
    Would it be possible to make it go through all the keywords and respect the limiter as a minimum for each keyword? That would make much more sense when scraping for multiple keywords for various projects.
    ---
    I only used it through gui in scrapebox image grabber add-on or rarely in google but I guess it has a URL parameter as well. This may help (3.2): http://jwebnet.net/advancedgooglesearch.html
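
To make the difference between the two limiter behaviours concrete, here is a rough Python sketch. It assumes the scrape results are already available as a dict of keyword to URL list, which is a simplification; `collect_global` and `collect_per_keyword` are hypothetical names, and the global version stops exactly at the limit rather than at "10+".

```python
def collect_global(results_by_keyword, limit):
    """Stop as soon as `limit` images have been collected across all keywords."""
    picked = []
    for keyword, urls in results_by_keyword.items():
        for url in urls:
            if len(picked) >= limit:
                return picked
            picked.append((keyword, url))
    return picked

def collect_per_keyword(results_by_keyword, limit):
    """Take up to `limit` images for every keyword, so no keyword is skipped."""
    picked = []
    for keyword, urls in results_by_keyword.items():
        picked.extend((keyword, url) for url in urls[:limit])
    return picked
```

With two keywords of three results each and a limit of 2, the global variant returns two images from the first keyword only, while the per-keyword variant returns two images for each keyword.
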
  • There is one more thing. Would it be possible to remove the file extension from the files when importing files into IS and then exporting them with %file% or %title% macro into a folder?
    Currently it looks like this:
    Imported filename:
    crazy-widget.jpg
    slow-widget.jpg

    Exported filename:
    crazy-widget.jpg-some-modification.jpg
    slow-widget.jpg-some-modification.jpg

    It would be nice to have something like this:
    crazy-widget-some-modification.jpg
    slow-widget-some-modification.jpg

    Also, if you are looking to further expand the filter functions in IS, an exposure and a zoom function would come in handy.

    Thanks
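
The requested naming can be sketched as splitting the extension off before the modification text is appended. A minimal Python illustration, where `export_name` and the `-` separator are assumptions based on the filenames above:

```python
import os

def export_name(imported_name, suffix):
    """Build an export filename by inserting `suffix` before the extension."""
    stem, ext = os.path.splitext(imported_name)
    return f"{stem}-{suffix}{ext}"
```

For example, `export_name("crazy-widget.jpg", "some-modification")` gives `crazy-widget-some-modification.jpg`.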

  • SvenSven www.GSA-Online.de
    file macro is fixed in next update.
    the other filters are basically a resize + shave thing no?
  • Exposure is similar to brightness, but it affects the tones differently in post-processing. There is one more filter that is quite popular in photo post-processing: Fill Light, which fills the darker areas with light. It's quite a cool one.

    Yeah, I guess you can skip the zooming. Resize and shave does the job.
  • SvenSven www.GSA-Online.de
    latest version has no new filters but other nice options like re-compression to reduce size of jpg/png without losing quality and many other things.
  • Really nice options Sven! Thank you.

    I've got some minor feedback for you:

    -Wouldn't it be easier to just allow the %number% macro in the file name template field? If no %number% is used, IS would default to standard numbering. This would make renaming more flexible, as we could put numbers in the middle of the filename.

    -When scraping for images and aborting the process, it takes an eternity for IS to stop, if it stops at all.

    -When applying filters, IS slows to a snail's pace. Even entering a number into the boxes takes seconds, let alone using the slider. I don't see any extreme memory or CPU usage though.

    -In the filter window, could you add next and previous arrows to the preview? Currently there is no way to check how the applied filters would look on the rest of the images. Something similar to the Viewer window would be perfect.
  • SvenSven www.GSA-Online.de
    Will go over your suggestions tomorrow.
    Meanwhile the docu: http://docu.gsa-online.de/image_spider
  • SvenSven www.GSA-Online.de
    %number%: sorry, I think that adding a number in the middle of the file name is a very uncommon setting, and it would break stuff as well, since the tool would have to cut out %number%, and also any spaces or chars around it, on its own judgment.
    ---
    abort: should be faster in next update
    ---
    filter: that's all happening in the main thread, as it has to due to GDI operations. I can't do much of a speedup here, sorry.
    ---
    next on filter: you can use tool->use filter and it will work even if not exporting. Though I might be able to add a next/prev when in that mode.
  • I meant by using the number in the middle like this:
    My-super-keyword-1-BrandName.jpg
    My-super-keyword-2-BrandName.jpg
    Although I understand that it is kind of specific as people don't even bother renaming images anymore as they use wordpress anyway. I just thought I could save a step here but it's fine like this too.

    Thanks!
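
The suggested behaviour (use %number% where it appears, otherwise fall back to default numbering) could look roughly like this in Python. `expand_template` is a hypothetical helper, and appending `-N` as the default numbering is an assumption, not Image Spider's actual format.

```python
def expand_template(template, index):
    """Replace %number% with the index, or append the index if the macro is absent."""
    if "%number%" in template:
        return template.replace("%number%", str(index))
    return f"{template}-{index}"

# Number in the middle of the name, as in the example above.
names = [expand_template("My-super-keyword-%number%-BrandName", i) + ".jpg" for i in (1, 2)]
```

This sidesteps the separator problem Sven mentions, because the template author places the dashes around the macro explicitly.
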
  • KaineKaine thebestindexer.com
    edited March 2018
    For "Parse URLs", is it possible to add more options?
    Like: only images with a given string in the URL.
              Only images with a given string in the image name.

    It should be possible to use one or both of these options at the same time.

    For example, if I want a specific image type on one website:

    https://exemple.com/0389/58740389/pics/photo_58740389_avatar_4.jpg

    Another thing would be to block some URLs, or make others mandatory, for the spider by mask:

    https://exemple.com/pics/photo_58740389_avatar_4.jpg

    After scraping is done, it would also be good to be able to select all images or URLs by word.

  • KaineKaine thebestindexer.com
    Maybe add language selection (for Google)?
  • SvenSven www.GSA-Online.de
    edited March 2018
    @Kaine you have all that in options to skip or accept special URLs.
  • KaineKaine thebestindexer.com
    edited March 2018
    @Sven OK, I just saw that :)

    I see the option "skip parsing urls", but wouldn't it be interesting to also have a rule that restricts the crawl to one directory (and below)?

    There may be pages that cannot be reached (JavaScript pagination: page 1, 2, 3 ...), so it could be useful to allow incrementing the page number.

    Just as an example:

    https://exemple.com/photos/géométrie/?&pagi=2
    https://exemple.com/photos/géométrie/?&pagi=[INCREMENT]

    Enter the two values, start and end, and the spider visits the whole range without having to click (in the rare cases where it could not).

    For "accepted image type", would it be possible to define the extensions ourselves (like .gif, .svg ... and all future formats)?
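
The "one directory and below" rule amounts to a host and path-prefix check on every discovered link. A small Python sketch, with `in_scope` as a made-up name and the root URL chosen only for illustration:

```python
from urllib.parse import urlsplit

def in_scope(url, root):
    """Return True if `url` is on the same host as `root` and inside its directory."""
    u, r = urlsplit(url), urlsplit(root)
    # Compare whole path segments so /photos does not also match /photosx.
    base = r.path if r.path.endswith("/") else r.path + "/"
    return u.netloc.lower() == r.netloc.lower() and (u.path + "/").startswith(base)
```

With `https://exemple.com/photos/` as root, links under `/photos/` are followed, while `/pics/...` or another host are skipped.
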
  • SvenSven www.GSA-Online.de
    edited March 2018
    i can try adding {1-100} as parameter in URLs.
    I have added support for this in latest update. Read here for more:
    http://docu.gsa-online.de/image_spider/scrape_settings#parse_urls
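
As a rough idea of what such a {start-end} parameter does, here is a Python sketch that expands one placeholder into a list of URLs. The inclusive end and the handling of only the first placeholder are assumptions; the linked documentation is the reference for the real behaviour, and `expand_range` is a hypothetical name.

```python
import re

# Matches a placeholder such as {1-100}.
RANGE = re.compile(r"\{(\d+)-(\d+)\}")

def expand_range(url):
    """Expand the first {start-end} placeholder into one URL per number."""
    match = RANGE.search(url)
    if match is None:
        return [url]
    start, end = int(match.group(1)), int(match.group(2))
    return [url[:match.start()] + str(n) + url[match.end():]
            for n in range(start, end + 1)]
```

For instance, a URL ending in `?pagi={1-3}` expands to three URLs ending in `?pagi=1`, `?pagi=2` and `?pagi=3`.
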
  • KaineKaine thebestindexer.com
    edited March 2018
    0-1000 thank you Sven
  • Hey Sven when you have a bit of time could you have a look at two little things, please:

    1. When using the del key to delete images, the highlighted row disappears. It comes back when I hit one of the arrow keys, but it jumps back to the top of the URLs. It would be nice if it could remember the position, the same way the delete radio button does.

    2. Clicking the green update button in the bottom right corner shuts down IS and nothing happens.

    Thanks!
  • SvenSven www.GSA-Online.de
    both fixed in next update
  • It works nice now, thanks!

    Any chance of adding more filters to it?

    An exposure slider is greatly missed as that is the proper way to lighten up pictures.

    Others like vividness, highlight and shadows settings would be fancy too but not as crucial. 

  • SvenSven www.GSA-Online.de
    if you find me some code samples I can use for this (preferably in Pascal/Delphi or C++), I'm all for adding it ;)
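
One common way to model an exposure slider is to scale each pixel by 2**stops in linear light, using the standard sRGB transfer function to decode and re-encode. The Python sketch below shows only that arithmetic (it is not taken from any particular editor, and the function names are made up); it should port to Delphi or C++ almost line by line.

```python
def srgb_to_linear(c):
    """Decode one sRGB channel value in [0, 1] to linear light."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

def linear_to_srgb(c):
    """Encode one linear-light value in [0, 1] back to sRGB."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

def adjust_exposure(pixel, stops):
    """Scale an 8-bit (r, g, b) pixel by 2**stops in linear light, clamping at white."""
    gain = 2.0 ** stops
    out = []
    for channel in pixel:
        linear = min(srgb_to_linear(channel / 255.0) * gain, 1.0)
        out.append(round(linear_to_srgb(linear) * 255.0))
    return tuple(out)
```

Since the result depends only on the input channel value and the slider position, a real filter would likely precompute a 256-entry lookup table per slider change and apply it to every pixel, which keeps the per-pixel cost low.
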
  • KaineKaine thebestindexer.com
    edited March 2018
    Maybe displaying multiple images simultaneously would be good (site loading times can be long).
    Would it be possible to load the images into RAM? A checkbox, for machines with a lot of RAM, to view images faster when deciding which to keep or delete. Then there would be no need to download the images after the selection, just save them.

    Regarding the crawl of a site, why do threads go down to 0 before continuing? Wouldn't it be faster to fill a list from which threads could take URLs without interruption?
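
The "shared list that threads keep pulling from" idea is usually built as a work queue. A minimal Python sketch, not a description of how Image Spider crawls: `crawl` and `fetch_links` are hypothetical, with `fetch_links(url)` standing in for whatever downloads a page and returns the links found on it.

```python
import queue
import threading

def crawl(start_urls, fetch_links, workers=4):
    """Visit every reachable URL once; workers pull from one shared queue."""
    todo = queue.Queue()
    seen = set(start_urls)
    lock = threading.Lock()
    for url in start_urls:
        todo.put(url)

    def worker():
        while True:
            url = todo.get()
            if url is None:  # sentinel: shut this worker down
                todo.task_done()
                return
            try:
                for link in fetch_links(url):
                    with lock:
                        if link in seen:
                            continue
                        seen.add(link)
                    todo.put(link)  # new work is available to idle workers at once
            finally:
                todo.task_done()

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    todo.join()  # returns once every queued URL has been processed
    for _ in threads:
        todo.put(None)
    for t in threads:
        t.join()
    return seen
```

Workers only sit idle here when the queue is genuinely empty, instead of all waiting for a batch to finish before the next one starts.
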
  • Displaying images simultaneously would be a fancy feature as the downloading really takes some time sometimes. +1 on that

    Regarding code samples, I haven't given up yet but I'm very close to it. I thought they would be all over the net, as these things are kind of generic when it comes to image processing...
