Troubles understanding some aspects of the software

Minidou · October 2020

Hello,

As a beginner, I have a bit of trouble understanding certain notions or functionalities of the tool, and since the tool has evolved a lot since the demo videos, I have several questions that hinder me in using the software. Here are my questions:

1) Amount of Threads to use / Wait between search queries / Max time to search for articles: How do I determine the optimal values for my hardware? Are there standard values I can use to tell what is desirable from what is not?

2) Keep cache for X days: What is the cache used for? When does it come into play, and how do I determine the optimal value for this option?

3) What is "syntax analysis"? How does this option work?

4) Bold/Italic/etc. in "HTML decorations": How does it work? From what I understood from my tests, if I put 100% for Bold, for example, a whole part of the paragraph will be in bold... but not 100% of the text.
Which parts of the scraped content do the HTML variations apply to?
Does this work in addition to the "insert random paragraph spoilers/decorations" option, or are they two different functions?

5) I have trouble telling what exactly the difference is between "Keywords must be in Title" here:

And "Title with keywords", here:

What is the purpose of each of these options? I thought it was self-explanatory at first, but then I couldn't work out which one does what...

Thanks for your help !

Sven · October 2020

1) You would start with default settings and only change things if you are not satisfied:

Amount of Threads to use: on a good system you can probably increase this to 100 or even more. Just watch the memory usage and make sure it stays below 2 GB.

Wait between search queries: if you use proxies, or if you use a lot of other sources besides Google, you can keep it as it is. Otherwise I would increase this to 60.

Max time to search for articles: you had better increase this a lot. 5 minutes would mean that the software stops the whole article scraping process after 5 minutes and works with what it got. If you create a lot of content, you want this to be much longer.

2) That's useful to avoid downloading the same data again and again. It makes sense if you have projects on the same topic. Otherwise you can decrease this value. 5 days sounds OK to me, however.
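To illustrate the idea of a cache that expires after a number of days, here is a minimal Python sketch. It is not how the software implements its cache; the class name `ExpiringCache`, the `download` callback and the in-memory dictionary are all assumptions made for the example.

```python
import time


class ExpiringCache:
    """Toy cache (hypothetical): reuse a downloaded page until it is too old."""

    def __init__(self, max_age_days, clock=time.time):
        self.max_age_seconds = max_age_days * 86400
        self.clock = clock            # injectable so the behaviour can be tested
        self.entries = {}             # url -> (stored_at, content)

    def get(self, url, download):
        now = self.clock()
        hit = self.entries.get(url)
        if hit is not None and now - hit[0] < self.max_age_seconds:
            return hit[1]             # still fresh: no new download
        content = download(url)       # missing or expired: fetch again
        self.entries[url] = (now, content)
        return content
```

With a value of 5 days, a second project on the same topic started within those 5 days would reuse the stored pages; after that, the pages would be downloaded again.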

3) This method is not really something you should use. That said, it reminds me to finally write good documentation. Anyway, syntax analysis will take a sentence and, e.g., exchange certain words with something completely different but from the same word group. That means the resulting sentence might be perfectly readable and unique, but its meaning can be mixed up. It will, e.g., exchange the word "good" with "perfect" or the word "house" with "car". You see, it's more of a special use case.
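The word-group exchange described above can be sketched roughly like this in Python. The word groups and the function name `swap_within_groups` are made up for illustration; the software's real word lists and logic are not shown in this thread.

```python
import random
import re

# Hypothetical word groups: words in one group share a word class,
# not a meaning, which is why the result can change the sense.
WORD_GROUPS = [
    {"good", "perfect", "bad"},
    {"house", "car", "tree"},
]


def swap_within_groups(sentence, groups=WORD_GROUPS, rng=random):
    """Replace each known word with a different word from the same group."""
    lookup = {word: sorted(group) for group in groups for word in group}

    def replace(match):
        word = match.group(0)
        candidates = lookup.get(word.lower())
        if not candidates:
            return word                       # unknown word: keep as-is
        choices = [c for c in candidates if c != word.lower()]
        if not choices:
            return word                       # group has no alternative
        return rng.choice(choices)            # note: original casing is not kept

    return re.sub(r"[A-Za-z]+", replace, sentence)
```

A call such as `swap_within_groups("a good house")` may return "a perfect car" or "a bad tree": readable and unique, but not necessarily the same meaning.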

4) These HTML variations are functions previously used in SER to fight duplicate content. Honestly, I don't think they are that useful anymore, and to me the result looks unnatural. I would not use them, but the "spoilers/decoration" option is very useful. It's something completely different: it puts small headings or extracts at the front of a paragraph, and the result reads much better.

5) The option on the INPUT tab is for the scraping part: an article will only be used if it has a keyword in its title. The option on the OUTPUT tab will generate articles with keywords in the title.

Minidou · October 2020

Thank you very much Sven, it's perfectly clear now !

Minidou · October 2020

Hey @Sven,

As I get to grips with the tool and better understand its mechanisms, new questions come to mind. I am asking them here, after my previous questions, so as not to clutter the forum with a new thread.

Following the last update, I will also have some additional questions to submit to you.

1) I need a clarification on how the scraping works and how the content is created.

- Does the tool scrape ONE keyword and then create the content from that one keyword? Or does it use ALL the keywords we added and then create the content from all the data collected for these keywords?

Example: I use the keywords "marketing", "emailing" and "copywriting", combined with the option "Mix Paragraphs (from all articles) + Reorder Sentences".
Will the paragraphs and sentences that get mixed come from all the articles scraped for "marketing" (for example)? Or from "marketing" + "emailing" + "copywriting"?

- How deep into the SERP will the tool scrape content? The first 10 results? The first 100? Beyond?

2) - I have trouble interpreting the new "Keywords should be used in resulting Titles" option (Input tab), because when I launch the generation with this option checked, I don't end up with "Marketing" as a title, for example, but with "What is Marketing? - The Definition of Marketing - AMA", which is the title of an article scraped from the SERP.

In my example, I don't want the content generation to use the titles of articles scraped from the SERP. I only want the articles to have as their title the keyword used for scraping ("Marketing", to reuse my example).
I thought "Keywords should be used in resulting Titles" would do that, but it's not the case.

Moreover, it doesn't matter whether the "Keyword in Titles" option (from the Output tab) is checked or not; the result is the same: the titles of my articles are the ones from the articles scraped from the SERP.

How can I get the desired result, following the recent changes?

- Besides, I didn't understand what the "Title with keyword" option (from the Output tab) is used for, now that we have the "Keywords should be used in resulting Titles" option in the Input tab.

3) Regarding the bullet lists, how does the scraping process work?
It looks like the scraped data mainly consists of dates or useless/off-topic information, so the lists don't add much to the article, semantically speaking.
Not all the time, luckily, but often.

4) Input tab > Keywords tab > "Check All/Check Selected/Uncheck all/Uncheck selected/Toggle" (right click) & "Edit focused keywords" (Edit button): It's unclear what these functions do. It seems they're not implemented?

Thanks again for taking the time to read this.
Take care!

Sven · October 2020

1) A search is done with any of the input keywords; content is collected and mixed across all of the found articles, even across different keywords. The only case where this is not done is the "Same Article" algorithm.
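As a rough illustration of the difference between mixing across all articles and staying within one article, here is a small Python sketch. The data layout (a dictionary of keyword to list of articles, each article being a list of paragraphs) and the function `mix_paragraphs` are assumptions for the example, not the software's actual algorithm.

```python
import random


def mix_paragraphs(articles_by_keyword, count, same_article=False, rng=random):
    """Pick paragraphs either from every scraped article or from a single one."""
    # Flatten: articles found for all keywords end up in one list.
    all_articles = [a for articles in articles_by_keyword.values() for a in articles]
    if same_article:
        pool = list(rng.choice(all_articles))            # one article only
    else:
        pool = [p for a in all_articles for p in a]      # every article, every keyword
    rng.shuffle(pool)
    return pool[:count]
```

In the default mode, a generated article for the keywords "marketing" and "emailing" can contain paragraphs found under either keyword; with `same_article=True`, all paragraphs come from one source article.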

2) The title for a generated article is taken from an existing article that was found. That can be limited to only take titles in which a certain keyword is found (OT column) when using "Title with keywords". If no title is found, it will generate one on its own using the titlegen.dat file (only present for English unless you create it yourself).
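The title selection described in 2) could be pictured like this. It is only a sketch: `pick_title` and the `fallback` callback (standing in for whatever titlegen.dat does) are hypothetical names, and the real matching rules may differ.

```python
import random


def pick_title(scraped_titles, title_keywords, require_keyword, fallback, rng=random):
    """Reuse a scraped title, optionally only one that contains a title keyword."""
    candidates = list(scraped_titles)
    if require_keyword:   # corresponds to "Title with keywords" being checked
        candidates = [
            t for t in scraped_titles
            if any(k.lower() in t.lower() for k in title_keywords)
        ]
    if candidates:
        return rng.choice(candidates)
    # No usable scraped title: generate one instead (assumed fallback step).
    return fallback(title_keywords)
```

This also shows why a title such as "What is Marketing? - The Definition of Marketing - AMA" can come out for the keyword "Marketing": the keyword filters the scraped titles, it does not replace them.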

All the checkboxes below the keyword listing are for scraping: a found article is only taken when its title/URL matches the keywords defined for that purpose in the listing.

3) Please provide some samples so that I can improve this by filtering out certain things.

4) You are right, that popup menu should not be there. I will fix that in the next update... sorry.

Minidou · October 2020

1) Okay, now I better understand the difference between "from all articles" and "from same article"!

From all articles: content is generated from all the keywords mixed together

From same article: content is generated from only one keyword

It's well thought out.

What about the scraping depth, though? To generate content from a keyword, does the tool pick the first 10 results from the SERP? The first 100? Is it a customizable setting?

2) - Sorry, I wasn't very clear in my wording. I'll try to rephrase.

There are two options in the tool that seem to have the same function: "Keywords should be used in Resulting Title" (OT column in the "Input" tab), and "Title with keywords (OT column in Input -> Keywords)", from the "Output" tab.

What confuses me is the reference to the OT column with the checkbox "Title with keywords" from the Output tab, which suggests that it is actually the same thing as "Keywords should be used in Resulting Title".

So I didn't understand whether they are two options that fulfill different functions, or whether both options fulfill the same function (in which case it would be better to remove the option from the Output tab, to avoid confusion...?).

- So it is currently not possible to deliberately use only the scraping keywords to generate the title?

Example: I use "Marketing" and "Emailing" as my scraping keywords, I use the algorithm "Mix Paragraphs (from same article) + Reorder sentences heavily" and I want to generate 2 articles.

I want the titles of these articles to be the same as the scraping keywords; in other words, I want to have 2 articles whose titles are simply "Marketing" and "Emailing".

3) OK, here are several examples.

As you can see, the lists in English are of better quality than those in French. I'm going to do a quick translation of the content framed in red, so that you can see that it is off-topic.

In both generations, the keywords used for scraping are the same (they are identical in both languages, so no problem).
There are some artifacts that are not relevant, as you can see.

I'll also take this opportunity to report 3 problems in the generation!

1.EN : Although the tool has scraped several images, sometimes there are errors such as here.

2.EN : An anchor has been created right in the middle of a word, and outside the frame of anchors defined in the concerned field ("de" is not an anchor I asked to place in articles).

1.FR : 1- "March 21: International Desktop Tidying Day / International Forest Day"

Also, a piece of code was incorrectly inserted in the article.

2.FR : 1- "3 000 characters = 1 A4 equivalent = 50 lines = 95€"

2- "May 14th : WebCampDay"

3.FR : 1- "February 9: 92nd Academy Awards Ceremony (2 a.m.)"

2- "June 29: Wimbledon (until July 12)"

4.FR : 1- "June 27: Tour de France (until July 19)"

2- "June 6: Day of the mini-skirt"

5.FR : 1- "It's great Anne-Clotilde, congratulations"

2- "September 27: World Tourism Day / Google's birthday (22 years old)"

Sven · November 2020

1) It really depends on the scraper. Most scrapers are set up to take the first 100 results; some simply take everything until there are no more results left.

2) One lets you choose what the entered keyword should be used for (input table), and the checkbox lets you choose whether to make use of this filtering/option at all. By default a keyword is used for everything (scraping, title check, title generation, highlighting in the article...). But you can of course change that, or turn the options on/off to make use of it or not.
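The two-level setup in 2) (per-keyword columns plus one checkbox that switches the feature on or off) can be sketched as follows. The `Keyword` class, its field names and `title_keywords` are hypothetical; only the "OT" column name comes from this thread.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    text: str
    use_for_scraping: bool = True      # assumed input-side role
    use_in_output_title: bool = True   # the "OT" column mentioned in the thread


def title_keywords(keywords, title_with_keywords_enabled):
    """Return the keywords that apply to titles, honouring the master checkbox."""
    if not title_with_keywords_enabled:
        return []                      # feature off: per-keyword flags are ignored
    return [k.text for k in keywords if k.use_in_output_title]
```

The per-keyword flags say what each keyword may be used for; the single checkbox decides whether that definition is applied at all, so it can be disabled without unchecking every row.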

3) I will try to optimize this in the next update. In cases where you see such strange style misplacement or links inside words, though, please send me the project backup with only that article so I can try to reproduce and debug it.

ElHermano · November 2020

hey sven, sorry i'm stupid, but i read the topic and, like minidou, i didn't understand

you say the checkbox in the output options lets us choose whether we want to use the OT option in the input tab or not, but isn't checking the OT checkbox doing the same thing?

i've done four tests to play with the option and here are the results

create title with keywords (checked) + keyword should be used in resulting title (checked): keyword is in the title

create title with keywords (checked) + keyword should be used in resulting title (unchecked): keyword is in the title

create title with keywords (unchecked) + keyword should be used in resulting title (checked): keyword is in the title

create title with keywords (unchecked) + keyword should be used in resulting title (unchecked): keyword is in the title

so no matter what i check/uncheck, the result is the same. it's strange, is there a bug?

nice software btw!!

Sven · November 2020

I am currently writing a manual for all our products. Today I worked through GSA Content Generator... I have only just started, but maybe this already answers some things:

https://docu.gsa-online.de/content_generator/project_settings

ElHermano · November 2020

i think i understand better how the input tab works now, but in terms of ux i still don't understand why there need to be additional checkboxes for the IU/IT/OT columns to work?

by additional checkboxes, i mean the ones below the Add/Edit/Remove buttons in the input tab

what i mean is, why should users have to check two checkboxes for the same function to work when they could just check the relevant checkboxes in the keyword list (the columns, simply put)?
is it because of a technical limitation that cannot be bypassed, which forces you to implement two checkboxes instead of just one?

also, if it's impossible to remove the additional checkboxes, i suggest moving the "create title with keyword" one from the output tab to the input tab, so as to have all the additional checkboxes in one place

oh and btw in the document you wrote you talk about "Gramma-xyz" to generate content but it doesn't exist in the tool

Sven · November 2020

There needs to be an additional checkbox to use the definitions on the keyword-input tab because some customers simply don't want this behavior, and this way they can turn title generation off with one click instead of unchecking all the keyword columns.

--

Moving the checkboxes is something I don't want to do, as the O* columns are for the output and not for the input. It's a logical separation that I want to keep as it is.

--

Gramma-xyz is only available for English as I haven't defined this for any other language.
