How to use AI in content writing?

Automatic content generators continue to evolve, driven by artificial intelligence (AI) and the democratisation of "autoregressive language models" such as GPT-3. You can now generate content automatically and have it indexed by search engines, although AI still requires human intervention. How does this change the way we work in marketing?

First of all, let's go back to the term "artificial intelligence". The Council of Europe defines it as "a young discipline of about sixty years, which brings together theoretical and technical sciences (in particular mathematical logic, statistics, probabilities, computational neurobiology and computer science) and whose aim is to succeed in having a machine imitate the cognitive abilities of a human being".

What is an autoregressive language model?

In simple terms, an "autoregressive model" is a model that predicts future values from past values. Technically speaking, these language prediction models are Deep Learning models: artificial neural networks (systems whose design is inspired by the functioning of biological neurons) that take the preceding text into account. This type of model belongs to the branch of artificial intelligence known as NLP (Natural Language Processing), which enables computers to understand, generate and manipulate human language. Such a model is designed to receive a piece of language as input and to complete it, as output, with what it predicts to be the most relevant continuation for the user.
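
To make "predicting future values from past values" concrete, here is a minimal sketch in Python. It is a toy word-level model that only counts which word follows which, not a neural network and in no way how GPT-3 is implemented. The function names and the tiny corpus are assumptions made up for this illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow = defaultdict(Counter)
    for prev, nxt in zip(words, words[1:]):
        follow[prev][nxt] += 1
    return follow


def complete(follow, prompt, max_new_words=5):
    """Extend the prompt one word at a time, feeding each prediction back in."""
    words = prompt.lower().split()
    for _ in range(max_new_words):
        candidates = follow.get(words[-1])
        if not candidates:
            break  # the last word was never seen with a follower
        # Greedy choice: take the most frequent follower of the last word.
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)


# Hypothetical toy corpus, for illustration only.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram(corpus)
print(complete(model, "the cat", max_new_words=3))  # prints: the cat sat on the
```

The loop is what makes the model autoregressive: each predicted word becomes part of the input for the next prediction.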

Let's take GPT-3 as an example

GPT-3 succeeded GPT-2, which was released in 2019, and is its most advanced evolution. GPT-3 stands for Generative Pre-trained Transformer 3 and is a language model developed by OpenAI, an artificial intelligence company. It has been available to users via an API (application programming interface) since June 2020, initially as a private beta. It has 175 billion parameters, more than a hundred times the 1.5 billion parameters of its predecessor GPT-2.

Among the many applications of artificial intelligence, GPT-3 belongs to the category of language prediction models. In simple terms, it uses a "pre-trained" algorithm to generate text: the model was trained beforehand on a very large body of text. The text collected from the Internet to train this NLP model amounts to over 570 GB of data. The bulk of this collection comes from CommonCrawl (60%), WebText2 (22%), Books1 (8%), Books2 (8%) and Wikipedia (3%); the figures are rounded, which is why they add up to slightly more than 100%. OpenAI reportedly spent $4.6 million on the training.

The algorithm learned how languages are constructed using semantic analysis techniques. This method, also known in marketing, involves studying not just words, but their meanings and how their combinations affect the other words used in the text. This training falls into the category of "unsupervised" Machine Learning: the training data was not pre-labelled, i.e. no human told the model during training whether its answers were correct or incorrect. Instead, GPT-3 checks each prediction against the original text: the "correct choice" is the word that actually comes next. When a prediction matches, the "weights" of the process that produced it are reinforced, and when it does not, they are adjusted. In this way, the model gradually learns which process is most likely to provide the correct answers.

The result is a model that can write copy convincing enough to pass for human writing, to the point that some commentators have compared it to passing the Turing Test (a test of a machine's ability to show signs of human intelligence, which remains a common reference point for judging machine intelligence despite many criticisms over the years).
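
The idea that unlabelled text can supply its own "correct answers" can be sketched in a few lines of Python. This is a simplified illustration, not OpenAI's training code: it splits on whole words, whereas real models work on sub-word tokens, and the function name and the `context_size` parameter are assumptions chosen for this example.

```python
def next_word_examples(text, context_size=3):
    """Turn raw, unlabelled text into (context, target) training pairs.

    The target is simply the word that really comes next in the text,
    so no human labelling is needed.
    """
    words = text.split()
    examples = []
    for i in range(1, len(words)):
        context = words[max(0, i - context_size):i]  # the preceding words
        target = words[i]                            # the word to predict
        examples.append((context, target))
    return examples


# Hypothetical sample sentence, for illustration only.
for context, target in next_word_examples("AI can write product descriptions"):
    print(context, "->", target)
```

During training, a model would be shown each context, asked to predict the target, and have its weights adjusted according to how far off the prediction was.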

How have language prediction models revolutionised content generation?

Starting from a single sentence, they can complete a story and thus produce a whole text. They can write about any subject and can be directed to write in any voice, style or tone. For example, they can translate a text into a foreign language and adapt to different styles of writing, such as novels, poetry or articles. Some can answer questions, summarise long texts, take notes or write short texts. GPT-3 can even write computer code: in a demonstration video posted on YouTube, a plugin for Figma (a tool for designing applications and websites) uses GPT-3 to create an application very similar to Instagram. This advance could have a major impact on the future of software development.

Implications for search marketing

Content creation itself is set to be completely transformed by the widespread adoption of autoregressive language models. But other aspects of digital marketing will also be heavily affected. Almost all aspects of online brand and product marketing will be touched by language prediction models, across the entire landscape of content, SEO, advertising, etc.

Implications for search engines

Now that Google has announced that nearly 100% of US searches use BERT (a transformer-based model, like GPT-3), it seems almost obvious that the future of search will push this technology to its limits. The biggest potential impact could be a significant improvement in Google's ability to generate dynamic answers in its results, which could eventually replace website snippets. Furthermore, the transformer architecture is not exclusive to Google. Over the next few years, many companies will be able to create models that could rival the quality of BERT. If this happens, Google could lose its historical leadership position as a search engine.

Implications for search

If you combine this content production with a tool that helps you identify the keywords, word count, etc. you need to compete in the SERPs (search engine results pages), content generated by a language prediction model has a high chance of ranking.
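
As a rough idea of what such a tool checks, here is a minimal Python sketch that reports the word count of a draft and how often each target keyword appears. It is an illustrative assumption, not the method of any particular SEO product: the function name is hypothetical, and it only handles single-word keywords.

```python
import re
from collections import Counter


def content_stats(text, target_keywords):
    """Return the word count and per-keyword occurrence counts for a draft."""
    # Lower-case the text and keep runs of letters, digits and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter(words)
    return {
        "word_count": len(words),
        # Keywords are matched case-insensitively, as single words only.
        "keyword_counts": {kw: counts[kw.lower()] for kw in target_keywords},
    }


# Hypothetical draft and keywords, for illustration only.
draft = "Language models help write SEO content. Good SEO content still needs a human editor."
print(content_stats(draft, ["SEO", "content", "editor"]))
```

A draft produced by a language model could be run through a check like this before a human editor decides whether it meets the targets set for the page.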

As far as SEO is concerned, these models currently serve mainly as writing assistants that speed up content writing. Unfortunately, some applications of these predictive models could be detrimental to the search ecosystem. Some features could enable new forms of spam of all types, on a scale we have never seen before, such as:

Massive blogspam sites (the written content will not be useless, but it will be filled with inaccuracies if it is not reviewed by a human)
Astroturfing: disinformation that humans cannot detect as machine-generated
Large-scale manipulation of public opinion through machine-generated opinions that evolve over time and that humans cannot detect
Generation of fake news for link building or social media purposes

Implications for e-commerce

For example, GPT-3 can be used to create product descriptions for e-commerce websites. Creating comprehensive and attractive product descriptions is an important but time-consuming step in marketing. These models will be allies in greatly reducing the time and effort invested, allowing you to focus on research, conversions and other aspects of marketing. There is one negative point to note: it is conceivable that some marketers will mass-generate product reviews, and mixing these with real human reviews would erode the trustworthiness of reviews on the web. The e-commerce sector is already affected by this lack of trust, albeit to a lesser extent, so it seems important to look further into ways of overcoming this lack of authenticity before the problem gets worse.
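
In practice, generating a product description starts with a prompt that the model is asked to complete. The Python sketch below only builds such a prompt and deliberately makes no API call. The function name, its parameters and the sample product are hypothetical, and the wording of the prompt is one possible choice among many.

```python
def build_description_prompt(name, features, tone="friendly", max_words=80):
    """Assemble a text prompt asking a language model for a product description."""
    feature_lines = "\n".join(f"- {feature}" for feature in features)
    return (
        f"Write a {tone} product description of at most {max_words} words.\n"
        f"Product: {name}\n"
        f"Key features:\n{feature_lines}\n"
        "Description:"
    )


# Hypothetical product, for illustration only.
prompt = build_description_prompt(
    "Trail Backpack",
    ["30 L capacity", "waterproof fabric", "padded straps"],
)
print(prompt)
```

The prompt ends with "Description:" so that the model's most likely continuation is the description itself. The generated text should still be checked by a human against the real product before it is published.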

Can this impact SEO?

It is essential to bear in mind that an AI system will mimic the data it is trained on. SEO knowledge is built up alongside the progress of search engines, through blogs, books and interviews. So the AI will probably learn from all the available SEO content. But there are limits to AI-driven SEO:

Solving unknown or poorly documented problems.
AI cannot find solutions using data that is not part of its training.
Taking into account all existing material.
It is currently impossible to give an AI system all of a project's source code and documentation so that it can find an accurate answer that takes all existing information into account.
Successfully measuring the true quality of a piece of content or an idea.
AI systems do not yet work in real time and lack originality and creativity. As such, their ability to measure content quality is currently minimal. Humans are the only judges of content quality. AI systems can generally only detect spam and poor-quality content.

In summary, AI can outperform SEO novices, but it still has a long way to go before it can replace an SEO expert. The bottom line is that AI and humans together will remain the best combination, with humans focusing on the higher-value tasks.

What is the future of GPT-3?

OpenAI does not disclose all the details of how its algorithms work, so anyone looking for answers or developing content with GPT-3 or any other language prediction model should ask how the information it returns was obtained and whether these tools can really be trusted. The system shows promise, even if some details still need work. It is an interesting tool for short texts or simple applications, but the results it produces for more advanced tasks cannot be treated as reliable answers. Even so, I must admit that GPT-3, with all its limitations, achieved more than satisfactory results in a relatively short time. I have come to realise that GPT-3, like other language prediction models, will be a key catalyst for change in content marketing and search. It is important that practitioners stay informed and explore the implications of this technology for themselves, as they will soon encounter it in many different forms.

Sources:

Semrush: https://www.semrush.com/blog/the-biggest-threat-to-seo-isnt-human/
Search Engine Journal: https://www.searchenginejournal.com/seo-experiment-gpt-3/444988/
Yeeply: https://fr.yeeply.com/blog/gpt-3-revolutionnaire-de-lia/
Le Big Data: https://www.lebigdata.fr/openai-gpt-3-tout-savoir