I don't think it will be long before OpenAI also enters the race to develop a search engine based on artificial intelligence. The new web crawler GPTBot, tied to the GPT-5 large language model, has already been released.

Those who use ChatGPT know that this large language model (LLM) currently runs on GPT-3.5, which was trained on a data set last updated in September 2021. If you ask for information newer than that date, ChatGPT cannot provide accurate answers. This applies, of course, to the free version, which does not support auxiliary plugins.

With the launch of GPTBot, OpenAI has opened the way to indexing web pages through this new web crawler, as companies such as Google, Microsoft, Yahoo and many others have been doing for many years.

The new GPTBot web crawler identifies itself with the following user agent:

User agent token: GPTBot
Full user-agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0; +
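
As a rough illustration of how a site could recognize this crawler in its own logs or request handlers, here is a minimal Python sketch that looks for the GPTBot token in a User-Agent header. The function name is_gptbot and the sample strings are assumptions made for this example (the GPTBot sample is deliberately shortened), and since a User-Agent header can be spoofed, a match is only a hint, not proof that the request comes from OpenAI.

```python
import re

# Token taken from the article. Matching on the token rather than the full
# string keeps the check working if the version number changes.
GPTBOT_TOKEN = re.compile(r"\bGPTBot\b", re.IGNORECASE)


def is_gptbot(user_agent: str) -> bool:
    """Return True if the User-Agent header contains the GPTBot token."""
    return bool(GPTBOT_TOKEN.search(user_agent or ""))


# Shortened, illustrative sample strings (not the complete official string).
print(is_gptbot("Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.0)"))
print(is_gptbot("Mozilla/5.0 (X11; Linux x86_64) Firefox/117.0"))
```

The first call reports a match and the second does not.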

Website owners can control the indexing of their web pages through the robots.txt file, using the same directives as for the web crawlers of other companies.

For example, if the owner of a website does not want OpenAI to collect information from the site, they can add the following lines to robots.txt:

User-agent: GPTBot
Disallow: /
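
To check that a rule like the one above behaves as intended before publishing it, the robots.txt content can be tested locally. The sketch below uses Python's standard urllib.robotparser module. The helper name can_crawl and the example.com URL are placeholders chosen for this example, and how strictly any given crawler honors the file is up to that crawler.

```python
from urllib.robotparser import RobotFileParser

# The two directives from the article, as they would appear in robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Report whether robots_txt allows user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# example.com is a placeholder domain used only for illustration.
print(can_crawl(ROBOTS_TXT, "GPTBot", "https://example.com/article"))     # False
print(can_crawl(ROBOTS_TXT, "Googlebot", "https://example.com/article"))  # True
```

With these two lines in place, GPTBot is blocked from the whole site while crawlers that are not named in the file remain unaffected.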

Even though it behaves like any other web crawler, GPTBot has a distinct purpose: to harvest publicly available data while carefully avoiding sources that involve paywalls, the collection of personal data, or content that violates OpenAI's policies.

But there are quite a few controversies, some of which have even led to legal action against OpenAI over privacy and the use of content without the authors' consent or without identifying the sources.

In June, Japan's privacy regulator issued a warning to OpenAI regarding unauthorized data collection. Earlier this year, Italy also temporarily banned ChatGPT over alleged violations of European Union privacy laws.

