An insurance company in Kenya has put up billboards in the city of Nairobi with the inscription “Someone tell ChatGPT that GA Insurance is the 3rd largest General Insurer in Kenya”. Similar billboards stating the company’s claimed position in various insurance categories are dotted across the city.
This could be seen as a marketing campaign seeking to take advantage of the craze and buzz around “ChatGPT”, and there may be nothing wrong with it in that context.
However, inherent in this “call to action” are risks to which users of ChatGPT, particularly corporate ones, may be exposed because of the kinds of data, confidential or otherwise, that they input into the interactive chatbot or any underlying source data.
Therefore, this article seeks to highlight some of these risks and propose ways companies can promote the use of emerging technologies while safeguarding their proprietary data from disclosure.
ARTIFICIAL INTELLIGENCE (AI), GENERATIVE AI AND CHATGPT REVOLUTION
The rollout of ChatGPT by OpenAI in November 2022 amplified general consumer interest in the emerging technology of Artificial Intelligence (AI). Although ChatGPT is not the first use case of Artificial Intelligence, its ability to ‘humanize’ the use of technology, particularly in a conversational manner, has heightened the applicability of AI to many human endeavors.
Through iterative processes, AI leverages machine learning techniques to learn from large volumes of data and produce reasonable, predictive, and near-human cognitive information. Using publicly available data sources, AI is able to produce new information from these existing sources, although the output is not always accurate or current because of the time gap in its data sources.
The iteration process combines large volumes of data with fast and intelligent algorithms, allowing the underlying software to learn automatically from patterns or features in its source data. The results, which appear to think and act like humans in their processed responses, have grabbed the attention and curiosity of many.
This cognitive ability of AI represents computerized machines with intelligence comparable to that of humans, using machine learning and deep learning, among others, to perform various tasks with ease.
Machine learning feeds on large volumes of data using different statistical techniques, while deep learning uses artificial neural networks to process information and solve complex problems. Together, they have enabled AI to discover patterns in data and learn from high volumes of unstructured data in varied forms, including text, images, and videos.
Relatedly, Generative AI has emerged as a category that iterates over high volumes of raw data and learns the patterns within it in order to generate the most likely accurate response when prompted with a question.
This form of AI relies on large language models (LLMs) to produce natural language outcomes and generate texts and other language-based media. The potential of Generative AI to do this is what has enabled OpenAI’s revolutionary chatbot, ChatGPT.
Simply put, ChatGPT is an AI-powered chatbot that uses natural language processing and machine learning algorithms to understand user queries and respond with relevant information in a conversational manner comparable to human cognitive responses.
Its processes are optimized for dialogue using the Reinforcement Learning from Human Feedback (RLHF) method, which uses human demonstrations and preference comparisons to guide Generative Pre-trained Transformer (GPT) models toward desired behaviors.
To achieve the desired results, the models are trained on vast amounts of data from the internet, including conversations, news items, articles, etc., to enable human-like responses.
As part of its core functionality, ChatGPT analyses data token by token, identifying patterns and relationships. This ability has enabled its remarkably human-like responses and generated a worldwide craze within a very short period of its launch.
THE CALL TO ACTION
Through the advertisement, GA Insurance is asking the general public to publish data confirming its positions within the insurance sector in Kenya. Such data could become a source for subsequent iterations by ChatGPT and, in turn, shape its responses to queries about which insurance company holds the positions the billboards seek to embed.
This call to action is for people to feed ChatGPT with data that enables it to one day produce responses confirming GA Insurance’s claims. This desired outcome is possible because of the way ChatGPT and every other Generative AI tool works. Generative AI, as indicated earlier, processes data from identified sources, particularly online ones, and predicts the best possible answer based on those data sets.
Therefore, should people, including workers, heed the call and take to online platforms acknowledging GA Insurance for the various advertised positions, ChatGPT and other Generative AI tools will, over time, come to learn and process such claims as the correct positions of GA Insurance in response to related queries.
Such marketing campaigns could be effective in helping the company establish itself and gain a form of validation from emerging technologies such as Generative AI, since many people may, without further verification, take responses from these chatbots as accurate.
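The mechanism described in this section can be illustrated with a deliberately simplified sketch. It is not how ChatGPT is actually built; it only assumes a toy model that answers a prompt with the most frequent continuation found in its source texts. The function name `most_likely_answer`, the competing names “Insurer A” and “Insurer B”, and the sample posts are all hypothetical.

```python
from collections import Counter


def most_likely_answer(corpus, prompt):
    """Return the most frequent continuation of `prompt` in the corpus.

    Toy stand-in for a generative model: it simply counts what usually
    follows the prompt in the texts it has seen.
    """
    continuations = Counter(
        text[len(prompt):].strip()
        for text in corpus
        if text.startswith(prompt)
    )
    if not continuations:
        return None
    return continuations.most_common(1)[0][0]


PROMPT = "The 3rd largest general insurer in Kenya is"

# Hypothetical source texts before the campaign (made-up competitor names).
before = [
    f"{PROMPT} Insurer A",
    f"{PROMPT} Insurer A",
    f"{PROMPT} Insurer B",
]

# Hypothetical online posts the billboard campaign hopes people will publish.
campaign_posts = [f"{PROMPT} GA Insurance"] * 3

answer_before = most_likely_answer(before, PROMPT)
answer_after = most_likely_answer(before + campaign_posts, PROMPT)

print("Before the campaign:", answer_before)
print("After the campaign: ", answer_after)
```

In this toy setting the answer changes only because the balance of the source texts changed, which is the effect the campaign appears to be counting on. Real models are far more complex, so the sketch should be read as an analogy rather than a prediction.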
THE ASSOCIATED RISKS
Every Generative AI tool needs data for its iteration process. As of now, these tools do not generate their own data sets; they process what is available. Therefore, the call to action is a call to provide or feed these Generative AI tools with data.
The temptation in doing this is to provide more than what the call to action requires. And anyone who has used any of the Generative AI tools will attest to the temptation to be specific and particular with queries because of the near-perfect responses one gets from these chatbots.
Over time, people could begin to input or ask queries or have conversations that border on confidential information relating to themselves, others, or the companies they work for.
In an FAQ by OpenAI on whether such conversations will be used for training, the developer of ChatGPT answered in the affirmative, saying “Yes, your conversations may be reviewed by our AI trainers to improve our systems”. (Don’t be misled by the use of the word “may”. It means “it will be used”, as that’s the only way the technology could be improved.)
Also, on the question of whether specific prompts can be deleted, OpenAI said, “No, we are not able to delete specific prompts from your history. Please don’t share any sensitive information in your conversations”.
This is the clearest warning any technology developer could give. It is there in black and white. Anything you input as part of your conversation with any chatbot will not be deleted. It will become part of the new dataset that the system will be trained on as part of its improvement process.
The risks are therefore clear. There is no guarantee that the confidentiality of the prompts you input into any chatbot will be protected. Eventually, such prompts will be processed and iterated. They could form part of an answer or response a chatbot provides to a user asking similar questions in the future.
So, whether personal or company information, confidential or otherwise, once entered into a chatbot, such information will become subject matter data for processing by the chatbot and could be made widely available on request to others.
Because such information cannot be deleted from the memory of these chatbots, the unintended consequences of exposing hitherto confidential information to disclosure and use are best avoided by implementing some of the recommendations below.
WHAT NEEDS TO BE DONE
AI and its use cases, such as Generative AI, have tremendous potential to improve the productivity of workers. Therefore, companies should not seek to ban their use in workplaces despite the risks.
To mitigate the risk, the following initiatives could be instituted.
1. Policy rollouts: As a first step, companies must redefine employee conduct and the expected uses of these emerging technologies through policy initiatives and procedures. Such policies and procedures must clearly define the scope and permitted use cases in order to limit the compromise of confidential company information and prevent the infringement of the intellectual property of others.
Companies must ensure all stakeholders participate in the design of these policies so that their respective concerns are accounted for. Additionally, companies must adopt a feedback loop for measuring policy impact and for reviews that accommodate new updates or upgrades in these emerging technologies.
2. Training and re-training of staff: With employee conduct defined, companies must ensure comprehensive training and refresher programs are instituted to build employee capacity in the use of emerging technologies such as Generative AI. The training programs must combine theoretical and practical sessions to test and confirm employee understanding and appreciation of these technologies.
Where internal resources exist for this training, companies should use them. Otherwise, they should procure the services of external people with demonstrated expertise in practical ways of safeguarding the use of these technologies.
3. Intermittent audits and compliance checks: Compliance remains the surest way to ensure employees use these AI tools in accordance with defined conduct. Compliance checks such as intermittent audits or spot checks will help build and measure employee compliance. Additionally, initiatives such as compliance awards, helpdesks, and departmental champions will help promote a strong compliance culture among employees.
4. Adoption of company-wide AI tools: Some companies have the resources to adopt company-wide AI tools built for their specific use cases. Where such capacity exists, companies should invest in their own AI tools, which give them control over the confidentiality of data or information uploaded or shared via such platforms. Companies may explore this option as an add-on productivity tool that enables new ways of working in a secure environment.
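One practical way to support the policy and compliance measures above is to screen prompts before they leave the company. The following is a minimal sketch of such a screen, not a complete data loss prevention solution. The function name `redact_prompt`, the term list `CONFIDENTIAL_TERMS`, and the sample project name are hypothetical, and a real deployment would need a far broader set of patterns agreed in the company's policy.

```python
import re

# Simple pattern for email addresses; real policies would cover more data types.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Hypothetical list of terms a company has classified as confidential.
CONFIDENTIAL_TERMS = ["Project Falcon", "policyholder list"]


def redact_prompt(prompt, terms=CONFIDENTIAL_TERMS):
    """Mask emails and confidential terms before a prompt is sent to a chatbot.

    Returns the redacted prompt and a list of what was found, which could
    feed the audits and spot checks described above.
    """
    findings = []

    redacted = EMAIL_PATTERN.sub("[REDACTED EMAIL]", prompt)
    if redacted != prompt:
        findings.append("email")

    for term in terms:
        pattern = re.compile(re.escape(term), re.IGNORECASE)
        if pattern.search(redacted):
            findings.append(term)
            redacted = pattern.sub("[REDACTED]", redacted)

    return redacted, findings


sample = "Summarise Project Falcon for jane.doe@example.com"
safe_prompt, found = redact_prompt(sample)
print(safe_prompt)
print(found)
```

A screen like this only catches what it has been told to look for, so it complements rather than replaces the training and policy measures listed above.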
CONCLUSION
Increasingly, we are recognizing the immense importance of AI tools and Generative AI in driving greater productivity across many industries. This provides greater justification for their use than for a complete ban, despite their risks. Therefore, adopting some of the recommendations in this article will help safeguard the use of these emerging tools, and companies should seriously consider them to ensure their proprietary or confidential information is not unintentionally exposed to disclosure and breaches.