Glossary

AI Bias

The phenomenon where artificial intelligence (AI) systems exhibit biased behaviour, often due to biased training data or algorithms, leading to unfair outcomes or decisions.

Algorithm

A set of rules or instructions given to an AI, machine learning (ML), or computer program to help it learn and make decisions.

Artificial Intelligence (AI)

The simulation of human intelligence in machines programmed to think and learn like humans.

Black Box AI

AI systems whose internal workings are not transparent or understandable, making it difficult to explain how they arrive at decisions or outputs.

ChatGPT

ChatGPT is a conversational chatbot developed by OpenAI, based on the large language models GPT-3.5 and GPT-4, which enables users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. ChatGPT can assist with specific tasks such as writing different kinds of creative text, including poems, code, scripts, emails, and letters. The freely available version is based on GPT-3.5 (not connected to the internet and with a training-data cut-off of January 2022) and has limited features; the paid version, based on GPT-4, offers more advanced capabilities.

Context Window

The context window in ChatGPT refers to the amount of text (in tokens) that the model can consider when generating responses. A token is a unit of text, which can be as short as one character or as long as one word. The context window determines how much of the conversation history, including the current input and previous interactions, the model can "remember" and use to generate coherent and relevant responses. GPT-4o currently has a context window of 128k tokens, which, according to OpenAI, translates to approximately 85,000 to 96,000 words. Understanding the context window is crucial for effectively managing long interactions and ensuring that important details are preserved for as long as needed within the conversation.
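
To make this concrete, here is a minimal sketch of how a developer might count tokens and trim conversation history to fit a fixed token budget. It assumes the third-party tiktoken library; the cl100k_base encoding and the example budget are stand-ins, not official ChatGPT settings.

```python
# Illustrative sketch: trimming conversation history to fit a token budget.
# Assumes the third-party `tiktoken` library (pip install tiktoken); the
# cl100k_base encoding and the budget figure are stand-ins, not official limits.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    """Return the number of tokens the encoding splits `text` into."""
    return len(enc.encode(text))

def trim_history(messages: list[str], budget: int = 128_000) -> list[str]:
    """Keep the most recent messages whose combined token count fits the budget."""
    kept, total = [], 0
    for message in reversed(messages):        # walk from newest to oldest
        tokens = count_tokens(message)
        if total + tokens > budget:
            break
        kept.append(message)
        total += tokens
    return list(reversed(kept))               # restore chronological order

history = ["You are a helpful assistant.", "Summarise my notes.", "Here are the notes..."]
print(trim_history(history, budget=50))
```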

Copilot (formerly called Bing Chat)

An AI conversational chatbot tool powered by GPT-4, OpenAI's LLM, integrated into the Microsoft Edge browser and free to use. It is connected to the internet but limited to five responses in a single conversation, and each prompt can contain up to 2,000 characters. As of February 2024, it supports basic forms of multimodal input: you can upload images alongside text prompts for tasks like creative writing, code generation, and multimodal search, and generate summaries of videos.

Enterprise Data Protection

Enterprise data protection refers to the strategies, processes, and technologies used by organizations to safeguard their critical data from unauthorized access, corruption, theft, or loss. This involves a comprehensive approach to securing data across all platforms and environments, including on-premises servers, cloud services, and mobile devices. The goal is to protect sensitive information such as customer data, financial records, intellectual property, and employee details against cyber threats like malware, ransomware, phishing, and data breaches.

Key components of enterprise data protection include encryption, access control, data backup and recovery, data loss prevention (DLP), network security measures, and regular security audits. These measures are designed to ensure data integrity, confidentiality, and availability, complying with legal and regulatory requirements. Effective data protection strategies are essential for maintaining trust, supporting business continuity, and mitigating the financial and reputational risks associated with data breaches.

Gemini (formerly known as Google Bard)

An AI conversational chatbot tool from Google, powered by the Ultra 1.0 model. Connected to the internet and free to use, Gemini is a multimodal model, capable of processing input in the form of text, code, images, videos, and audio, and, when prompted, can generate new content across these same formats.

Generative AI Model

A generative AI model is a type of artificial intelligence system designed to create new content, data, or information that resembles the training material it has been fed. Unlike discriminative models, which are used to classify or predict outcomes based on input data, generative models can generate text, images, music, speech, and other forms of media by learning from vast amounts of existing data. These models understand patterns, structures, and relationships within the data, enabling them to produce outputs that are often indistinguishable from those created by humans. Generative AI models include technologies like GPT (Generative Pre-trained Transformer) for text, and DALL·E for images, which have applications in various fields such as entertainment, design, education, and more, offering innovative ways to automate and enhance creative processes.
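
As a hedged illustration, the sketch below generates a short continuation of a prompt with a small open generative model (GPT-2) via the Hugging Face transformers library. It assumes transformers and torch are installed, and the prompt text is invented.

```python
# Illustrative generation sketch with a small open model (GPT-2) via Hugging Face transformers.
# Assumes `transformers` and `torch` are installed; the model downloads on first use.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Generative models learn patterns in data and",
                   max_new_tokens=30, num_return_sequences=1)
print(result[0]["generated_text"])   # a newly generated continuation of the prompt
```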

Generative AI Tool

A generative AI tool is a software application or platform powered by generative artificial intelligence models. These tools are designed to automatically produce new content, insights, or data that mimic or are inspired by their training data. They can generate text, images, videos, music, code, and other forms of digital output. Generative AI tools are used across a broad range of applications, including content creation, design, music composition, coding, and problem-solving. They leverage the capabilities of AI models like GPT for text generation and DALL·E for image creation, enabling users to generate novel content by inputting specific prompts or parameters.

GPTs (custom versions of ChatGPT)

GPTs are a new way for anyone to create a tailored version of ChatGPT to be more helpful in their daily life, at specific tasks, at work, or at home—and then share that creation with others. For example, GPTs can help you learn the rules to any board game, help teach your kids math, or design stickers.

Anyone can easily build their own GPT—no coding is required. You can make them for yourself, just for your company’s internal use, or for everyone. Creating one is as easy as starting a conversation, giving it instructions and extra knowledge, and picking what it can do, like searching the web, making images or analyzing data. Try it out at chat.openai.com/create.

Example GPTs are available today for ChatGPT Plus and Enterprise users to try out, including Canva and Zapier AI Actions.

Generative Pre-trained Transformer (GPT)

GPTs (Generative Pre-trained Transformers) in ChatGPT refer to a series of AI models developed by OpenAI that power the conversational abilities of ChatGPT. These models are based on the Transformer architecture, which is designed for understanding and generating human-like text through deep learning techniques. Here's a breakdown of how GPTs function within ChatGPT:

1. Pre-training: GPT models undergo a pre-training phase on a vast corpus of text data. During this phase, they learn a wide range of language patterns, structures, and information by processing and analyzing the text. This extensive training helps the model grasp grammar, context, facts, and the nuances of natural language.

2. Fine-tuning: After pre-training, GPT models can be fine-tuned with specific datasets or objectives to enhance their performance in particular tasks, such as conversational responses, providing more accurate and contextually relevant outputs.

3. Generative Capabilities: The "generative" aspect of GPTs allows ChatGPT to produce text that is not simply a selection from pre-existing responses but is generated anew for each query. This means ChatGPT can create coherent, contextually appropriate responses to a wide range of prompts, making the conversation flow naturally.

4. Transformer Architecture: The core technology behind GPTs, the Transformer architecture, excels at handling sequences of data (like sentences), making it highly effective for language tasks. It employs mechanisms such as attention and self-attention to weigh the importance of different words in a sentence, enabling the model to generate responses that are both relevant and coherent (a minimal code illustration of self-attention follows this entry).

5. Iterations: Over time, GPT models have evolved through iterations (e.g., GPT-2, GPT-3, GPT-4), with each version offering improvements in understanding, generating text, and handling more complex conversations. These advancements come from larger training datasets, more sophisticated training techniques, and architectural improvements.

In ChatGPT, GPT models serve as the foundation for understanding user inputs, generating human-like responses, and supporting various conversational tasks, from answering questions to creative storytelling, making them a versatile tool for natural language processing applications.
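
To illustrate the self-attention mechanism mentioned in point 4, here is a minimal single-head sketch in NumPy. The toy embeddings and random weight matrices are invented for demonstration; real GPT models use learned weights, many attention heads, masking, and far larger dimensions.

```python
# Minimal single-head self-attention sketch (illustrative only).
# Real Transformers use learned weight matrices, multiple heads and masking.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (sequence_length, d_model) token embeddings."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv          # queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # how strongly each token attends to the others
    weights = softmax(scores, axis=-1)        # attention weights sum to 1 for each token
    return weights @ V                        # weighted mix of value vectors

rng = np.random.default_rng(0)
d_model = 8
X = rng.normal(size=(4, d_model))             # 4 toy "tokens"
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)    # -> (4, 8)
```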

Hallucination

In artificial intelligence, hallucination refers to the generation of incorrect or nonsensical information that is not based on the input data or reality.

Knowledge Cut Off

The point in time up to which a language model has been trained, beyond which it does not have information.

Large Language Model

A Large Language Model (LLM) is a type of artificial intelligence system designed to understand, generate, and interact using human language. It is "large" because it is trained on vast amounts of text data, enabling it to grasp a wide range of language patterns, nuances, and contexts. LLMs can perform a variety of tasks, from writing essays to answering questions and translating languages, by predicting the likelihood of a sequence of words. These models are central to many AI applications, offering insights and assistance by simulating human-like understanding of language. In summary, Large Language Models are large-scale, pre-trained statistical models based on neural networks.
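
As an illustration of "predicting the likelihood of a sequence of words", the sketch below asks a small open model (GPT-2, loaded via the Hugging Face transformers library) for its most probable next tokens. It assumes transformers and torch are installed; the prompt is invented and the model downloads on first use.

```python
# Illustrative sketch: next-token prediction with a small open model (GPT-2).
# Assumes the `transformers` and `torch` packages; the prompt is an example only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits            # scores for every token in the vocabulary

probs = torch.softmax(logits[0, -1], dim=-1)   # probabilities for the *next* token
top = torch.topk(probs, k=5)
for p, idx in zip(top.values, top.indices):
    print(f"{tokenizer.decode(idx):>10}  {p:.3f}")
```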

Machine Learning (ML)

A subset of artificial intelligence that involves the development of algorithms that can learn and make predictions or decisions based on data.

Multimodal

In the context of artificial intelligence, multimodal refers to systems that can process and analyse different types of data, such as text, images, audio, and computer code. This allows them to make more accurate and informed decisions compared to systems that rely on just one type of data.

Natural Language Processing (NLP)

A field of AI that focuses on the interaction between computers and humans through natural language.

Neural Network

A series of algorithms that mimic the operations of a human brain to recognize relationships in a set of data.
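
A minimal sketch of the idea: a tiny two-layer network, with made-up random weights, turning an input vector into output scores using NumPy.

```python
# Tiny two-layer neural-network forward pass (weights are random, for illustration only).
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)   # layer 1: 3 inputs -> 4 hidden units
W2, b2 = rng.normal(size=(4, 2)), np.zeros(2)   # layer 2: 4 hidden units -> 2 outputs

def forward(x):
    hidden = np.maximum(0, x @ W1 + b1)          # ReLU activation
    return hidden @ W2 + b2

print(forward(np.array([1.0, 0.5, -0.2])))       # two output scores
```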

OpenAI

An AI research and deployment company that aims to ensure that ‘artificial general intelligence (AGI) benefits all of humanity.’

Parameters

Parameters in artificial intelligence are the aspects of the model that are learned from the training data, determining the model's behaviour.

Prompt

In the context of artificial intelligence, especially language models, a prompt is an input given to the model to elicit a response or output.

Prompt Engineering

The practice of designing and refining prompts to effectively communicate with AI models and achieve desired responses.
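
As a hedged illustration of both terms, the sketch below sends a structured prompt through the OpenAI Python SDK's chat interface. It assumes the openai package (v1) is installed, an API key is available in the environment, and that the example model name is accessible to your account; the wording of the messages is invented.

```python
# Illustrative sketch: sending a structured prompt via the OpenAI Python SDK (v1).
# Assumes `pip install openai`, the OPENAI_API_KEY environment variable, and
# access to the named model; the model name and message wording are examples only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        # A "system" message sets the role and constraints (prompt engineering).
        {"role": "system", "content": "You are a concise study assistant for university students."},
        # The "user" message is the prompt itself.
        {"role": "user", "content": "Explain the Turing Test in two sentences."},
    ],
)

print(response.choices[0].message.content)
```

Typical prompt-engineering refinements include adjusting the system message, adding worked examples, or specifying the required output format.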

Reinforcement Learning

A type of machine learning where an algorithm learns to behave in an environment by performing actions and seeing the results.
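
A toy sketch of this trial-and-error loop: a tabular Q-learning agent learns to walk right along a five-cell corridor to reach a reward in the final cell. The environment, reward, and hyperparameters are all invented for illustration.

```python
# Toy Q-learning sketch: an agent learns to walk right along a 5-cell corridor
# to reach a reward in the last cell. Environment and hyperparameters are illustrative.
import random

N_STATES, ACTIONS = 5, (-1, +1)               # actions: move left or right
alpha, gamma, epsilon, episodes = 0.5, 0.9, 0.1, 500
Q = [[0.0, 0.0] for _ in range(N_STATES)]     # Q[state][action_index]

for _ in range(episodes):
    state = 0
    while state != N_STATES - 1:              # episode ends at the goal cell
        a = random.randrange(2) if random.random() < epsilon else max(range(2), key=lambda i: Q[state][i])
        next_state = min(max(state + ACTIONS[a], 0), N_STATES - 1)
        reward = 1.0 if next_state == N_STATES - 1 else 0.0
        # Q-learning update: nudge the estimate towards reward + discounted best future value
        Q[state][a] += alpha * (reward + gamma * max(Q[next_state]) - Q[state][a])
        state = next_state

print([max(range(2), key=lambda i: Q[s][i]) for s in range(N_STATES - 1)])  # learned actions (1 = right)
```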

Reinforcement Learning from Human Feedback (RLHF)

A type of machine learning where models are trained and refined based on feedback from human interactions, improving their performance and alignment with human values and expectations.

Supervised Learning

A type of machine learning where models are trained on labelled data (data is tagged with the correct answer).
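
A minimal illustration of learning from labelled data, using scikit-learn (assumed to be installed) and a tiny invented dataset in which each example is a pair of numbers tagged 0 or 1.

```python
# Minimal supervised-learning sketch with scikit-learn (illustrative toy data).
from sklearn.tree import DecisionTreeClassifier

# Features (inputs) and their labels (the "correct answers").
X = [[1, 1], [2, 1], [8, 9], [9, 8]]
y = [0, 0, 1, 1]

model = DecisionTreeClassifier().fit(X, y)     # learn a mapping from features to labels
print(model.predict([[1, 2], [9, 9]]))         # -> [0 1] for these unseen examples
```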

Tokens

In the context of ChatGPT and similar language models, "tokens" refer to the pieces of text that the model processes. These can be words, parts of words (like prefixes or suffixes), punctuation marks, or even spaces, depending on the tokenization process used. Tokenization is the first step in processing natural language text, where the input text is split into these manageable units or tokens.

Language models like ChatGPT are trained on massive datasets of text, learning the relationships between tokens to understand and generate human-like text. The number of tokens a model can process in a single prompt or response is often limited due to computational constraints. For example, GPT-3 has a maximum token limit for each input and output sequence, which affects how much text can be processed or generated at one time.

Tokens are crucial for understanding the structure and meaning of language, as they allow the model to analyze text at a granular level. The effectiveness of a language model in understanding and generating coherent, contextually appropriate responses depends significantly on its tokenization process and its ability to manage these tokens efficiently.
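
To see tokenization in practice, the short sketch below splits a sentence into tokens using the third-party tiktoken library; the encoding name is an assumption, as different models use different tokenizers.

```python
# Illustrative tokenization sketch using the `tiktoken` library (pip install tiktoken).
# The cl100k_base encoding is used here as an example; models differ in their tokenizers.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
text = "Tokenization splits text into manageable units."
token_ids = enc.encode(text)

print(len(token_ids), "tokens")
# Decode each id individually to see how the sentence was split up.
print([enc.decode([t]) for t in token_ids])
```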

Transformer

A type of neural network architecture used in deep learning models, particularly effective for understanding the context in language tasks.

Turing Test

A test of a machine's ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human.

Unsupervised Learning

A type of machine learning where models are trained on data without labels, discovering hidden patterns in the data.
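
A minimal illustration of discovering structure in unlabelled data, using scikit-learn's k-means clustering (assumed installed) on a tiny invented dataset.

```python
# Minimal unsupervised-learning sketch: k-means clustering on unlabelled toy data.
from sklearn.cluster import KMeans

# No labels are provided; the algorithm groups the points itself.
X = [[1, 1], [1, 2], [2, 1], [8, 8], [8, 9], [9, 8]]

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)    # e.g. [0 0 0 1 1 1], two discovered groups
```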

License

Are You AI Ready? Investigating AI Tools in Higher Education - Student Guide Copyright © 2024 by SATLE 'Are You AI Ready?' Project Team, University College Dublin is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.