You could be forgiven for being overwhelmed by the flurry of AI announcements over the last few weeks. Us too. This short summary is an attempt to get a handle on current landscape. Please accept it as a work in progress…comments and corrections very welcome…I think it will be updated regularly in the coming days.
Definitions
A few definitions just to get us going:
- Generative AI is a broad category of AI models that are designed to generate new content, such as images, music, or text.
- Large Language Model (LLM) a generative AI model that focuses on generating text. They are trained on large amounts of text data and can predict the next word or sentence based on the input text.
- Natural Language Processing (NLP). Algorithms and models that enable computers to understand, interpret, and generate human language in a way that is useful and meaningful to humans.
OpenAI
OpenAI has of course been front and centre over the past few weeks and so too, by association, Microsoft. The two companies appear to have a symbiotic relationship:
- Microsoft were an early and now dominant investor
- Current and planned integration of OpenAI models into their products (see below) including ‘exclusive’ elements.
- OpenAI has reliance on Azure infrastructure.
Founded in 2015 by usual suspects such as Elon Musk and Peter Thiel, Open AI originally had a ‘do no evil’ vibe. It seems they have now thought better of it but of course as Bob Dylan says, you never ask questions when God is on your side, so long as God has signed an NDA. Microsoft similarly seem to be experimenting with an ‘ethics-lite’ approach.
Chat GPT
Chat GPT-4 was announced this week and is ‘one louder’ than GPT-3. You can follow the latest on the following ways:
- Documentation.
- Blog (openai.com)
- Brief overview on YouTube.
- Community on discord
- OpenAI (@OpenAI) / Twitter
- (10) OpenAI: Overview | LinkedIn
You can dive deeper with their Developer Livestream. There is also a 99 page GPT-4 Technical Report.
Access
There are 3 ways you can interact with directly with ChatGPT (not including accessing it through another tool like Bing search)
- The main chat UI: New chat (openai.com). This is currently free but I believe the free site does not grant access to GPT4. Back in February they introduced a paid service GPT Plus which is $20/month.
- The playground which currently seems only to access GPT-3. I have noticed that the UI of the playground differs from the one in the GPT-4 demo but I am assuming that will update when I get access to the GPT-4 API.
- You can create an API key and interact with the models programmatically. Currently GPT-3 but you can join a waiting list for GPT-4. It also says you can jump the queue by contributing to evals.
Availability of the various interfaces has been very patchy as they scale to the demand. They are currently throttling the GPT-4 UI to 25 messages every 3 hours and it is likely this will fall further.
GPT4 Features
Documentation of the model can be found here.
Capacity
The quantity of text you can put in and get out of ChatGPT is measured in ‘tokens’. Words are made of tokens and the ratio varies with the complexity of the language you are using. The initial capacity of c2k ‘tokens’ was restrictive as a decent length newspaper article that you might input for summarisation. This has now been increased to the much more workable 8k with some flavours going up to c32k and hopefully more in future. 32k seems to us to allow the brute forcing of large blocks of text for summarisation and other tasks.
Multi-Modal?
The announcement said that GPT-4 is multi-modal. However at present image capability seems to be restricted to only one partner, a charity that helps those with vision problems.
Plug Ins
Just announced: ChatGPT plugins (openai.com)
Other Open AI models
As Chat GPT constantly reminds you, it is an LLM. However OpenAI do offer other types of Generative AI:
- DALL·E is a AI system that can create realistic images and art from a description in natural language.
- Whisper is a general-purpose speech recognition model.
- Embeddings are a numerical representation of text that can be used to measure the relatedness between two pieces of text.
- The Codex models are descendants of our GPT-3 models that can understand and generate code
- Moderation is a fine-tuned model that can detect whether text may be sensitive or unsafe
OpenAI/Chat GPT Rivals
I don’t have ETA on any of these:
- Anthropic, funded by Google, have announced a waiting list for ‘Claude’.
- Rumours of Apple doing something with Siri.
- I am not really a Facebook user, but I see this from Meta.
Search
Search may be the major strategic play in this first phase. Despite all of their moonshot investments and code reds, Google seem at first pass to have been caught asleep at the wheel, and since they have done little to diversify their revenue they could be squeezed between Bing on one side and Tik Tok ads on the other:
- Google have announced adding Bard to search.
- Microsoft have added ChatGPT to Bing.
- And now Bing Image Creator: Image Creator from Microsoft Bing
ERM/CRM
- Microsoft are adding Copilot AI to Dynamics 365. Sign up for trials and waiting list.
- Salesforce announced adding Einstein AI to their CRM tools.
Productivity
- Microsoft have announced that they plan to add Copilot AI to Microsoft (Office) 365 which we think is a big deal as it will put in in front of a large proportion of business technology users in their day to day work. I couldn’t see an ETA but they link to Enterprise Connect on 27th March
- Google have announced they are going to add AI to Workspace. I couldn’t see an ETA.
Pro/Low Code Platform
- AI has long been a small part of the Power Platform especially Power BI augmented analytics. Recently we have been experimenting with calling the Chat GPT API through Power Automate. However Microsoft are adding Copilot to Power Apps in a US-based preview initially.
- GitHub has had Copilot for some time but have now upgraded: Introducing GitHub Copilot X
- I am not very familiar with the Google equivalent but I guess this is the announcement. I couldn’t see an ETA but I imagine more news on March 29th Google Data Cloud and AI Summit.
- A huge range of features will doubtless follow, for example: Zappier
NLP
We see the more established NLP tools being used in tandem with Generative AI. The more deterministic results of the former models can be used to pre-process input for the more creative and non-deterministic features of the latter. We have been exploring:
- NLTK and particularly this book.
- Hugging Face
- spaCy
Resources
We are just working out a way of putting our mouths to the fire hose of daily news, hints, tips and warnings of impending Armageddon. When we do we will share, but in the meantime here are a few aggregators you could look at.
- The best daily update on these topics is Bens Bites. Ben has an amazing database of 4k+ resources and AI widgets.
- Prompt Engineering daily gives advice on getting the best from AI.
- We know Kaggle as an source for datasets, but they have added a models repository
- Similarly, Techmeme is a general news aggregator but since the hot trend has turned away from crypto winter to AI spring, the Techmeme River is a rich source, as is the daily Techmeme Ride Home podcast from Brian McCullough
Ethics
Joking apart, we do want to do a considered piece on the ethical challenges but that might be next weekend’s project. For the moment, though, suffice it to say we have been 100% focused on exploring how these tools can deliver practical benefit in a range of very real day to day business problems, rather than trying to get it to threaten to kill anyone.
Standswell
Here at Standswell, we are responding to the ‘Cambrian Explosion of Generative AI capabilities in three ways:
- We have always had a deep interest in the foundational issues of metacognition to better understand how this thinking can support our overall Application Life Cycle approach from value proposition, requirements, visualisation, analysis and interactivity. We now think that far from being a useful set of background knowledge, this area is fundamental to our entire operation. If we don’t know how humans think, learn and make decisions then it will be difficult to respond to technology which seeks to augment those skills.
- We are turning that thinking into concrete knowledge engineering processes, utilizing our great experience in sourcing structured and unstructured data from inside and outside the organisation, analysing and visualising it and now using generative AI capabilities and ontology structures to add order of magnitude improvements to speed and value.
- We are now starting to embed these features in the existing platforms to enhance the existing automation, analysis and visualisation capabilities to bring value to day to day business processes.
If you want to discuss any of these issues then please don’t hesitate to get in touch!
