A monthly recap of news, product announcements, beta leaks, and other interesting Generative AI action that caught our attention. Subscribe to our newsletter on LinkedIn, or join our mailing list to receive this recap straight in your inbox.

Welcome to May’s recap of news, product announcements, and other interesting GenAI pieces for your reading pleasure. It’s been a busy month of AI news, hence the longer-than-usual issue.

  • It’s only fitting to open this month’s recap with a massive piece of news: the announcement of OpenAI’s new flagship model, GPT-4o. GPT-4o is a momentous step in improving human-computer interaction, with faster response times and the ability to communicate in more formats. “She” interacts, reacts, even laughs! It accepts any mix of text, audio, image, and video as input and generates text, audio, and image outputs as needed.
  • TII (Technology Innovation Institute) announced the launch of Falcon 2, its new AI model series, which comes in two versions. TII claims Falcon 2 is the only model at its level with vision-to-language capabilities this sophisticated, able to convert visual input into text quickly and seamlessly. Multilingual and multimodal, the new series surpasses its predecessors, and because it is open-source, it is freely available to use under its license terms.
  • Yi 1.5, a new and improved version of Yi that comes in 3 model sizes, delivers stronger performance in key areas such as coding, math, reasoning, and instruction-following, while maintaining its prior capabilities at a high level. Check out the tutorial on how to leverage this new open-source model.
  • The FineWeb dataset is billed as “15 trillion tokens of the finest the web has to offer,” consisting of cleaned and deduplicated English web data from CommonCrawl. See the Hugging Face piece to read more about how models trained on FineWeb outperform those trained on other open web datasets, including RefinedWeb.
  • Google recently released AI Overviews in Search, which became a viral sensation, sadly for all the reasons you hope to avoid with a new feature. Social media was humming with the inaccurate, odd, and even dangerous responses people received from the new AI Overviews, including recommendations to eat rocks, leaving Google’s reputation, for the moment, between said rock and a hard place… The company has responded that the majority of the answers are high-quality and relevant, and that it is working to disable problematic searches. In the interim, don’t use glue to get pizza toppings to stick…
  • Elon Musk plans to build an enormous supercomputer, a “gigafactory of compute,” to help power the Grok AI chatbot. The plans have spurred rumors of a collaboration between Microsoft and OpenAI on an even larger project to rival it. We’re all for healthy competition: it’s what makes the world go round and the runners in the race surge forward. In this case, the finish line keeps moving, and we all enjoy watching the progress.
  • Speaking of Elon Musk, he and Yann LeCun (Chief AI Scientist at Meta) engaged in a very public debate on what qualifies as science. This is more than friendly sparring, as the two are tied to rival platforms and hold differing views, which they shared openly. LeCun, known for his pivotal work in neural networks and deep learning, is openly critical of what he sees as Musk’s problematic, erratic behavior on social media and of the conspiracy theories intermixed with his supposed pursuit of truth. Musk hit back by questioning what science LeCun had done in recent years. LeCun responded by citing the many technical papers he has recently published and calling Musk out on his lack of comparable publications, escalating the exchange into a discussion of the definition of science that has the internet in an uproar.
  • Yuan 2.0-M32, a new version of Yuan, was trained from scratch on 2000B tokens. It uses a mixture-of-experts architecture with 32 experts, of which only 2 are active at a time to keep computation low, and a new router network, the Attention Router, for more accurate expert selection. The aim is to combine efficiency, accuracy, and scalability, and the authors note that it even outperformed Llama 3-70B on some benchmarks.
  • Thumbs up to this new technology 👍 Dani Clode, who works in Professor Tamar Makin’s lab at the University of Cambridge, has developed the Third Thumb. This extra robotic thumb extends the wearer’s range of movement and expands holding and carrying capacity, especially for tasks that are usually too challenging to complete with only one hand. We don’t know about you, but we’re eager to “get a hold of” this, which will certainly bring new meaning to “all hands”.
  • Sentence Transformers v3.0.0 brings a serious modernization of the library’s training approach, which now relies on 5 new components. Sentence Transformers is a framework that generates vector representations of sentences, making it easier to compare them and measure their similarity. For more information on the major refactor and other changes, and on how to install it, check out the release notes on GitHub.
  • Chameleon, a family of early-fusion token-based mixed-modal models, can understand and generate images and text in any arbitrary sequence and in any mix. We’re all for multitasking and combined abilities, so this unified model is another accomplishment we have our eye on. Why blend in when you can stand out? Yeah, we see you, Chameleon. 👀🤩
  • A new Salesforce study found that “nearly two-thirds of C-suite executives say trust in AI drives revenue, competitiveness, and customer success,” with factors such as accurate and secure data boosting that trust, though many employees are lagging behind in this newest chapter of the love affair with AI. The bottom line seems to be similar to that of many other areas in life: what you get is only as good as what you put in.
  • We often hear that companies are democratizing something, but Higgsfield AI has made clear advances in bringing stories to life, enabling anyone and everyone to blow their former visual content out of the water. Higgsfield is training a foundational video model that transforms text into incredibly realistic, dynamic video, including adept mimicry of human characters and motion. Have a selfie? That’s all you need to use the Diffuse app, with customization and personalization that will have you wondering if you have a digital clone. AI meets content creation for social media. Don’t say you didn’t know…

It’s been a fruitful month in the world of GenAI! Stay tuned for our next edition, and don’t forget to sign up to make sure you stay in the know. Thirsty for more content? Check out our blog, or explore previous editions of our roundup.