Latest News

  1. Start using ChatGPT instantly without having to sign up : “There are many benefits to creating an account including the ability to save and review your chat history, share chats, and unlock additional features like voice conversations and custom instructions. For anyone that has been curious about AI’s potential but didn’t want to go through the steps to set-up an account, start using ChatGPT today.” [Source]

  2. Princeton SWE-agent gets 12.29% on SWE-bench (Devin got 13.84%) : SWE-agent is our new system for autonomously solving issues in GitHub repos + it’s open source [Tweet]

  3. OpenAI’s Sora just made its first music video and it’s like a psychedelic trip : The ambient track coupled with the footage results in a uniquely ethereal experience. It’s half pleasant and half unsettling. [Youtube]

  4. Cohere releases Command R+ : a 104B-parameter, RAG-optimized competitor to Claude 3 Sonnet. [Tweet]

  5. OpenAI expands its custom model training program : OpenAI’s Custom Model program now offers assisted fine-tuning and custom-trained models, catering to enterprise needs for tailored generative AI solutions. [Source]

  6. Lambda Announces $500M GPU-Backed Facility to Expand Cloud for AI : Lambda, led by COO Mitesh Agrawal, introduces a novel financing method for deploying NVIDIA GPUs, removing barriers for AI startups. The asset-based structure leverages GPU cash flows, enabling on-demand cloud access without lengthy contracts. With over 100,000 sign-ups, Lambda Cloud supports NVIDIA GPU deployments for training and inferencing generative AI models. [Source]

  7. Groq lands function calling : Tool Use/Function Calling (beta) for Groq API is now available! This highly anticipated feature allows models available on GroqCloud to take user-defined functions as inputs and generate structured output to invoke them from external tools/codebases (a minimal Groq sketch follows this list). [Source]

  8. Anthropic adds function calling support : Tool use is now available in beta to all customers in the Anthropic Messages API, enabling Claude to interact with external tools using structured outputs (a minimal Anthropic sketch also follows this list). [Source]
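
For item 7: Groq's API is OpenAI-compatible, so tool use follows the familiar `tools` / `tool_calls` request shape. The sketch below uses the Groq Python SDK; the `get_stock_price` tool, the prompt, and the model name are illustrative assumptions, not part of the announcement.

```python
# Minimal sketch of Groq tool use (beta). Assumes GROQ_API_KEY is set and the
# groq Python package is installed; get_stock_price is a hypothetical tool.
import json
from groq import Groq

client = Groq()

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Look up the latest price for a ticker symbol.",
            "parameters": {
                "type": "object",
                "properties": {"ticker": {"type": "string"}},
                "required": ["ticker"],
            },
        },
    }
]

resp = client.chat.completions.create(
    model="mixtral-8x7b-32768",  # any tool-capable model hosted on GroqCloud
    messages=[{"role": "user", "content": "What is NVDA trading at right now?"}],
    tools=tools,
    tool_choice="auto",
)

# If the model decides to call the tool, it returns structured tool_calls
# instead of plain text; your code executes the function and sends the result back.
for call in resp.choices[0].message.tool_calls or []:
    print(call.function.name, json.loads(call.function.arguments))
```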
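For item 8: Anthropic exposes tools directly on the Messages API — you pass a JSON Schema per tool and Claude replies with a `tool_use` content block when it wants to call one. The sketch below uses the current Anthropic Python SDK surface (the April 2024 beta lived behind a beta header); the `get_weather` tool and the prompt are hypothetical.

```python
# Minimal sketch of Anthropic tool use via the Messages API. Assumes
# ANTHROPIC_API_KEY is set; get_weather is a hypothetical tool definition.
import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a given city.",
        "input_schema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

response = client.messages.create(
    model="claude-3-opus-20240229",
    max_tokens=1024,
    tools=tools,
    messages=[{"role": "user", "content": "What's the weather in Paris today?"}],
)

# When Claude chooses to call a tool, the response contains a tool_use block
# with the tool name and structured input; you run the tool and return a
# tool_result message to continue the conversation.
for block in response.content:
    if block.type == "tool_use":
        print(block.name, block.input)
```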


Articles

  1. Open vs. Closed-Source LLMs

  2. The Perfect Prompt: Prompt Engineering Cheat Sheet

  3. The ABC of LLMs: A Beginner’s Guide

  4. RAG Implementations Are Becoming More Agent-Like

  5. Mixtral AI-generated comment to US regulators about Open Models

  6. Introducing Buddhi: Open-Source Chat Model with a 128K Context Window


Papers and Repositories

  1. Google presents Noise-Aware Training of Layout-Aware Language Models [Paper]

  2. DeepMind - Mixture-of-Depths paper : This work introduces a method where transformers dynamically allocate compute resources across input sequences, optimizing allocation for different layers. By capping token participation in computations at each layer, it achieves predictable yet context-sensitive compute expenditure, resulting in efficient models matching baseline performance with fewer FLOPs and faster training (a simplified sketch follows this list). [Paper]

  3. Jamba whitepaper is out! : The whitepaper details our in-depth ablations on this novel hybrid SSM-Transformer architecture, and how we chose to interleave Mamba, Transformer and MoE. [Paper]

  4. Show HN: I built a tool to crowdsource and share LLM experiments [Github]

  5. Octopus v2: On-device language model for super agent : It introduces a method that lets an on-device model with 2 billion parameters outperform GPT-4 in accuracy and latency while significantly reducing the required context length. Compared to other approaches, it improves latency by 35x, making it suitable for real-world applications on edge devices. [Paper]

  6. Representation Finetuning (ReFT): A Powerful, Parameter-Efficient, and Interpretable way of fine-tuning [Github]
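
The core idea in the Mixture-of-Depths item above is easy to sketch: a per-layer router scores the tokens, only the top-k of them pass through the block, and the rest ride the residual stream unchanged. The PyTorch snippet below is my own simplified reading of the paper, not DeepMind's code; the capacity fraction, the sigmoid gating, and the `MoDLayer` wrapper are illustrative choices.

```python
# Simplified sketch of Mixture-of-Depths-style routing (illustrative, not the
# paper's implementation): each layer processes only its top-k tokens.
import torch
import torch.nn as nn


class MoDLayer(nn.Module):
    def __init__(self, block: nn.Module, d_model: int, capacity: float = 0.125):
        super().__init__()
        self.block = block        # any (B, k, D) -> (B, k, D) transformer block
        self.router = nn.Linear(d_model, 1)
        self.capacity = capacity  # fraction of tokens that get compute this layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, D = x.shape
        k = max(1, int(self.capacity * T))
        scores = self.router(x).squeeze(-1)         # (B, T) per-token routing scores
        top = scores.topk(k, dim=-1).indices        # indices of tokens to process
        idx = top.unsqueeze(-1).expand(-1, -1, D)   # (B, k, D) gather/scatter index
        selected = torch.gather(x, 1, idx)          # tokens routed into the block
        processed = self.block(selected)            # heavy compute on k tokens only
        gate = torch.gather(scores, 1, top).unsqueeze(-1).sigmoid()
        out = x.clone()                             # unselected tokens pass through
        out.scatter_(1, idx, selected + gate * processed)  # gated residual update
        return out
```

The paper explores capping only some layers this way (e.g., every other one), so a token skipped at one layer can still be selected at the next, which is what keeps the compute budget predictable but context-sensitive.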


Thank you for reading!

If you have any comments or feedback, please leave a comment. You can find me on [Linkedin].

Find the Medium post here: https://shresthakamal.medium.com/llm-news-and-articles-weekly-digest-april-8-2024-466fe73f6233