Artificial Intelligence

Neuromorphic chip architecture points to faster, more energy-efficient AI: IBM NorthPole

This paper explains that there is a strong need for energy-efficient AI computers. It describes a chip with a neural-inspired architecture, which IBM calls NorthPole, that achieves substantially higher performance, energy efficiency, and area efficiency than comparable architectures.

Inspired by the organic brain and optimized for inorganic silicon, NorthPole is a neural inference architecture that blurs the boundary between compute and memory by eliminating off-chip memory, intertwining compute with memory on-chip, and appearing externally as an active memory chip. NorthPole is a low-precision, massively parallel, densely interconnected, energy-efficient, and spatial computing architecture with a co-optimized, high-utilization programming model.

The paper compares NorthPole with dozens of other AI chips, including chips from Intel, Nvidia, Google, Qualcomm, Amazon, Applied Brain Research, and Baidu. Some advanced AI chips use neuromorphic architectures. Moving and shuffling data between memory and processor takes a lot of energy. In neuromorphic architectures, the memory elements are intertwined with the processing elements at a very fine scale. This decentralized memory model, along with high data parallelism, is a key factor in greater energy efficiency. NorthPole is reported to be about 5 times faster and more energy efficient than the Nvidia H100. For more details, check out this YouTube video from Anastasi In Tech.

Tensors

Data may be organized in a multidimensional array that is referred to as a "data tensor"; however, in the strict mathematical sense, a tensor is a multilinear mapping over a set of domain vector spaces to a range vector space. Observations, such as images, movies, volumes, sounds, and relationships among words and concepts, stored in a data tensor array may be analyzed by artificial neural network tensor methods. Computations involve matrix representations of linear transformations, calculating the null space and range of linear transformations, and the rank of linear transformations. This linear algebra article shows the math in a simple example. Here is an image processing example:
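As a minimal sketch of these ideas (not taken from the linked article), the following pure-Python example stores a tiny RGB image as a height x width x channel data tensor and applies a grayscale conversion, which is a linear transformation from R^3 to R^1 at each pixel. The pixel values, the luminance weights (the commonly used BT.601 values), and the helper names `to_grayscale` and `matrix_rank` are assumptions chosen for illustration.

```python
from fractions import Fraction

# A 2 x 2 RGB image stored as an order-3 data tensor: image[row][col][channel].
# The pixel values are made up for illustration.
image = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]

# Grayscale conversion written as a 1 x 3 matrix (assumed BT.601 weights).
GRAY = [[Fraction(299, 1000), Fraction(587, 1000), Fraction(114, 1000)]]


def to_grayscale(img, weights):
    """Apply the linear map `weights` to the channel vector of every pixel."""
    row_w = weights[0]
    return [
        [sum(w * c for w, c in zip(row_w, pixel)) for pixel in row]
        for row in img
    ]


def matrix_rank(matrix):
    """Rank of a matrix by Gaussian elimination over exact fractions."""
    rows = [[Fraction(x) for x in r] for r in matrix]
    rank = 0
    n_cols = len(rows[0]) if rows else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Eliminate this column from every other row.
        for r in range(len(rows)):
            if r != rank and rows[r][col] != 0:
                factor = rows[r][col] / rows[rank][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


gray = to_grayscale(image, GRAY)
rank = matrix_rank(GRAY)
nullity = len(GRAY[0]) - rank  # rank-nullity: dim(domain) = rank + nullity

print([[float(v) for v in row] for row in gray])
print("rank:", rank, "null space dimension:", nullity)
```

Because the conversion matrix has a single nonzero row, its rank is 1 and its null space has dimension 2, which is one way to see that many different colors collapse to the same gray value.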

Since 2020, OpenAI has developed its generative artificial intelligence technologies on a massive supercomputer constructed by Microsoft, one of its largest backers, that uses 10,000 of Nvidia's graphics processing units (GPUs). An effort to develop its own AI chips would put OpenAI among a small group of large tech players, such as Google and Amazon.com, that have sought to take control of designing the chips that are fundamental to their businesses.

It is not clear whether OpenAI will move ahead with a plan to build a custom chip. An acquisition of a chip company could speed the process of building OpenAI’s own chip, as it did for Amazon.com with its acquisition of Annapurna Labs in 2015.

Building an AI-Friendly Culture

Practice active listening
Emphasize the importance of your people
Share the vision

AI and the Organizational Structure

AI will automate operational tasks
Report generation and project tracking can be automated with AI
Flatter, team-based structures will drive innovation

Lessons in AI Implementation

Walmart implemented AI across the business, from inventory management to customer service
IKEA uses AI for routine customer inquiries
Bank of America uses AI to monitor transactions
- Start with the problem
- Avoid layering new tech onto old processes
- Involve end users early and often
Genpact

AI and Your Workforce

AI and employee retention
- Provides real-time feedback
- Tracks rewards and recognition
- Fosters work-life balance
- Automates routine tasks
AI and employee development
- Learning and Development Teams
- Personalized learning
- Recommend resources
- Automate evaluations
- Pair mentors
AI and performance management
- Analyze multiple data sources
- Personalize performance plans
- Highlight individual contribution
- Uncover new metrics
- Streamline roles and responsibilities
AI and team effectiveness
- AI augments our ability to work together and boosts communication
  - Microsoft Teams
- Create transparency and accountability
  - monday.com
- Balance workloads
  - Trello
- Create positive teams
  - Humu

AI and Business Value

AI and business differentiation
- Focus on efficiency
- Broaden AI tools
- Work smarter
  - otter.ai
  - Zapier
Mitigate AI risks
- Data classification framework
- Use acronyms and mnemonics
- Use diverse datasets
- Implement data governance
- Proactive training
Ethics and Fairness
- Ensure transparency
  - Test regularly
  - Update models
  - Audit
  - Align with goals
- Prioritize privacy
- The right talent
  - legal
  - regulatory
  - ethical

Leadership in the age of AI

Just finished the course “AI Challenges and Opportunities for Leadership” by Conor Grennan!

Gen AI - Building AI agents

Tuesday, May 21 - Thursday, June 13, 2024

Generative AI: From Prototype to Monetization

It’s been a little over a year since Generative AI took the technology sector (and the whole world) by storm. Just in that time, we’ve already seen enormous improvements in the capabilities of large models as well as a burgeoning ecosystem of products built around them.

This session will provide a refreshed overview from our Q1 class on generative AI, covering the following topics:

- The Google model ecosystem, led by Gemini, our largest and most capable AI model, including:
  - What’s new in Gemini 1.5
  - How to use larger context windows (up to 1 million tokens) effectively
  - Multimodal use cases for businesses
  - Function Calling in Gemini to connect to external systems
  - More efficient training and serving, while increasing model performance
  - Better understanding and reasoning across modalities
- Overview of AI on Google Cloud, including 1P and 3P foundation models: We will provide an overview of the AI services offered by Google Cloud, including foundation models from Google and third-party providers, such as:
  - Big partners and OSS projects in this space that our users are finding value with: LangChain, LlamaIndex, Chroma DB, Milvus, etc., and how to use them with Gemini
  - Open and Partner models, including Llama, Claude 3, and more
- Limitations of generative AI: We will discuss the main limitations of generative AI, such as bias, frozen training data, and safety, and how we’re overcoming some of these limitations with larger context windows and function calling.
- Choosing the right model and approach: We’ll discuss best practices for using generative models, in terms of model selection and how to determine when to use off-the-shelf models vs. fine-tuned models.
- Protection with generative AI services: We’ll discuss our industry-first approach to indemnity for potential legal risks related to copyright infringement claims for both the training data used by Google and the outputs generated by customers.

Products covered:

Cloud AI Generative AI Portfolio, Vertex AI, Model Garden, Gemini, LangChain, LlamaIndex

Deep dive into the Gen AI ecosystem

In this session, we’ll take a closer look at all of the core components and design patterns in the Gen AI ecosystem that you’ll use in later sessions to build a generative AI app, focusing on a customer service chat app.

What you’ll build: Chatbot for customer service that is grounded in your content and connects to your CRM, inventory, and support systems.
Gen AI stack for developers: Model, tools & functions, orchestration, and deployment
Backend and frontend: Backend development in Python and other SDKs vs. application development with GenKit
Gen AI tools and patterns: Prompt tuning in Gemini, Retrieval Augmented Generation (RAG)
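To make the Retrieval Augmented Generation (RAG) pattern named above concrete, here is a minimal offline sketch, assuming a handful of hypothetical support snippets in place of real content. It uses word-count cosine similarity as a stand-in for the embeddings and vector database a production system would typically use, and it stops at building the grounded prompt rather than calling Gemini. The names `DOCUMENTS`, `retrieve`, and `build_prompt` are illustrative, not part of any Google SDK.

```python
import math
import re
from collections import Counter

# Hypothetical support snippets standing in for "your content".
DOCUMENTS = [
    "Orders can be returned within 30 days with a receipt.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by chat from 9am to 5pm on weekdays.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Assemble a grounded prompt: retrieved context first, then the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


prompt = build_prompt("How long does shipping take?", DOCUMENTS)
print(prompt)
```

In a real application the resulting prompt would be sent to the model, so the answer is grounded in retrieved text rather than only in the model's frozen training data.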

Products covered:

Gemini, Vertex AI Studio, Vertex AI Agent Builder, Document AI, Firebase, Firebase Extensions, GenKit

Building out code pipelines for your Gen AI customer service app

In this hands-on session, we’ll start to build out a code pipeline and backend for a customer service application powered by generative AI, specifically leveraging the capabilities of the Gemini API in Vertex AI.

We’ll cover effective prompts to get accurate and relevant responses from the Gemini model and dive into prompt tuning techniques to optimize the model's performance for your specific customer service use case. And we’ll discuss techniques and patterns for how to incorporate various data types like images, PDFs, audio, and video into your customer service application.

We’ll cover:

Getting started with the Gemini model and API in Vertex AI
Prompt design and prompt tuning
Generation configuration settings
Overview of different modalities in Gemini
Building an initial RAG implementation grounded in your data
Working with multimodal data: images, PDFs, audio, video
Best practices for accelerating prototyping and development
Prototyping a user interface with Python web frameworks

By the end of this session, you will have a comprehensive understanding of how to build robust code pipelines with Gen AI, and you’ll have an initial version of your customer service application that you’ll continue to add functionality to in later sessions.
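As a rough illustration of what generation configuration settings control, the sketch below shows how two commonly exposed settings, temperature and top-k, reshape the probabilities a model samples its next token from. It is a simplified model of the idea, not the Gemini implementation; the token scores and the function name `next_token_distribution` are hypothetical.

```python
import math


def next_token_distribution(logits, temperature=1.0, top_k=None):
    """Turn raw token scores into sampling probabilities.

    Lower temperature sharpens the distribution; top_k keeps only the
    k highest-scoring tokens before normalizing. This sketch requires a
    positive temperature, which is a simplification.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    items = sorted(logits.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        items = items[:top_k]
    scaled = [(tok, score / temperature) for tok, score in items]
    peak = max(s for _, s in scaled)  # subtract the max for numerical stability
    weights = [(tok, math.exp(s - peak)) for tok, s in scaled]
    total = sum(w for _, w in weights)
    return {tok: w / total for tok, w in weights}


# Hypothetical scores for the next word in a customer service reply.
logits = {"refund": 2.0, "return": 1.5, "banana": -1.0}

default = next_token_distribution(logits)
sharp = next_token_distribution(logits, temperature=0.5)
trimmed = next_token_distribution(logits, top_k=2)

for name, dist in [("default", default), ("temperature=0.5", sharp), ("top_k=2", trimmed)]:
    print(name, {tok: round(p, 3) for tok, p in dist.items()})
```

For a customer service use case, a lower temperature and a small top-k generally trade variety for more predictable responses.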

Products covered:

Gemini, Vertex AI Studio, Vertex AI Agent Builder

Defining user journeys for your Gen AI customer service app

Powerful Gen AI models need great user interfaces and user experience in order to reach users. Luckily, building a Gen AI conversational experience has never been easier than with Firebase and the Gemini extension, which makes it simple for developers to build Gen AI capabilities into their applications.

In this hands-on session, we’ll provide a brief overview of Firebase and get you started building out an interface for your customer service chatbot powered by Gemini.

Designing the right user experience for your chatbot
Review key conversation design principles
Stub out user experience
Define user flows
Implement a frontend layer (GenKit, Firebase)
Start testing inputs/outputs/eval

By the end of this session, you’ll have a user interface and user journeys defined for your Gen-AI-powered customer service application.

Products covered:

Firebase, Firebase Extensions, Firestore, GenKit

Deep dive into connecting Gen AI models to the real world and agent workflows

In this hands-on session, we’ll provide an overview of the features in the Gemini model and Vertex AI that you can use to build agents to retrieve information in real-time or take action via API calls.

We'll dive into practical use cases like using natural language to interact with SQL databases, automating complex workflows, and enhancing your chatbots with real-time data. You'll be equipped to connect LLMs to any API or system and extend the capabilities of what LLMs can do.

Overview of agents and relevant use cases
Explore connections to your databases, CRM, support system, and other external systems
Different ways to implement grounding
Gemini Function Calling to connect to external APIs
Building RAGs on vector DBs, APIs, and YouTube
Deep dive on tools and function calling
Deep dive on orchestration (LangChain)
Talk through patterns for connecting to external systems

By the end of this session, your Gen AI customer service app will be able to retrieve information from external information sources so that your customers have the latest information and personalized content.
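The function calling pattern described in this session can be sketched without any SDK, assuming a hypothetical order lookup tool and a rule-based stand-in for the model. The point is the shape of the loop: the model returns a structured request (a tool name plus arguments), the application runs the matching function, and the result flows back into the reply. `fake_model`, `TOOLS`, `ORDERS`, and `get_order_status` are illustrative names and do not correspond to the Gemini API.

```python
import json

# Hypothetical in-memory "order system" standing in for a real CRM or API.
ORDERS = {"A100": "shipped", "B200": "processing"}


def get_order_status(order_id):
    """Tool: look up the status of an order."""
    return {"order_id": order_id, "status": ORDERS.get(order_id, "unknown")}


# Registry mapping tool names to Python callables.
TOOLS = {"get_order_status": get_order_status}


def fake_model(user_message):
    """Stand-in for an LLM that emits a structured function call.

    A real model would choose the tool and arguments from declared tool
    schemas; here a simple rule picks them so the sketch runs offline.
    """
    for order_id in ORDERS:
        if order_id in user_message:
            return {"name": "get_order_status", "args": {"order_id": order_id}}
    return None


def run_turn(user_message):
    """One agent turn: ask the model, run the requested tool, build a reply."""
    call = fake_model(user_message)
    if call is None:
        return "Sorry, I could not find an order number in your message."
    tool = TOOLS.get(call["name"])
    if tool is None:
        return f"Unknown tool requested: {call['name']}"
    result = tool(**call["args"])
    # In a full loop the result would go back to the model to phrase the
    # final answer; here it is formatted directly.
    return f"Order {result['order_id']} is {result['status']}. (raw: {json.dumps(result)})"


print(run_turn("Where is my order A100?"))
```

Keeping tools in a registry keyed by name means the application, not the model, decides which functions can actually be executed.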

Products covered:

Vertex AI, Gemini, Gemini Function Calling, BQ, GCS, Drive, Sheets, Connectors, LangChain on Vertex

Deep dive into quality, evaluations, responsible AI, and safety

This session dives into the fundamental aspects of building responsible and reliable AI applications with the Vertex AI Gemini API. We’ll explore the built-in safety features of Gemini in Vertex AI, including content filtering and safety ratings, and how you can tailor the output of the Gemini API to your specific use case and business needs.

We’ll also dive deeper into evaluating the quality and effectiveness of your AI models. Understanding how to assess model performance is crucial for ensuring they meet your requirements and expectations. We'll explore:

Improving response quality and accuracy
Evaluation metrics and measures of success
Deep dive on user experiences beyond chatbots
Deep dive on evaluation, responsible AI, and safety

By the end of this session, you’ll understand how to follow responsible AI practices when building your Gen AI customer service app. And you’ll learn various evaluation metrics and techniques, empowering you to identify areas for improvement and optimize your models for better results.
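As one possible starting point for evaluation metrics, the sketch below scores model responses against expected answers using exact match and token-level F1. These are generic text-overlap measures rather than the specific metrics offered by Vertex AI, and the evaluation pairs and function names are hypothetical.

```python
from collections import Counter

PUNCTUATION = ".,!?;:"


def normalize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip(PUNCTUATION) for t in text.lower().split())
    return [t for t in tokens if t]


def exact_match(prediction, reference):
    """True when both texts have identical normalized tokens."""
    return normalize(prediction) == normalize(reference)


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall (multiset overlap)."""
    pred, ref = normalize(prediction), normalize(reference)
    if not pred or not ref:
        return float(pred == ref)
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


# Hypothetical evaluation set: (model response, expected answer).
EVAL_SET = [
    ("Shipping takes 3 to 5 business days.", "Shipping takes 3 to 5 business days."),
    ("You can return it within 30 days.", "Returns are accepted within 30 days."),
]

scores = [token_f1(p, r) for p, r in EVAL_SET]
em_rate = sum(exact_match(p, r) for p, r in EVAL_SET) / len(EVAL_SET)

print("exact match rate:", em_rate)
print("mean token F1:", sum(scores) / len(scores))
```

Overlap metrics like these are cheap to run on every change, but they miss paraphrases, so they are usually combined with human review or model-based grading.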

Products covered:

Vertex AI, Gemini API, Vertex AI Agent Builder

Bringing it all together with orchestration, productionization, and deployment

Building a custom generative AI application requires bringing together several key components: a generative model (Gemini), tools to interact with external data and APIs (Function Calling), an orchestration framework to manage the interaction between these elements, and a robust deployment platform.

Once your application is built, deploying it to a managed runtime is crucial for productionization and scaling. Reasoning Engine (LangChain on Vertex AI) enables you to deploy your application with a focus on security, privacy, observability, and scalability. This removes the burden of managing infrastructure and allows you to focus on developing and refining your application.

Walk through options for evaluation and monitoring
Discuss productionization and deployment options
Deploy your application to Vertex AI's Reasoning Engine for a secure, scalable, and managed runtime environment
Patterns and approaches for custom orchestration
Common pitfalls and challenges of orchestration

By the end of this session, you’ll have a solid understanding of the different layers in your Gen AI customer service app, and you’ll deploy your app while learning about common production patterns.

Products covered:

Vertex AI Gemini API, Cloud Run, Cloud Functions, GKE, Vertex AI Reasoning Engine, LangChain on Vertex