- author: All About AI
OpenAI Dev Day: Testing the New APIs
Yesterday, OpenAI hosted their Dev Day and unveiled several new features and APIs that got me really excited. Today, I had the opportunity to test three of them: the GPT-4 Vision API, the DALL·E 3 API, and the Text-to-Speech API. I also experimented with the new 128k context window in GPT-4 Turbo. In this article, I will walk you through my experience with each of these features.
GPT-4 Turbo with 128k Context Window
One of the highlights of OpenAI's Dev Day was the introduction of GPT-4 Turbo with a 128k context window. This means we can now input up to 300 pages of text in a single prompt. The best part is that it's not only faster but also more cost-effective. Imagine being able to analyze vast amounts of text with just one API call. This feature has been long awaited, and it's finally here.
- GPT-4 Turbo with a 128k context window allows inputting up to 300 pages of text.
- It is faster and more affordable than its predecessors.
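To put the 128k window in perspective, here is a quick back-of-the-envelope sanity check. The 4-characters-per-token ratio is a rough rule of thumb for English text, not an exact tokenizer count:

```python
# Back-of-the-envelope check that a document fits in the 128k-token window.
# The ~4 characters per token ratio is a rough approximation for English.

CONTEXT_WINDOW = 128_000  # tokens

def approx_tokens(text: str) -> int:
    """Approximate token count: about 4 characters per token."""
    return len(text) // 4

def fits_in_context(text: str, reserve: int = 4_000) -> bool:
    """Keep `reserve` tokens free for the model's reply."""
    return approx_tokens(text) <= CONTEXT_WINDOW - reserve

# ~300 pages at ~1,600 characters per page comes out around 120k tokens
document = "x" * (300 * 1_600)
print(approx_tokens(document), fits_in_context(document))  # prints: 120000 True
```

By this estimate, 300 pages of typical prose sits just under the limit once a few thousand tokens are reserved for the response.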
OpenAI also made updates to function calling and instruction following, and the Chat Completions API now supports a JSON mode that constrains the model to produce valid JSON, which makes the output much easier to parse downstream. These updates will be further explored in a dedicated video.
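For reference, JSON mode is enabled through the response_format parameter on Chat Completions. Below is a minimal sketch using the openai Python SDK (v1 style); the model name, prompt, and JSON key names are illustrative assumptions:

```python
import json

def json_mode_request(text: str) -> dict:
    """Build Chat Completions kwargs that force a valid-JSON reply."""
    return {
        "model": "gpt-4-1106-preview",
        "response_format": {"type": "json_object"},  # the JSON mode switch
        "messages": [
            {"role": "system",
             "content": "Reply in JSON with keys 'summary' and 'sentiment'."},
            {"role": "user", "content": text},
        ],
    }

def summarize_as_json(text: str) -> dict:
    # Imported here so the request builder above works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(**json_mode_request(text))
    return json.loads(response.choices[0].message.content)
```

Note that the system message should still describe the JSON shape you want; JSON mode only guarantees well-formed JSON, not any particular schema.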
The Exciting New Features and APIs
During the Dev Day, OpenAI unleashed a range of exciting features and APIs that caught my attention. Let's delve into some of the most significant ones:
1. GPT-4 Turbo with Vision API
One of the most exciting features for me personally was the combination of GPT-4 Turbo with the Vision API. This integration unlocks a whole new realm of possibilities. With just a simple prompt and an image URL, we can harness the power of AI to provide detailed descriptions or analyze images. The pricing for this API depends on the size of the image input, starting at a reasonable $0.00765 per image at 1080x1080 resolution.
- GPT-4 Turbo with Vision API enables detailed descriptions and image analysis.
- Pricing is based on the size of the image input.
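That per-image figure follows from the token formula OpenAI published for vision input at launch: 85 base tokens plus 170 tokens per 512-pixel tile in high-detail mode, billed at GPT-4 Turbo's input rate. A sketch of the calculation:

```python
import math

def vision_tokens(width: int, height: int, detail: str = "high") -> int:
    """Token cost of one image input, per the formula OpenAI published at
    launch: low detail is a flat 85 tokens; high detail adds 170 tokens
    per 512px tile after rescaling."""
    if detail == "low":
        return 85
    # High detail: fit within 2048x2048, then scale the shortest side to 768.
    scale = min(1.0, 2048 / max(width, height))
    width, height = width * scale, height * scale
    shrink = 768 / min(width, height)
    if shrink < 1.0:
        width, height = width * shrink, height * shrink
    tiles = math.ceil(width / 512) * math.ceil(height / 512)
    return 85 + 170 * tiles

def vision_cost_usd(width: int, height: int,
                    price_per_1k_tokens: float = 0.01) -> float:
    """Dollar cost at GPT-4 Turbo's launch input rate of $0.01 per 1k tokens."""
    return vision_tokens(width, height) * price_per_1k_tokens / 1_000

print(round(vision_cost_usd(1080, 1080), 5))  # prints: 0.00765
```

For a 1080x1080 image, the shortest side is scaled down to 768, giving a 2x2 grid of tiles and 765 tokens, which works out to $0.00765 at $0.01 per 1,000 input tokens.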
2. DALL·E 3 API
Another noteworthy addition is the DALL·E 3 API. It opens up endless creative possibilities by allowing users to generate, modify, and retrieve images. With this API, we can create diverse image outputs by specifying different models, prompts, and sizes. The simplicity of the setup, combined with affordable pricing of just 4 cents per image generated, makes it an attractive choice.
- The DALL·E 3 API provides the ability to generate, modify, and retrieve images.
- Pricing is 4 cents per image generated.
3. Text-to-Speech (TTS) API
OpenAI has also introduced the Text-to-Speech (TTS) API, which allows us to convert text into speech with ease. With a starting price of $0.015 per 1,000 characters, this API offers several voice models to choose from. While the current selection may not include the most natural-sounding voices, they are still decent for basic usage. OpenAI has hinted at the possibility of adding new voices in the future.
- TTS API converts text into speech.
- Pricing starts at $0.015 per 1,000 characters.
- Currently available voice models may not be the most natural-sounding.
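Because billing is per input character, cost estimation is trivial. The sketch below pairs an estimator with a generation call using the openai Python SDK (v1 style); the tts-1 model, the alloy voice, and the launch price are the Dev Day values and may have changed since:

```python
TTS_PRICE_PER_1K_CHARS = 0.015  # USD, tts-1 launch pricing

def tts_cost_usd(text: str) -> float:
    """Estimated cost: TTS is billed per input character."""
    return len(text) / 1_000 * TTS_PRICE_PER_1K_CHARS

def speak(text: str, out_path: str = "speech.mp3") -> None:
    """Synthesize `text` and save it as an MP3 file."""
    # Imported here so the cost estimator works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.audio.speech.create(model="tts-1", voice="alloy", input=text)
    response.stream_to_file(out_path)
```

At this rate, a 1,000-character paragraph costs about a cent and a half to voice.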
Testing the APIs
Now, let's dive into my firsthand experience of testing these exciting new APIs.
GPT-4 Turbo with Vision API
To start off, I decided to test GPT-4 Turbo with the Vision API. Setting up this API was incredibly straightforward. Simply by passing the API key, the desired model, and an image URL, we can obtain detailed descriptions and information about the image. The response time was impressively quick, and the cost of $0.00765 per image at 1080x1080 resolution felt quite reasonable.
To demonstrate the usage, I created a simple function called analyze_image. The code snippet showcased the ease of use and displayed the pricing based on different image sizes. I decided to analyze an image of the front page of The Wall Street Journal from 2008. The analysis was swift and accurate, providing information about the headline and key financial details.
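The original analyze_image code isn't reproduced here, so the following is a minimal reconstruction under the same assumptions (openai v1 SDK, the vision-capable model name from launch); treat it as a sketch rather than the author's exact code:

```python
def vision_request(image_url: str, question: str) -> dict:
    """Chat Completions kwargs mixing a text prompt with an image URL."""
    return {
        "model": "gpt-4-vision-preview",
        "max_tokens": 300,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }

def analyze_image(image_url: str,
                  question: str = "Describe this image in detail.") -> str:
    # Imported here so the request builder works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(**vision_request(image_url, question))
    return response.choices[0].message.content
```

The key detail is that the message content is a list mixing a text part and an image_url part, rather than a plain string.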
DALL·E 3 API
Moving on, I explored the capabilities of the DALL·E 3 API. This API offers an array of possibilities by allowing us to generate and modify content. The setup was as simple as the previous API. With just the API key, the desired model, prompt, size, and output format, we can generate unique content. During my testing, I used a prompt about a '90s hacker setup and saved the resulting image in my designated directory. The generated image was exactly what I had envisioned, and the ease of use left me excited to further explore its potential.
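A hedged sketch of that generation-and-save flow (openai v1 SDK; the prompt and the filename helper are illustrative, not the author's code):

```python
import re
import urllib.request

def output_filename(prompt: str, ext: str = "png") -> str:
    """Derive a filesystem-safe filename from the prompt."""
    slug = re.sub(r"[^a-z0-9]+", "-", prompt.lower()).strip("-")[:40]
    return f"{slug}.{ext}"

def generate_image(prompt: str, size: str = "1024x1024") -> str:
    """Generate one DALL·E 3 image and save it locally; returns the saved path."""
    # Imported here so output_filename works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    result = client.images.generate(model="dall-e-3", prompt=prompt, size=size, n=1)
    path = output_filename(prompt)
    urllib.request.urlretrieve(result.data[0].url, path)
    return path

# generate_image("a 1990s hacker setup with CRT monitors and tangled cables")
```

The API returns a temporary URL rather than image bytes, so the download step is what actually lands the file in your directory.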
Text-to-Speech (TTS) API
Lastly, I tested the Text-to-Speech (TTS) API. With this API, we can easily convert text into speech. The setup, similar to the previous APIs, involves selecting the desired model, voice, input, and response format. Although there is room for improvement in terms of voice quality, the overall functionality was satisfactory. For my test, I input a simple phrase and downloaded the resulting speech as an MP3 file. The audio output matched my input text accurately.
Conclusion
OpenAI's Dev Day was a promising event that introduced various new features and APIs. I had the opportunity to test the GPT-4 Vision API, the DALL·E 3 API, and the Text-to-Speech API. The experience was seamless and exciting, showcasing the vast potential of AI-powered applications. While there is room for improvement, the new APIs provide us with powerful tools to create innovative solutions. In subsequent videos and articles, I will explore further possibilities with these APIs and share my findings. So, stay tuned for more exciting content!
Analyzing Earnings Call Transcripts with OpenAI's GPT-4
As data scientists and finance experts, we are always on the lookout for tools that can simplify our analysis process. OpenAI's GPT-4 has emerged as a powerful language model that shows promise in generating summaries and reports. In this article, we will explore how GPT-4 can be used to analyze earnings call transcripts and provide detailed finance summaries for users.
Collecting and Preparing the Data
To begin our analysis, we collected transcripts from two earnings calls: Tesla Q3 and Meta Q3. These transcripts were combined into a single text file, resulting in a sizable document of approximately 34 pages, or roughly 100,000 characters.
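The preparation step amounts to concatenating the transcript files into one document. A minimal sketch (the filenames and the characters-per-page figure are assumptions, the latter used only for a rough page estimate):

```python
from pathlib import Path

CHARS_PER_PAGE = 3_000  # rough figure for a dense transcript page (an assumption)

def combine_transcripts(paths, out_path):
    """Concatenate transcript files into one document; returns (chars, pages)."""
    text = "\n\n".join(Path(p).read_text(encoding="utf-8") for p in paths)
    Path(out_path).write_text(text, encoding="utf-8")
    return len(text), round(len(text) / CHARS_PER_PAGE)

# combine_transcripts(["tesla_q3.txt", "meta_q3.txt"], "earnings_calls.txt")
```

At roughly 3,000 characters per page, a 100,000-character document comes out near the 34 pages mentioned above.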
Generating a Summary Using GPT-4
Using GPT-4's language model, we generated summaries of both the Tesla and Meta earnings calls. By running the input file through a prompt written for finance experts, we obtained concise summaries of the calls. The process was relatively quick, taking approximately 20 to 50 seconds per summary.
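A sketch of what such a summarization call might look like, with the finance-expert framing as a system message (openai v1 SDK; the model name is the launch identifier and the prompt wording is illustrative):

```python
SYSTEM_PROMPT = (
    "You are a finance expert. Summarize the following earnings call "
    "transcript as concise bullet points covering results, guidance, "
    "and notable strategic announcements."
)

def summary_request(transcript: str) -> dict:
    """Chat Completions kwargs for a single long-context summarization call."""
    return {
        "model": "gpt-4-1106-preview",  # 128k-context model at launch
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": transcript},
        ],
    }

def summarize(transcript: str) -> str:
    # Imported here so the request builder works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(**summary_request(transcript))
    return response.choices[0].message.content
```

The 128k window is what makes this a single call: both transcripts fit in one prompt without any chunking or map-reduce summarization.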
The Tesla earnings call summary provided bullet points on various topics such as the Cybertruck sales, expansion plans, and developments in AI. On the other hand, the Meta earnings call summary highlighted the launch of the Quest 3 headset, advancements in AI technology, and the Meta smart glasses.
Introducing the Assistant Feature
Excited about the capabilities of GPT-4, we delved deeper into OpenAI's playground. One remarkable addition is the Assistant feature, which enables users to create their own chatbots. We decided to create a Finance bot that can analyze earnings call transcripts and generate detailed reports.
We set up the Finance bot on the new GPT-4 Turbo model with its 128k context window, allowing it to follow finance-related instructions and generate in-depth reports. Additionally, we uploaded the Tesla Q3 earnings call transcript in PDF format for analysis.
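In API terms, the same setup maps to uploading a file and creating an assistant via the beta Assistants API as it existed at launch ("retrieval" was the launch-era tool name; the instructions string and file path are illustrative):

```python
FINANCE_INSTRUCTIONS = (
    "You are a finance analyst. Given an earnings call transcript, produce "
    "a detailed report: executive overview, key financials, strategy, risks."
)

def assistant_config(file_ids: list[str]) -> dict:
    """Configuration for a retrieval-enabled finance assistant."""
    return {
        "name": "Finance bot",
        "instructions": FINANCE_INSTRUCTIONS,
        "model": "gpt-4-1106-preview",
        "tools": [{"type": "retrieval"}],  # lets the bot search the uploaded PDF
        "file_ids": file_ids,
    }

def create_finance_bot(pdf_path: str):
    # Imported here so the config builder works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    uploaded = client.files.create(file=open(pdf_path, "rb"), purpose="assistants")
    return client.beta.assistants.create(**assistant_config([uploaded.id]))
```

This mirrors the playground flow: the instructions play the role of the system prompt, and the uploaded file becomes searchable context for the bot.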
Analyzing the Earnings Call Transcript
With the Finance bot set up, we initiated a conversation by requesting a detailed report on the Tesla Q3 earnings call. The Assistant feature analyzed the uploaded document and started generating the report, considering the provided system instructions.
The process involved a step-by-step analysis of the document, iteratively generating insights and sentiment reports. Eventually, a coherent report on the Tesla earnings call was created.
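Under the hood, this step-by-step behavior corresponds to creating a thread, starting a run, and polling until the run reaches a terminal status. The sketch below uses the beta Assistants endpoints as they existed at launch; it is an approximation of the mechanics, not the playground's exact code:

```python
import time

TERMINAL_STATUSES = {"completed", "failed", "cancelled", "expired"}

def run_finished(status: str) -> bool:
    """A run keeps iterating until it reaches a terminal status."""
    return status in TERMINAL_STATUSES

def ask_assistant(assistant_id: str, question: str) -> str:
    # Imported here so the status helper above works without the SDK installed.
    from openai import OpenAI
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id, role="user", content=question
    )
    run = client.beta.threads.runs.create(
        thread_id=thread.id, assistant_id=assistant_id
    )
    while not run_finished(run.status):
        time.sleep(2)  # poll while the assistant analyzes the document
        run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)
    messages = client.beta.threads.messages.list(thread_id=thread.id)
    return messages.data[0].content[0].text.value  # newest message comes first
```

The visible "iterative" report generation is this polling loop: the run works through the document and the final report arrives as the newest message on the thread.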
The report included an executive overview highlighting a mix of robust operational performance and cautious optimism in the face of challenging economic conditions. It presented key financial and operational highlights, discussed investments in AI technology, and emphasized the market strategy for autonomous vehicles.
Sentiment Analysis
Curious about the overall sentiment conveyed during the earnings call, we asked the Finance bot to summarize it in one sentence. The generated response described the sentiment as cautiously optimistic, with Tesla's leadership expressing confidence while acknowledging economic and production challenges ahead.
The results of our analysis using the Assistant feature were impressive. OpenAI's GPT-4 allowed us to generate detailed reports and gain valuable insights into the earnings call. Moreover, the ease of setting up the Finance bot and the quality of the generated reports left us excited about exploring this tool even further.
In this article, we have explored the potential of OpenAI's GPT-4 in analyzing earnings call transcripts. We have witnessed its ability to generate concise summaries for both Tesla and Meta earnings calls. Additionally, by using the Assistant feature, we have created a Finance bot capable of analyzing earnings call transcripts and generating detailed reports.
OpenAI's GPT-4 shows promise in simplifying the analysis process for data scientists and finance experts. With its language model and the Assistant feature, it has the potential to become an invaluable tool in the financial industry. As we continue our exploration of the GPT-4 model and its various features, we look forward to uncovering more applications and insights. Stay tuned for further updates on OpenAI's groundbreaking advancements.