- author: Chad Skelton
Exploring Data Analysis and Visualization with Chat GPT Plus
By Chad Skelton, former data journalist and current journalism and data visualization instructor
In this article, we explore what Chat GPT Plus can do for data analysis and visualization when paired with the Notable plugin. Chat GPT Plus is the paid version of Chat GPT, and Notable is a Jupyter notebooks website that provides the plugin. Our focus is a dataset of bike thefts from the Vancouver Police Department covering five years, with fields such as the date and time of each theft, its location (at the 100-block level), and latitude and longitude coordinates.
Analyzing the Bike Theft Dataset with Chat GPT Plus
Before we delve into what Chat GPT Plus can do, let's first go through the steps for setting it up for data analysis and visualization:
- Upload the dataset, in this case the bike thefts dataset, to Notable.
- Set up the Notable plugin in Chat GPT Plus.
- Ask Chat GPT Plus to analyze the data and produce some data analysis and visualizations.
Once we have completed these steps, we can start exploring the potential of Chat GPT Plus for data analysis and visualization. One key lesson from working with Chat GPT Plus is that it helps to use a dataset you already know well. In this case, the author had worked with this dataset as a reporter at the Vancouver Sun, which made it easier to explore the tool's capabilities.
Moving on to the analysis, it begins with a brief summary of the dataset, which includes a date-time field, a hundred-block field, and latitude and longitude fields. It's worth noting that Chat GPT Plus also supplies descriptive metadata about the fields that is not present in the dataset itself.
Next, Chat GPT Plus checks for missing values, a crucial data hygiene step that it handles with ease, saving us time. We can also see some basic statistics about the data, such as the count, mean, minimum, 25th and 50th percentiles, and maximum, among others.
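The article does not show the notebook code that Chat GPT Plus generated, so the following is only a rough sketch of what these two checks involve, written with the Python standard library. The column names (`date_time`, `hundred_block`, `latitude`, `longitude`) and all sample rows are made up for illustration and are not taken from the real file.

```python
import csv
import io
import statistics

# Made-up sample rows; the column names are assumptions, not the real file's headers.
SAMPLE_CSV = """date_time,hundred_block,latitude,longitude
2019-07-04 00:00,10XX EXAMPLE ST,49.28,-123.12
2019-07-04 17:30,20XX SAMPLE AVE,49.26,-123.10
2019-08-11 12:05,,49.27,-123.14
2020-01-15 18:45,30XX PLACEHOLDER RD,,-123.11
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, one per theft record."""
    return list(csv.DictReader(io.StringIO(text)))

def count_missing(rows):
    """Count empty values per column."""
    columns = rows[0].keys()
    return {col: sum(1 for row in rows if not row[col].strip()) for col in columns}

def summarize(values):
    """Return count, mean, min, quartiles, and max for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "25%": q1,
        "50%": median,
        "75%": q3,
        "max": max(values),
    }

rows = load_rows(SAMPLE_CSV)
missing = count_missing(rows)

# Skip rows with a missing latitude before computing statistics.
latitudes = [float(row["latitude"]) for row in rows if row["latitude"].strip()]
latitude_stats = summarize(latitudes)

print(missing)
print(latitude_stats)
```

In this sample, one row lacks a hundred-block value and one lacks a latitude, so both columns report one missing value each.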
We can now move on to the exciting part, which is data visualization. Here are some of the analyses and charts Chat GPT Plus provided:
- The distribution of thefts over different districts: This visualization shows that most thefts occur in districts one and four, with fewer in districts two and three.
- The trend of bike thefts over time: As we can see, there has been a significant rise in the number of bike thefts every year.
- The distribution of thefts by description: Most of the stolen bikes were of a value less than 5,000 dollars, but there were a few exceptional cases.
- The correlation between the time of theft and the district: This visualization shows that the patterns are similar between districts one and four in terms of bike thefts in the late afternoon/early evening.
- Thefts by time of day: This bar chart shows a spike in bike thefts around noon and in the evening hours, but interestingly, there's a significant surge of thefts at midnight.
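As a hedged illustration of the time-of-day chart in the list above, the sketch below counts records per hour and prints a crude text bar chart. The timestamps and their format are invented for the example, and this is not the code Chat GPT Plus actually generated.

```python
from collections import Counter
from datetime import datetime

# Made-up timestamps for illustration; the real dataset covers five years of records.
timestamps = [
    "2019-07-04 00:00",
    "2019-07-04 00:00",
    "2019-07-05 12:10",
    "2019-07-06 17:30",
    "2019-07-06 18:05",
    "2019-07-07 18:40",
]

def thefts_by_hour(stamps, fmt="%Y-%m-%d %H:%M"):
    """Count records per hour of day (0-23), including hours with zero thefts."""
    counts = Counter(datetime.strptime(s, fmt).hour for s in stamps)
    return [counts.get(hour, 0) for hour in range(24)]

hourly = thefts_by_hour(timestamps)

# Print a crude text bar chart, skipping empty hours.
for hour, n in enumerate(hourly):
    if n:
        print(f"{hour:02d}:00 {'#' * n}")
```

Returning a full 24-entry list, rather than only the hours that occur, keeps empty hours visible when the counts are later passed to a plotting library.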
In conclusion, Chat GPT Plus has proven to be an exciting tool for data analysis and visualization, producing a broad set of results from only a few short prompts. By exploring the bike theft dataset, we have seen how Chat GPT Plus analyzes and visualizes the data, providing us with valuable insights. With these capabilities, we can expect to save time and effort while obtaining meaningful information for our analysis.
Bike Theft Analysis using Chat GPT and Notable Plug-in
Have you ever wondered why there are so many bike thefts after midnight? A data journalist asked this question, and Chat GPT and the Notable plug-in helped analyze and answer it.
Theories on Midnight Bike Thefts
The analysis done by Chat GPT suggested that the spike does not necessarily mean more bikes are stolen after midnight. Rather, it is likely an artifact of the data reporting process. If the exact time of a theft is not known, it might be reported as occurring at the start of the day, which is midnight. Or, if the time of the theft was not recorded, it might be entered as 00:00 by default, leading to an over-representation of thefts at midnight.
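One way to test this theory is to look at the minute values inside the midnight hour. The sketch below is an assumption about how such a check could be written, using invented timestamps; it is not the notebook code from the analysis.

```python
from collections import Counter
from datetime import datetime

# Made-up timestamps for illustration only.
stamps = [
    "2019-07-04 00:00",
    "2019-07-09 00:00",
    "2019-08-01 00:00",
    "2019-08-02 00:15",
    "2019-08-03 00:42",
    "2019-08-03 17:30",
]

def midnight_hour_minutes(stamps, fmt="%Y-%m-%d %H:%M"):
    """Count records in the 00:00-00:59 hour by their minute value."""
    parsed = (datetime.strptime(s, fmt) for s in stamps)
    return Counter(t.minute for t in parsed if t.hour == 0)

minute_counts = midnight_hour_minutes(stamps)
in_midnight_hour = sum(minute_counts.values())

# Share of midnight-hour records stamped exactly 00:00 (0.0 if the hour is empty).
share_at_exact_midnight = (
    minute_counts[0] / in_midnight_hour if in_midnight_hour else 0.0
)

print(
    f"{minute_counts[0]} of {in_midnight_hour} midnight-hour records "
    f"are stamped exactly 00:00 ({share_at_exact_midnight:.0%})"
)
```

If real thefts were spread evenly across the hour, only about 1 in 60 records would land on minute 0, so a much larger share points to placeholder times rather than actual midnight thefts.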
Impressive Analysis with Little Prompting
What is impressive about this analysis is that almost all of it came with virtually no direction. The data journalist simply asked Chat GPT to analyze the data and produce visualizations, and it did a pretty impressive job. With a little bit of extra prompting, Chat GPT was also able to answer directed questions on the distribution of thefts in the hour between midnight and 1 am.
Seasonal Pattern in Bike Thefts
Another interesting finding was that there is a clear seasonal pattern in bike thefts in Vancouver, with more bike thefts in July and August than other months. This pattern could be due to more people riding bikes in the warmer months, leading to more opportunities for bike thefts.
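A seasonal pattern like this can be checked by pooling records by calendar month across all years. The following is a minimal sketch under that assumption, with invented dates rather than the real data.

```python
from collections import Counter
from datetime import datetime

# Made-up theft dates for illustration only.
dates = [
    "2019-01-15",
    "2019-07-04",
    "2019-07-20",
    "2019-08-11",
    "2019-08-30",
    "2019-11-02",
    "2020-07-09",
]

def thefts_by_month(dates, fmt="%Y-%m-%d"):
    """Count records per calendar month (1-12), pooled across years."""
    counts = Counter(datetime.strptime(d, fmt).month for d in dates)
    return {month: counts.get(month, 0) for month in range(1, 13)}

monthly = thefts_by_month(dates)

# Fraction of all records that fall in July or August.
summer_share = (monthly[7] + monthly[8]) / sum(monthly.values())

print(monthly)
print(f"July and August share: {summer_share:.0%}")
```

Pooling across years smooths out single-year quirks, but it also hides any year in which the pattern did not hold, so a per-year breakdown is a sensible follow-up.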
The Notable Plug-in
All of the work done by Chat GPT is saved in a Notable workbook, which can be shared publicly with others. This can be useful for those trying to learn code, as they can see the code segments of the analysis and perhaps learn more about what Python can do in terms of data analysis and visualization.
To use Chat GPT and the Notable Plug-in, the paid version of Chat GPT (Chat GPT Plus) is needed, as well as an account on Notable. Once these are obtained, you can start analyzing data with just a few prompts to Chat GPT. Although this technology is still in its early stages, it has the potential to allow people to do a lot with their data, even if they are not proficient in coding.
Using Chat GPT and Notable for Data Analysis and Visualization
If you want to use Chat GPT and Notable for data analysis and visualization, you'll need to have the paid version of Chat GPT, which is Chat GPT Plus, and an account on Notable. Here's a step-by-step guide on how to get started:
- Sign up for a free account on Notable's website.
- Go to app.notable.io to access the main page of Notable.
- Create a new project in Notable by clicking "Create Project" and giving it a name.
- In Chat GPT under GPT-4, click on "Plugins," then find the Notable I/O plugin in the plugin store and click "Install." There will be a simple authorization step to connect Chat GPT with your Notable account.
- To set your default project in Notable, tell Chat GPT, "Please set my default project in Notable to [project name]."
- Upload the data sets you want to use to your project in Notable. To do this, click on "Upload" and select the data set you want to use.
- When you're ready to analyze your data, simply say to Chat GPT, "Please analyze the data in the [data set name] file and produce some data analysis and visualizations."
It's important to note that while Notable can sometimes use APIs to grab data from other sources, uploading the data sets to your project is often the simplest way to work.
In a future video, the author plans to discuss different ways to get data into Notable and Chat GPT, including using APIs. For now, it's recommended to use a data set that you're comfortable with and know the ins and outs of.
If you're discovering Chat GPT and Notable for data analysis and visualization, the author invites you to explore further and share your experiences in the comments. The author's website is chadskelton.com, where he occasionally blogs about his discoveries with these tools. On his GitHub profile page under the "Chat GPT" folder, you can find the data files he's using for his videos.
In conclusion, Chat GPT and Notable can be powerful tools for data analysis and visualization. With a few straightforward steps, you can start discovering new insights and patterns in your data sets.