ChatGPT Vision To Coords

ChatGPT Vision To Coords is a way to locate objects in an image with ChatGPT Vision. The image is divided into 9 sections, and the model classifies objects into those sections. Once one or more sections are identified, each of them is divided again and resent to obtain better precision.

Example

User: "I spy with my little eye something that is spooky"

Assistant: "I've identified section 4 as having a skeleton in it"

User: "With this image of section 4 where is the skeleton?"

Assistant: "Section 5"
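
The exchange above can be turned into pixel coordinates by following the chain of section numbers. The sketch below is only an illustration: it assumes sections are numbered 1 to 9 from left to right, top to bottom, which the README does not state, and the function names are hypothetical rather than taken from the repository.

```python
def section_box(box, section, grid=3):
    """Return the (left, top, right, bottom) box of a 1-based section inside `box`.

    Assumes row-major numbering (1 = top-left, 9 = bottom-right for a 3x3 grid).
    """
    left, top, right, bottom = box
    row, col = divmod(section - 1, grid)
    cell_w = (right - left) / grid
    cell_h = (bottom - top) / grid
    return (
        left + col * cell_w,
        top + row * cell_h,
        left + (col + 1) * cell_w,
        top + (row + 1) * cell_h,
    )


def path_to_box(width, height, path, grid=3):
    """Narrow the full image down by following a list of section numbers."""
    box = (0.0, 0.0, float(width), float(height))
    for section in path:
        box = section_box(box, section, grid)
    return box


def box_center(box):
    """Return the (x, y) center of a box."""
    left, top, right, bottom = box
    return ((left + right) / 2, (top + bottom) / 2)


# "Section 4", then "Section 5" inside it, on a 900x900 image.
print(path_to_box(900, 900, [4, 5]))              # (100.0, 400.0, 200.0, 500.0)
print(box_center(path_to_box(900, 900, [4, 5])))  # (150.0, 450.0)
```

Each extra round shrinks the box by a factor of 3 in both directions, so two rounds narrow a 900x900 image down to a 100x100 region.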

How it works:

The image is broken into 9 sections with a green background and black outline around each section.
The resulting image is sent to ChatGPT Vision to be processed.
Once a response is received, function calling is used to extract the sections that were identified.
Each identified section is then divided into 9 sections and sent again.
The process repeats until no more sections can be identified or the image is too small to be divided into 9 sections.
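
The steps above can be sketched as a recursive loop. This is a minimal sketch, not the repository's code: the drawing of the grid and the ChatGPT Vision request are replaced by a hypothetical `classify` callback that receives the current box and its 9 cells and returns the 1-based numbers of the cells it picked. The `min_size` threshold and the row-major numbering are also assumptions.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def split_into_grid(box: Box, grid: int = 3) -> List[Box]:
    """Split a box into grid x grid cells, listed row by row."""
    left, top, right, bottom = box
    width, height = right - left, bottom - top
    cells = []
    for row in range(grid):
        for col in range(grid):
            cells.append((
                left + col * width // grid,
                top + row * height // grid,
                left + (col + 1) * width // grid,
                top + (row + 1) * height // grid,
            ))
    return cells


def refine(box: Box,
           classify: Callable[[Box, List[Box]], List[int]],
           min_size: int = 30,
           grid: int = 3) -> List[Box]:
    """Subdivide `box` until nothing is picked or the box is too small.

    If the classifier picks nothing, the current box is returned as the
    best known region. At the top level that is the whole image, which a
    caller may prefer to treat as "not found".
    """
    width, height = box[2] - box[0], box[3] - box[1]
    if width < min_size or height < min_size:
        return [box]  # too small to be divided again
    cells = split_into_grid(box, grid)
    picked = classify(box, cells)  # stand-in for the Vision call
    if not picked:
        return [box]  # no more sections identified
    results: List[Box] = []
    for number in picked:
        results.extend(refine(cells[number - 1], classify, min_size, grid))
    return results


def make_point_classifier(x: int, y: int):
    """Fake classifier for testing: picks the cell containing (x, y)."""
    def classify(box: Box, cells: List[Box]) -> List[int]:
        return [i + 1 for i, (l, t, r, b) in enumerate(cells)
                if l <= x < r and t <= y < b]
    return classify


# Stand-in run: pretend the object sits at pixel (150, 450) of a 900x900 image.
regions = refine((0, 0, 900, 900), make_point_classifier(150, 450))
```

Passing the classifier as a callback keeps the subdivision logic testable without an API key. In a real run the callback would crop the box, draw the numbered grid, and send the result to the model.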

How to use

Clone https://github.com/nickandbro/chatGPT_Vision_To_Coords and run pip install -r requirements.txt.
Insert your API key into config.py.
Open main.py, change the image path to the image you want to use, then run it.

Problems

- Objects that span multiple sections are hard to identify correctly.
- When a section is subdivided again, the loss of resolution reduces the model's performance.
- Sometimes ChatGPT function calling does not properly pick up the mentioned sections.
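
For the last problem, one possible safeguard is to validate whatever the function call returns before subdividing. The sketch below is an assumption-heavy illustration: the tool name `report_sections`, its `sections` parameter, and the `parse_sections` helper are hypothetical and are not taken from function_utility.py. Only the overall shape of the tool definition follows the OpenAI function-calling format.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling format.
SECTION_TOOL = {
    "type": "function",
    "function": {
        "name": "report_sections",
        "description": "Report which numbered sections contain the object.",
        "parameters": {
            "type": "object",
            "properties": {
                "sections": {
                    "type": "array",
                    "items": {"type": "integer", "minimum": 1, "maximum": 9},
                }
            },
            "required": ["sections"],
        },
    },
}


def parse_sections(arguments_json, grid=3):
    """Extract valid, de-duplicated section numbers from a tool call's arguments.

    Returns an empty list when the arguments are missing, are not valid JSON,
    or do not contain a usable "sections" list.
    """
    try:
        payload = json.loads(arguments_json)
    except (TypeError, json.JSONDecodeError):
        return []
    if not isinstance(payload, dict):
        return []
    raw = payload.get("sections", [])
    if not isinstance(raw, list):
        return []
    valid = []
    for item in raw:
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(item, bool) or not isinstance(item, int):
            continue
        if 1 <= item <= grid * grid and item not in valid:
            valid.append(item)
    return valid
```

An empty result can then be handled explicitly, for example by retrying the request, instead of silently stopping the subdivision.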

