Image Caption Generation

Image caption generation is a classic AI problem that draws on both natural language processing (NLP) and computer vision (CV), which makes it a really interesting project. The objective of the system is to generate a caption (a one-line description) for an image that is as accurate as possible. This is challenging because a textual description must be generated for a given input image, which requires methods from both fields: computer vision to understand the content and features of the image, and natural language processing to turn that understanding into words in the right order. Recently, deep learning methods have achieved state-of-the-art results on examples of this problem.
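To make the "words in the right order" part concrete, here is a minimal sketch of greedy word-by-word caption generation. It is not taken from this project's notebook: `greedy_caption`, the `next_word` callable, the `startseq`/`endseq` markers, and the toy model are all hypothetical names used only for illustration. A trained model would take the place of `toy_next_word`.

```python
def greedy_caption(image_features, next_word, max_length=20,
                   start_token="startseq", end_token="endseq"):
    """Build a caption one word at a time.

    `next_word(image_features, words_so_far)` is a hypothetical callable that
    stands in for a trained model and returns the most likely next word.
    """
    words = [start_token]
    for _ in range(max_length):
        word = next_word(image_features, words)
        if word == end_token:
            break
        words.append(word)
    return " ".join(words[1:])  # drop the start token


# Toy stand-in for a trained model: it replays a fixed sentence and
# ignores the image features entirely.
def toy_next_word(image_features, words_so_far):
    script = ["a", "dog", "runs", "on", "grass", "endseq"]
    return script[len(words_so_far) - 1]


print(greedy_caption(image_features=None, next_word=toy_next_word))
```

The loop stops either when the model emits the end marker or when `max_length` words have been produced, whichever comes first.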

Please read the complete description and README to be clear about the implementation.

The technical report, with all the references, is saved in this repository (17BCE1328_17BLC1046_ImagCaptionGen.pdf).

Requirements

Minimum Requirements

Python with Keras and other important libraries, including TensorFlow, NumPy, etc.
4 GB RAM
Any operating system
A notebook (.ipynb) editor such as Jupyter or IPython
Intel i3 7th Gen processor or above
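A quick way to check the software side of this list is a small script like the one below. It is only a sketch: the helper name `missing_packages` and the exact package list are assumptions, so adjust them to whatever the notebook actually imports.

```python
import importlib.util
import sys

# Packages named in the requirements list; adjust to match your setup.
REQUIRED_PACKAGES = ["tensorflow", "keras", "numpy"]


def missing_packages(names):
    """Return the subset of `names` that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    missing = missing_packages(REQUIRED_PACKAGES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are installed.")
```

`importlib.util.find_spec` only looks the package up, so the check stays fast even for heavy libraries.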

Dataset Requirements

We use the Flickr8K dataset. As the name suggests, it contains around 8,000 images with around 5 captions per image. We chose it because it is realistic and relatively small, so you can download it and build models on your workstation using a CPU.

The dataset can be downloaded through the dataset request form. Download the datasets and unzip them into your current working directory. You will have two directories:

Flickr8k_Dataset: Contains 8092 photographs in JPEG format.
Flickr8k_text: Contains a number of files containing different sources of descriptions for the photographs.

The dataset has a pre-defined training dataset (6,000 images), development dataset (1,000 images), and test dataset (1,000 images).
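The sketch below shows one way to read the captions and restrict them to a pre-defined split. It rests on assumptions that are not stated in this README: that the caption file has one caption per line in the form `<image_name>#<index><TAB><caption>`, and that each split file lists one image file name per line. The file names in `load_split_captions` (`Flickr8k.token.txt`, `Flickr_8k.trainImages.txt`) are the ones commonly found in the Flickr8k text archive; check them against your own download.

```python
from collections import defaultdict
from pathlib import Path


def parse_captions(text):
    """Map each image file name to its list of captions.

    Assumes one caption per line in the form
    "<image_name>#<caption_index><TAB><caption text>".
    """
    captions = defaultdict(list)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        image_id, _, caption = line.partition("\t")
        if not caption:
            continue  # skip lines without a tab-separated caption
        image_name = image_id.split("#")[0]
        captions[image_name].append(caption.strip())
    return dict(captions)


def parse_split(text):
    """Return the set of image file names listed one per line."""
    return {line.strip() for line in text.splitlines() if line.strip()}


def captions_for_split(captions, split_names):
    """Keep only the captions whose image belongs to the given split."""
    return {name: caps for name, caps in captions.items() if name in split_names}


def load_split_captions(text_dir="Flickr8k_text",
                        captions_file="Flickr8k.token.txt",
                        split_file="Flickr_8k.trainImages.txt"):
    """Read both files from disk (file names are assumptions, see above)."""
    text_dir = Path(text_dir)
    captions = parse_captions((text_dir / captions_file).read_text())
    split_names = parse_split((text_dir / split_file).read_text())
    return captions_for_split(captions, split_names)
```

Keeping the parsing functions separate from file access makes them easy to try on a few sample lines before pointing them at the full dataset.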

Main Architecture

Model summary: check the report for a detailed description.

Steps to implement this locally on your system

You can implement the project locally on your system with the following steps:

Download the dataset linked in the README.
Clone this repository to your local system.
Before running anything, make sure that all the path variables are set correctly.
After setting everything up, run the notebook cells.
The program will run and generate the required outputs.
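Before running the cells, the path check can be done with a short script such as the following sketch. The helper name `find_missing_dirs` is hypothetical, and the directory names are the two listed in the dataset section; the notebook's own path variables may be named and laid out differently.

```python
from pathlib import Path

# Directory names as described in the dataset section; change them if you
# unzipped the archives somewhere else.
EXPECTED_DIRS = ["Flickr8k_Dataset", "Flickr8k_text"]


def find_missing_dirs(base_dir, expected=EXPECTED_DIRS):
    """Return the expected directories that do not exist under `base_dir`."""
    base = Path(base_dir)
    return [name for name in expected if not (base / name).is_dir()]


if __name__ == "__main__":
    missing = find_missing_dirs(Path.cwd())
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("Dataset directories found; the notebook paths can point here.")
```

Run it from the directory where you unzipped the dataset, or pass another base directory to `find_missing_dirs`.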

Results

Check out my other repos as well. Enjoy and stay safe.

