A.I. Writer: Automatic Image Captioning

What is automatic image captioning?

An AI system writes a sentence describing the contents of an image.

Why is image captioning important?

Search for images using sentences.
Help blind and visually impaired people understand the world around them.
A step towards AI that connects vision and language.

Try the demo!

Available on Oct 22nd for CityU Virtual Information Day!

Step 1: Send an email to imagecap.visal@gmail.com
- Attach/insert a photo.
- The email subject should be “Image captioning”.
- The QR code below can help you send the email.
Step 2: Wait…AI is thinking (takes about 30 seconds).
Step 3: Receive an email with the generated image caption.

Tips:
- It works better on photos of everyday scenes.
- You can send multiple photos in one email.
- Images should be .jpg, .jpeg, or .png files.
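
For readers who would rather script the request than use a mail client, the steps and tips above can be sketched with Python's standard library. This is only an illustration: the function names, the sender address, and the body text are hypothetical, the check treats file extensions case-insensitively (an assumption, since the demo does not say how it handles upper-case extensions), and the message is built but not sent.

```python
from email.message import EmailMessage
from pathlib import Path
from typing import Mapping

# Values taken from the instructions above.
DEMO_ADDRESS = "imagecap.visal@gmail.com"
SUBJECT = "Image captioning"
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension is one the demo accepts.

    Assumption: extensions are compared case-insensitively.
    """
    return Path(filename).suffix.lower() in ALLOWED_EXTENSIONS


def build_caption_request(sender: str, photos: Mapping[str, bytes]) -> EmailMessage:
    """Build (but do not send) an email with one or more photos attached.

    `photos` maps a filename to the raw bytes of that image file.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = DEMO_ADDRESS
    msg["Subject"] = SUBJECT
    msg.set_content("Photos attached for captioning.")  # hypothetical body text

    for name, data in photos.items():
        if not is_supported_image(name):
            raise ValueError(f"Unsupported image type: {name}")
        subtype = "png" if name.lower().endswith(".png") else "jpeg"
        msg.add_attachment(data, maintype="image", subtype=subtype, filename=name)
    return msg
```

The resulting message can then be handed to whatever mail-sending tool you already use; the reply with the generated caption arrives in the sender's inbox as described in Step 3.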

How does automatic image captioning work?

The system consists of 3 parts that mimic humans:
- Vision part: looks at different parts of the image, like the eyes.
- Language part: composes a sentence.
- The link between vision and language.

Captioning System

For the “vision part”, we use a convolutional neural network (CNN) to extract image features. For the “language part”, we use a causal convolution network to represent high-level word concepts. An attention model focuses on the important regions while processing the high-level concepts. A gated recurrent unit (GRU) fuses the image and language features.
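
To make two of these building blocks concrete, here is a minimal pure-Python sketch of a causal convolution over word vectors and of attention over image regions. It is a toy illustration under stated assumptions, not the system's actual implementation: the function names are hypothetical, the convolution uses one scalar weight per tap instead of learned weight matrices, the attention scores are plain dot products (the description above does not say which scoring function is used), and the CNN feature extractor and the GRU fusion step are left out.

```python
import math


def causal_conv1d(seq, kernel):
    """Causal 1-D convolution over a sequence of word vectors.

    seq:    list of equal-length float lists, one vector per word.
    kernel: list of scalar tap weights; kernel[-1] applies to the current
            word, kernel[-2] to the previous word, and so on.
    The output at position t depends only on seq[0..t], never on later words.
    """
    k = len(kernel)
    dim = len(seq[0])
    out = []
    for t in range(len(seq)):
        acc = [0.0] * dim
        for j, w in enumerate(kernel):
            src = t - (k - 1 - j)  # index of the word this tap reads
            if src >= 0:           # positions before the start act as zeros
                for d in range(dim):
                    acc[d] += w * seq[src][d]
        out.append(acc)
    return out


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, regions):
    """Dot-product attention (an assumed scoring function).

    query:   language feature vector for the current word.
    regions: list of image-region feature vectors of the same length.
    Returns the attention weights and the weighted sum of region features.
    """
    scores = [sum(q * r for q, r in zip(query, region)) for region in regions]
    weights = softmax(scores)
    dim = len(regions[0])
    context = [
        sum(w * region[d] for w, region in zip(weights, regions))
        for d in range(dim)
    ]
    return weights, context


# Toy usage: three 2-d word vectors and three 2-d region features.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
concepts = causal_conv1d(words, kernel=[0.5, 1.0])
regions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = attend(concepts[-1], regions)
```

In this sketch the attention weights play the role of the attention map: they sum to 1, and the region whose features best match the current language feature receives the largest weight. In the described system, a GRU would then combine the attended image features with the language features to predict the next word.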

Example Results

Input image, attention map, and generated caption