What is automatic image captioning?
- AI writes a description of the contents of an image.
Why is image captioning important?
- Image search using sentences.
- Help blind people perceive the world.
- One step towards AI that understands both vision and language.
Try the demo!
Available on Oct 22nd for CityU Virtual Information Day!
- Step 1: Send an email to imagecap.visal@gmail.com
  - Attach or insert a photo.
  - The email subject should be “Image captioning”.
  - The QR code below can help you send the email.
- Step 2: Wait…AI is thinking (takes about 30 seconds).
- Step 3: Receive an email with the generated image caption.
- Tips:
  - It works better on photos of everyday scenes.
  - You can send multiple photos in one email.
  - Images should be .jpg, .jpeg, or .png files.
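If you prefer to prepare the email from a script rather than a mail app, Step 1 could be sketched roughly as below using Python's standard `email` library. This is only an illustration: the function name `build_caption_request`, the body text, and the sender address are made up for the example, and the sketch only builds the message. Actually sending it would need your own mail provider's SMTP settings, which are not covered here.

```python
from email.message import EmailMessage
from pathlib import Path

DEMO_ADDRESS = "imagecap.visal@gmail.com"  # address given in Step 1
# File types listed in the tips, mapped to their MIME image subtypes.
ALLOWED_TYPES = {".jpg": "jpeg", ".jpeg": "jpeg", ".png": "png"}


def build_caption_request(sender, photos):
    """Build (but do not send) a demo request email.

    `photos` is a list of (filename, image_bytes) pairs; several photos
    may go into one email. The function name and body text are
    hypothetical, chosen only for this sketch.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = DEMO_ADDRESS
    msg["Subject"] = "Image captioning"  # subject required by the demo
    msg.set_content("Please caption the attached photo(s).")

    for filename, data in photos:
        suffix = Path(filename).suffix.lower()
        if suffix not in ALLOWED_TYPES:
            raise ValueError(f"unsupported image type: {filename}")
        msg.add_attachment(
            data,
            maintype="image",
            subtype=ALLOWED_TYPES[suffix],
            filename=filename,
        )
    return msg
```

A call such as `build_caption_request("you@example.com", [("street.jpg", image_bytes)])` returns a message object that a mail client or `smtplib` could then deliver.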
How to achieve automatic image captioning?
- The system consists of 3 parts that mimic humans:
  - Vision part: views different parts of the image, like the eyes.
  - Language part: composes a sentence.
  - Link: connects the vision and language parts.
Captioning System
- For the “vision part”, we use a convolutional neural network (CNN) to extract image features. For the “language part”, we use a causal convolution network to represent high-level word concepts. An attention model focuses on the important regions while processing the high-level concepts. A gated recurrent unit (GRU) fuses the image and language features.
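To make two of these ideas more concrete, here is a toy sketch in plain Python. It is not the demo's actual model. It assumes a simplified causal convolution with one scalar weight per time step, and it assumes dot-product scoring for the attention, which the description above does not specify. The CNN feature extractor and the GRU fusion step are left out, and all names are illustrative.

```python
import math


def causal_conv1d(word_vectors, kernel):
    """Causal convolution over a word sequence.

    kernel[0] weights the current word and kernel[k] the word k steps
    earlier, so the output at position t never depends on later words.
    (Simplification: one scalar weight per tap, shared by all dimensions.)
    """
    dim = len(word_vectors[0])
    outputs = []
    for t in range(len(word_vectors)):
        acc = [0.0] * dim
        for k, weight in enumerate(kernel):
            if t - k >= 0:  # ignore positions before the sentence start
                for d in range(dim):
                    acc[d] += weight * word_vectors[t - k][d]
        outputs.append(acc)
    return outputs


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(region_features, query):
    """Weight image regions by their relevance to a language query.

    Returns the attention weights (one per region) and the weighted
    average of the region features. Dot-product scoring is an assumption.
    """
    scores = [sum(r * q for r, q in zip(region, query))
              for region in region_features]
    weights = softmax(scores)
    dim = len(region_features[0])
    context = [sum(w * region[d] for w, region in zip(weights, region_features))
               for d in range(dim)]
    return weights, context
```

In this sketch, the output of `causal_conv1d` at the latest word could serve as the `query` passed to `attend`. The returned weights, one per image region, are the kind of values an attention map visualizes.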
Example Results
- Input image, attention map, and generated caption