The Gemini API can generate text output from text, image, video, or audio
input.
This guide shows you how to generate text using the
generateContent and
streamGenerateContent
methods. To learn about working with Gemini's vision and audio capabilities,
refer to the Vision and Audio
guides.
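As a rough illustration of how the two methods differ, the following Python sketch calls the REST endpoints directly using only the standard library. It rests on several assumptions that are not stated in this guide: the model name `gemini-1.5-flash`, the `v1beta` base URL, an API key passed in the `x-goog-api-key` header, and server-sent events (`alt=sse`) for the streaming response. The helper names (`build_request`, `extract_text`, `generate`, `stream`) are hypothetical and exist only for this example.

```python
import json
import urllib.request

# Assumptions for illustration: adjust the API version and model to match your setup.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"
MODEL = "gemini-1.5-flash"


def build_request(prompt, api_key, stream=False):
    """Build a POST request for generateContent or streamGenerateContent."""
    method = "streamGenerateContent" if stream else "generateContent"
    url = f"{BASE_URL}/models/{MODEL}:{method}"
    if stream:
        # Ask for server-sent events so each chunk arrives as a "data:" line.
        url += "?alt=sse"
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode("utf-8")
    headers = {"Content-Type": "application/json", "x-goog-api-key": api_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def extract_text(response):
    """Join the text parts of the first candidate; return "" if there are none."""
    candidates = response.get("candidates") or []
    if not candidates:
        return ""
    parts = candidates[0].get("content", {}).get("parts", [])
    return "".join(part.get("text", "") for part in parts)


def generate(prompt, api_key):
    """Send one request and return the complete generated text."""
    with urllib.request.urlopen(build_request(prompt, api_key)) as resp:
        return extract_text(json.load(resp))


def stream(prompt, api_key):
    """Yield text chunks as the server produces them."""
    with urllib.request.urlopen(build_request(prompt, api_key, stream=True)) as resp:
        for raw_line in resp:
            line = raw_line.decode("utf-8").strip()
            if line.startswith("data:"):
                yield extract_text(json.loads(line[len("data:"):]))
```

With a valid key, `generate("Write a haiku about the sea.", api_key)` would return the whole response at once, while iterating over `stream(...)` and printing each chunk shows partial output as it arrives. The official client libraries wrap the same two methods and are usually the more convenient choice.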
What's next
Now that you have explored the basics of the Gemini API, you might want to
try:
Vision understanding: Learn how to use
Gemini's native vision understanding to process images and videos.
Audio understanding: Learn how to use
Gemini's native audio understanding to process audio files.