LLAVA-PHI-3

Image analysis with LLAVA-PHI-3

One area where artificial intelligence is a great help is taking over the repetitive task of writing descriptions for the images on websites, descriptions that matter a great deal for SEO.

For this we can use multimodal LLMs, which can analyze and interpret images and generate text that meets the users' requirements.

We will assume that an Ollama server is installed locally and that the images we will test on are already in the current folder.

import base64

import requests

# Generate endpoint of the local Ollama server
url = "http://localhost:11434/api/generate"

# Read the image and encode it as a base64 string
with open("imagine4.jpeg", "rb") as image_file:
    encoded_string = base64.b64encode(image_file.read()).decode("utf-8")

payload = {
    "model": "llava-phi3",
    "prompt": "You are a marketing assistant. You are provided with images of hotels and different attractions. Describe the image for usage in marketing for those locations.",
    "images": [encoded_string],
    "stream": False,  # ask for a single JSON object instead of streamed chunks
}

# json= serializes the payload and sets the Content-Type header
response = requests.post(url, json=payload)
response.raise_for_status()

# Display the "response" field of the parsed JSON reply
print(response.json()["response"])

In our example we use the LLAVA-PHI3 model, which can analyze images and is also built on the PHI3 conversational model.

The Ollama server runs locally on port 11434 and can be accessed with simple HTTP requests.
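Before sending images, it can help to confirm that the model has actually been pulled. The following is a minimal sketch, assuming the JSON shape returned by Ollama's GET /api/tags endpoint (a "models" list whose entries carry a "name" such as "llava-phi3:latest"). The helper is_model_available is hypothetical, not part of Ollama, and the sample dictionary is hand-written rather than real server output.

```python
def is_model_available(tags_response: dict, model: str) -> bool:
    """Return True if `model` appears in a parsed /api/tags response.

    Assumes a shape like {"models": [{"name": "llava-phi3:latest"}, ...]}.
    """
    for entry in tags_response.get("models", []):
        name = entry.get("name", "")
        # Accept the full name ("llava-phi3:latest") or the part before the tag
        if name == model or name.split(":", 1)[0] == model:
            return True
    return False


# Hand-written sample, standing in for the server's reply
sample = {"models": [{"name": "llava-phi3:latest"}, {"name": "gemma2:latest"}]}
print(is_model_available(sample, "llava-phi3"))  # True
print(is_model_available(sample, "mistral"))     # False
```

With a running server, the dictionary would come from requests.get("http://localhost:11434/api/tags").json().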

The test images are base64-encoded and added to the request.
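The encoding step is needed because a JSON body cannot carry raw bytes, so the image is turned into text first. A small sketch of that step as a reusable function follows; the name encode_image is our own choice for illustration.

```python
import base64
from pathlib import Path


def encode_image(path: str) -> str:
    """Read a file and return its bytes as a base64-encoded text string."""
    raw_bytes = Path(path).read_bytes()
    # b64encode returns bytes; decode them so the value can be placed in JSON
    return base64.b64encode(raw_bytes).decode("utf-8")
```

The returned string is what goes into the "images" list of the request, for example "images": [encode_image("imagine4.jpeg")].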

In addition, when we call the model we also pass it instructions about how it should behave and what kind of answer we expect from it.
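In the script, those instructions and the actual request share a single "prompt" string. One possible alternative, shown here only as a sketch, is to move the role description into the optional "system" field of Ollama's generate endpoint. The function build_payload and the way the text is split are assumptions for illustration, and we have not checked whether the model's output changes with this layout.

```python
def build_payload(encoded_image: str, model: str = "llava-phi3") -> dict:
    """Build a generate request with the role in `system` and the task in `prompt`."""
    return {
        "model": model,
        # Behaviour instructions: who the model should act as
        "system": (
            "You are a marketing assistant. You are provided with images "
            "of hotels and different attractions."
        ),
        # The task for this particular image
        "prompt": "Describe the image for usage in marketing for those locations.",
        "images": [encoded_image],
        "stream": False,
    }


# Placeholder string instead of real base64 data
payload = build_payload("<base64 data>")
print(sorted(payload))
```

The resulting dictionary can be sent exactly like the original payload, for example with requests.post(url, json=payload).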

The first image is:

The model's response is:

The image displays a serene and inviting resort setting. A large swimming pool is the central focus, surrounded by well-maintained landscaping and lounge chairs for relaxation. Adjacent to the pool is a villa with a Mediterranean architectural style, featuring beige walls and terracotta roof tiles. The villa has multiple windows and doors, suggesting a multi-bedroom layout suitable for families or groups of friends.

In front of the villa, there’s a well-maintained lawn area that provides a green space for outdoor activities or events. To the right of the pool, a building with a flat roof can be seen. This structure appears to be part of the resort and could serve as additional guest rooms or facilities.

The resort is set against a backdrop of clear blue skies, indicating fair weather conditions which are ideal for outdoor activities like swimming or lounging by the pool. The overall impression is that of a luxurious vacation destination offering comfort, relaxation, and possibly recreational amenities not visible in the image.

If we want this description in Romanian and reduced to a single sentence, we can use another model (Gemma2 in this case) and ask it for a simple summary and rephrasing. The description then becomes:

Resort luxos oferind relaxare și confort cu piscină, vilă și vederi panoramice
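The post does not show the Gemma2 call, so the following is only a sketch of how that second step might look. The prompt wording and the helper name build_summary_payload are assumptions; "gemma2" is the model name as listed in the Ollama library.

```python
def build_summary_payload(description: str, model: str = "gemma2") -> dict:
    """Build a generate request asking for a one-sentence Romanian summary."""
    prompt = (
        "Summarize the following image description in a single sentence, "
        "written in Romanian. Reply with the sentence only.\n\n"
        + description
    )
    # No "images" field: this step works on text only
    return {"model": model, "prompt": prompt, "stream": False}


payload = build_summary_payload(
    "The image displays a serene and inviting resort setting."
)
print(payload["prompt"])
```

The payload would be posted to the same /api/generate URL as before, and the summary read from the "response" field of the reply.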

The second image we want a description for is:

The model's response is:

The image captures a serene and inviting evening at a Mediterranean-style resort. The focal point of the scene is a large, two-story villa with beige stone walls and tiled roofs, topped with terracotta tiles. A white railing wraps around the upper level of the villa, adding a touch of elegance to its facade. A porch on the ground floor features a wooden pergola, inviting guests to relax and enjoy the view.

In front of the villa is a well-maintained garden with palm trees that add a tropical vibe to the setting. The villa is nestled among other buildings and establishments, suggesting that it’s part of a larger resort or community.

The sky transitions from a warm orange to a deeper blue as dusk settles in, creating a peaceful ambiance. This image could be used to convey a sense of tranquility, luxury, and the charm of Mediterranean architecture for potential guests or clients considering a stay at this location.

Finally, the last image we tested the script on is:

The model's response is:

The image presents a breathtaking view of an overwater bungalow resort. The resort is surrounded by turquoise waters, providing a stunning contrast to the coral reef visible beneath the surface. The villas are scattered across the water, some closer to the shore, offering guests a unique and immersive experience living amidst nature.

The resort’s architecture features traditional thatched roofs, blending seamlessly with the natural surroundings. The overwater bungalows seem to be designed for luxury, suggesting a high-end getaway.

The water is dotted with several small boats, indicating that the resort may offer various water activities such as snorkeling or boat tours. The sky above is clear and blue, perfect weather for an island vacation.

The image also shows a variety of tropical plants and palm trees on the land surrounding the bungalows, enhancing the tropical ambiance. The overall impression is one of tranquility and escape from the hustle and bustle of everyday life, offering potential guests the opportunity to experience a paradise holiday.

As you can see, the descriptions are excellent!
